This is a slightly modified version of https://pypi.org/project/rfc3986/

The main reason for the modification is that I need a simple URI class.

The directory hierarchy was rearranged to make it work with ``setup.py``,
and a new URI class, in ``uri.py``, was added to the root of the project.
A test file, ``uriTest.py``, was added to the ``tests`` folder.
The ``doc`` directory was removed because ``setup.py`` seems to insist on
including it in the distribution, which is not what we want.


rfc3986
=======

A Python implementation of `RFC 3986`_ including validation and authority
parsing.

Installation
------------

Use pip to install ``rfc3986`` like so::

    pip install rfc3986

License
-------

`Apache License Version 2.0`_

Example Usage
-------------

The following are the two most common use cases envisioned for ``rfc3986``.

Replacing ``urlparse``
``````````````````````

To parse a URI and receive something very similar to the standard library's
``urllib.parse.urlparse``:

.. code-block:: python

    from rfc3986 import urlparse

    ssh = urlparse('ssh://user@git.openstack.org:29418/openstack/glance.git')
    print(ssh.scheme)  # => ssh
    print(ssh.userinfo)  # => user
    print(ssh.params)  # => None
    print(ssh.port)  # => 29418

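For comparison, here is a minimal sketch of the same URI parsed with the
standard library's ``urllib.parse.urlparse``. It uses only the standard
library, and the attribute names ``username`` and ``hostname`` are the
standard library's own; they differ from the ``userinfo`` and ``host`` names
shown above.

.. code-block:: python

    from urllib.parse import urlparse as std_urlparse

    # Same URI as above, parsed by the standard library instead of rfc3986.
    std = std_urlparse('ssh://user@git.openstack.org:29418/openstack/glance.git')
    print(std.scheme)    # => ssh
    print(std.username)  # => user
    print(std.hostname)  # => git.openstack.org
    print(std.port)      # => 29418
    print(std.path)      # => /openstack/glance.git
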
To create a copy of it with new pieces you can use ``copy_with``:

.. code-block:: python

    new_ssh = ssh.copy_with(
        scheme='https',
        userinfo='',
        port=443,
        path='/openstack/glance'
    )
    print(new_ssh.scheme)  # => https
    print(new_ssh.userinfo)  # => None
    # etc.

Strictly Parsing a URI and Applying Validation
``````````````````````````````````````````````

To parse a URI into a convenient named tuple, call ``uri_reference``:

.. code-block:: python

    from rfc3986 import uri_reference

    example = uri_reference('http://example.com')
    email = uri_reference('mailto:user@domain.com')
    ssh = uri_reference('ssh://user@git.openstack.org:29418/openstack/keystone.git')

With a parsed URI you can access data about the components:

.. code-block:: python

    print(example.scheme)  # => http
    print(email.path)  # => user@domain.com
    print(ssh.userinfo)  # => user
    print(ssh.host)  # => git.openstack.org
    print(ssh.port)  # => 29418

It can also parse URIs with unicode present:

.. code-block:: python

    uni = uri_reference(b'http://httpbin.org/get?utf8=\xe2\x98\x83')  # ☃
    print(uni.query)  # => utf8=%E2%98%83

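The snowman in the example above is the UTF-8 byte sequence ``\xe2\x98\x83``.
As a small sketch of how those bytes relate to the percent-encoded form, the
following uses only the standard library's ``urllib.parse.quote`` and
``unquote``. It illustrates the encoding itself and makes no claim about how
``rfc3986`` handles it internally.

.. code-block:: python

    from urllib.parse import quote, unquote

    snowman = b'\xe2\x98\x83'.decode('utf-8')  # U+2603 SNOWMAN, one character
    encoded = quote(snowman)                   # percent-encode its UTF-8 bytes

    print(len(snowman))                 # => 1
    print(encoded)                      # => %E2%98%83
    print(unquote(encoded) == snowman)  # => True
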
With a parsed URI you can also validate it:

.. code-block:: python

    import subprocess

    if ssh.is_valid():
        subprocess.call(['git', 'clone', ssh.unsplit()])

You can also take a parsed URI and normalize it:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM')
    print(mangled.scheme)  # => hTTp
    print(mangled.authority)  # => exAMPLe.COM

    normal = mangled.normalize()
    print(normal.scheme)  # => http
    print(normal.authority)  # => example.com

But these two URIs are (functionally) equivalent:

.. code-block:: python

    import webbrowser

    if normal == mangled:
        webbrowser.open(normal.unsplit())

Your paths, queries, and fragments are safe with us though:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')
    normal = mangled.normalize()
    assert normal == 'hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth'
    assert normal == 'http://example.com/Some/reallY/biZZare/pAth'
    assert normal != 'http://example.com/some/really/bizzare/path'

If you do not actually need a real reference object and just want to normalize
your URI:

.. code-block:: python

    from rfc3986 import normalize_uri

    assert (normalize_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth') ==
            'http://example.com/Some/reallY/biZZare/pAth')

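As a rough illustration of why the scheme and host change while the path does
not: RFC 3986 treats the scheme and host as case-insensitive, but the path,
query, and fragment as case-sensitive. The sketch below mimics only that
case-folding step with the standard library. ``lowercase_scheme_and_host`` is
a hypothetical helper, not part of ``rfc3986``, and it skips the other
normalization steps the RFC describes (such as percent-encoding case and
dot-segment removal).

.. code-block:: python

    from urllib.parse import urlsplit, urlunsplit


    def lowercase_scheme_and_host(uri):
        """Hypothetical helper: lowercase only the case-insensitive parts."""
        parts = urlsplit(uri)
        # Assumes the authority carries no case-sensitive userinfo.
        return urlunsplit(parts._replace(scheme=parts.scheme.lower(),
                                         netloc=parts.netloc.lower()))


    print(lowercase_scheme_and_host('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth'))
    # => http://example.com/Some/reallY/biZZare/pAth
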
You can also very simply validate a URI:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')

Requiring Components
~~~~~~~~~~~~~~~~~~~~

You can validate that a particular string is a valid URI and also require
that individual components be present:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('http://localhost:8774/v2/resource',
                        require_scheme=True,
                        require_authority=True,
                        require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    assert is_valid_uri('mailto:user@example.com', require_authority=True) is False

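To make the idea of a required component concrete, here is a standard-library
sketch that only checks whether each component is present. ``has_components``
is a hypothetical helper, not the ``rfc3986`` API, and unlike ``is_valid_uri``
it performs no syntax validation at all.

.. code-block:: python

    from urllib.parse import urlsplit


    def has_components(uri, require_scheme=False, require_authority=False,
                       require_path=False):
        """Hypothetical helper: report whether the required parts are present."""
        parts = urlsplit(uri)
        if require_scheme and not parts.scheme:
            return False
        if require_authority and not parts.netloc:
            return False
        if require_path and not parts.path:
            return False
        return True


    # An http URI has a scheme, an authority, and a path.
    print(has_components('http://localhost:8774/v2/resource',
                         require_scheme=True, require_authority=True,
                         require_path=True))   # => True
    # A mailto URI has no authority component.
    print(has_components('mailto:user@example.com',
                         require_authority=True))  # => False
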
If you have an instance of a ``URIReference``, you can pass the same arguments
to ``URIReference.is_valid``, e.g.,

.. code-block:: python

    from rfc3986 import uri_reference

    http = uri_reference('http://localhost:8774/v2/resource')
    assert http.is_valid(require_scheme=True,
                         require_authority=True,
                         require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    mailto = uri_reference('mailto:user@example.com')
    assert mailto.is_valid(require_authority=True) is False

Alternatives
------------

- `rfc3987 <https://pypi.python.org/pypi/rfc3987/1.3.4>`_

  This is a direct competitor to this library, with extra features,
  licensed under the GPL.

- `uritools <https://pypi.python.org/pypi/uritools/0.5.1>`_

  This can parse URIs in the manner of RFC 3986 but provides no validation and
  only recently added Python 3 support.

- Standard library's ``urlparse``/``urllib.parse``

  The functions in these libraries can only split a URI (valid or not) and
  provide no validation.

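As an illustration of the last point, the standard library splits a string
that is not a valid URI without complaint. The input below is a made-up
example containing unencoded spaces, which RFC 3986 does not allow.

.. code-block:: python

    from urllib.parse import urlsplit

    # No exception is raised even though the input is not a valid URI.
    parts = urlsplit('http://exa mple.com/some path')
    print(parts.netloc)  # => exa mple.com
    print(parts.path)    # => /some path
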
Contributing
------------

This project follows and enforces the Python Software Foundation's `Code of
Conduct <https://www.python.org/psf/codeofconduct/>`_.

If you would like to contribute but do not have a bug or feature in mind, feel
free to email Ian and find out how you can help.

The git repository for this project is maintained at
https://github.com/python-hyper/rfc3986

.. _RFC 3986: http://tools.ietf.org/html/rfc3986
.. _Apache License Version 2.0: https://www.apache.org/licenses/LICENSE-2.0