rfc3986
=======

A Python implementation of `RFC 3986`_ including validation and authority
parsing.

Installation
------------

Use pip to install ``rfc3986`` like so::

    pip install rfc3986

License
-------

`Apache License Version 2.0`_

Example Usage
-------------

The following are the two most common use cases envisioned for ``rfc3986``.

Replacing ``urlparse``
``````````````````````

To parse a URI and receive something very similar to the result of the
standard library's ``urllib.parse.urlparse``:

.. code-block:: python

    from rfc3986 import urlparse

    ssh = urlparse('ssh://user@git.openstack.org:29418/openstack/glance.git')
    print(ssh.scheme)    # => ssh
    print(ssh.userinfo)  # => user
    print(ssh.params)    # => None
    print(ssh.port)      # => 29418

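For comparison, here is the same URI run through the standard library's
``urllib.parse.urlparse``. This is a small sketch that uses only the standard
library; it is meant to show how closely the attribute names line up, not to
describe how ``rfc3986`` works internally. The alias ``stdlib_urlparse`` and
the variable ``std`` are illustrative names.

.. code-block:: python

    from urllib.parse import urlparse as stdlib_urlparse

    std = stdlib_urlparse('ssh://user@git.openstack.org:29418/openstack/glance.git')
    print(std.scheme)        # => ssh
    print(std.username)      # => user
    print(repr(std.params))  # => ''
    print(std.port)          # => 29418

Note that the standard library exposes the user information as ``username``
and returns an empty string, rather than ``None``, for missing params.
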
To create a copy of it with new pieces you can use ``copy_with``:

.. code-block:: python

    new_ssh = ssh.copy_with(
        scheme='https',
        userinfo='',
        port=443,
        path='/openstack/glance'
    )
    print(new_ssh.scheme)    # => https
    print(new_ssh.userinfo)  # => None
    # etc.

Strictly Parsing a URI and Applying Validation
``````````````````````````````````````````````

To parse a URI into a convenient named tuple, use ``uri_reference``:

.. code-block:: python

    from rfc3986 import uri_reference

    example = uri_reference('http://example.com')
    email = uri_reference('mailto:user@domain.com')
    ssh = uri_reference('ssh://user@git.openstack.org:29418/openstack/keystone.git')

With a parsed URI you can access data about the components:

.. code-block:: python

    print(example.scheme)  # => http
    print(email.path)      # => user@domain.com
    print(ssh.userinfo)    # => user
    print(ssh.host)        # => git.openstack.org
    print(ssh.port)        # => 29418

It can also parse URIs with unicode present:

.. code-block:: python

    uni = uri_reference(b'http://httpbin.org/get?utf8=\xe2\x98\x83')  # ☃
    print(uni.query)  # => utf8=%E2%98%83

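The relationship between the snowman character, its UTF-8 bytes, and the
percent-encoded form can be checked with the standard library alone. The
sketch below does not use ``rfc3986``; ``snowman`` and ``raw`` are
illustrative variable names.

.. code-block:: python

    from urllib.parse import quote

    snowman = '\u2603'             # the snowman character
    raw = snowman.encode('utf-8')  # its UTF-8 byte sequence
    print(raw)                     # => b'\xe2\x98\x83'
    print(quote(raw))              # => %E2%98%83

Each byte of the UTF-8 sequence becomes one ``%XX`` triplet.
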
With a parsed URI you can also validate it:

.. code-block:: python

    import subprocess

    if ssh.is_valid():
        subprocess.call(['git', 'clone', ssh.unsplit()])

You can also take a parsed URI and normalize it:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM')
    print(mangled.scheme)     # => hTTp
    print(mangled.authority)  # => exAMPLe.COM

    normal = mangled.normalize()
    print(normal.scheme)      # => http
    print(normal.authority)   # => example.com

But these two URIs are (functionally) equivalent:

.. code-block:: python

    import webbrowser

    if normal == mangled:
        webbrowser.open(normal.unsplit())

Your paths, queries, and fragments are safe with us though:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')
    normal = mangled.normalize()
    assert normal == 'hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth'
    assert normal == 'http://example.com/Some/reallY/biZZare/pAth'
    assert normal != 'http://example.com/some/really/bizzare/path'

If you do not actually need a real reference object and just want to normalize
your URI:

.. code-block:: python

    from rfc3986 import normalize_uri

    assert (normalize_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth') ==
            'http://example.com/Some/reallY/biZZare/pAth')

You can also very simply validate a URI:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')

Requiring Components
~~~~~~~~~~~~~~~~~~~~

You can check that a string is a valid URI and, at the same time, require
that specific components be present:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('http://localhost:8774/v2/resource',
                        require_scheme=True,
                        require_authority=True,
                        require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    assert is_valid_uri('mailto:user@example.com', require_authority=True) is False

If you have an instance of a ``URIReference``, you can pass the same arguments
to ``URIReference.is_valid``, for example:

.. code-block:: python

    from rfc3986 import uri_reference

    http = uri_reference('http://localhost:8774/v2/resource')
    assert http.is_valid(require_scheme=True,
                         require_authority=True,
                         require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    mailto = uri_reference('mailto:user@example.com')
    assert mailto.is_valid(require_authority=True) is False

Alternatives
------------

- `rfc3987 <https://pypi.python.org/pypi/rfc3987/1.3.4>`_

  This is a direct competitor to this library, with extra features,
  licensed under the GPL.

- `uritools <https://pypi.python.org/pypi/uritools/0.5.1>`_

  This can parse URIs in the manner of RFC 3986 but provides no validation and
  only recently added Python 3 support.

- Standard library's ``urlparse``/``urllib.parse``

  The functions in these libraries can only split a URI (valid or not) and
  provide no validation.

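To illustrate the last point, the sketch below passes a string containing
unencoded spaces, which RFC 3986 does not allow, to the standard library's
``urllib.parse.urlsplit``. The example string is made up for illustration; the
call splits it without raising an error.

.. code-block:: python

    from urllib.parse import urlsplit

    # Unencoded spaces are not valid in a URI, but urlsplit accepts them.
    parts = urlsplit('http://exa mple.com/some path')
    print(parts.scheme)  # => http
    print(parts.netloc)  # => exa mple.com
    print(parts.path)    # => /some path

Any check for validity therefore has to happen separately from the split.
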
Contributing
------------

This project follows and enforces the Python Software Foundation's `Code of
Conduct <https://www.python.org/psf/codeofconduct/>`_.

If you would like to contribute but do not have a bug or feature in mind, feel
free to email Ian and find out how you can help.

The git repository for this project is maintained at
https://github.com/python-hyper/rfc3986

.. _RFC 3986: http://tools.ietf.org/html/rfc3986
.. _Apache License Version 2.0: https://www.apache.org/licenses/LICENSE-2.0