rfc3986
=======

A Python implementation of `RFC 3986`_ including validation and authority
parsing.

Installation
------------

Use pip to install ``rfc3986`` like so::

    pip install rfc3986

License
-------

`Apache License Version 2.0`_

Example Usage
-------------

The following are the two most common use cases envisioned for ``rfc3986``.

Replacing ``urlparse``
``````````````````````

To parse a URI and receive something very similar to the standard library's
``urllib.parse.urlparse``:

.. code-block:: python

    from rfc3986 import urlparse

    ssh = urlparse('ssh://user@git.openstack.org:29418/openstack/glance.git')
    print(ssh.scheme)  # => ssh
    print(ssh.userinfo)  # => user
    print(ssh.params)  # => None
    print(ssh.port)  # => 29418

To create a copy of it with new pieces you can use ``copy_with``:

.. code-block:: python

    new_ssh = ssh.copy_with(
        scheme='https',
        userinfo='',
        port=443,
        path='/openstack/glance'
    )
    print(new_ssh.scheme)  # => https
    print(new_ssh.userinfo)  # => None
    # etc.

Strictly Parsing a URI and Applying Validation
``````````````````````````````````````````````

To parse a URI into a convenient named tuple, you can simply:

.. code-block:: python

    from rfc3986 import uri_reference

    example = uri_reference('http://example.com')
    email = uri_reference('mailto:user@domain.com')
    ssh = uri_reference('ssh://user@git.openstack.org:29418/openstack/keystone.git')

With a parsed URI you can access data about the components:

.. code-block:: python

    print(example.scheme)  # => http
    print(email.path)  # => user@domain.com
    print(ssh.userinfo)  # => user
    print(ssh.host)  # => git.openstack.org
    print(ssh.port)  # => 29418

It can also parse URIs with unicode present:

.. code-block:: python

    uni = uri_reference(b'http://httpbin.org/get?utf8=\xe2\x98\x83')  # ☃
    print(uni.query)  # utf8=%E2%98%83

With a parsed URI you can also validate it:

.. code-block:: python

    import subprocess

    if ssh.is_valid():
        subprocess.call(['git', 'clone', ssh.unsplit()])

You can also take a parsed URI and normalize it:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM')
    print(mangled.scheme)  # => hTTp
    print(mangled.authority)  # => exAMPLe.COM

    normal = mangled.normalize()
    print(normal.scheme)  # => http
    print(normal.authority)  # => example.com

But these two URIs are (functionally) equivalent:

.. code-block:: python

    import webbrowser

    if normal == mangled:
        webbrowser.open(normal.unsplit())

Your paths, queries, and fragments are safe with us though:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')
    normal = mangled.normalize()
    assert normal == 'hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth'
    assert normal == 'http://example.com/Some/reallY/biZZare/pAth'
    assert normal != 'http://example.com/some/really/bizzare/path'

If you do not actually need a real reference object and just want to normalize
your URI:

.. code-block:: python

    from rfc3986 import normalize_uri

    assert (normalize_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth') ==
            'http://example.com/Some/reallY/biZZare/pAth')

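One place this can be handy is de-duplicating URIs that differ only in the
case of their scheme or host. The following is a minimal sketch rather than
anything the library ships: ``unique_uris`` is a hypothetical helper name and
the sample list is made up for illustration.

.. code-block:: python

    from rfc3986 import normalize_uri


    def unique_uris(uris):
        """Hypothetical helper: normalize each URI and drop duplicates.

        The order of first appearance is preserved.
        """
        seen = set()
        result = []
        for uri in uris:
            normalized = normalize_uri(uri)
            if normalized not in seen:
                seen.add(normalized)
                result.append(normalized)
        return result


    candidates = [
        'hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth',
        'http://example.com/Some/reallY/biZZare/pAth',
    ]
    assert unique_uris(candidates) == [
        'http://example.com/Some/reallY/biZZare/pAth',
    ]

Since the case of a path is left alone, two URIs that differ only in the case
of their paths would remain distinct entries in such a list.
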
You can also very simply validate a URI:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')

Requiring Components
~~~~~~~~~~~~~~~~~~~~

You can validate that a particular string is a valid URI and require that
specific components be present:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('http://localhost:8774/v2/resource',
                        require_scheme=True,
                        require_authority=True,
                        require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    assert is_valid_uri('mailto:user@example.com', require_authority=True) is False

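These keyword arguments can be wrapped in small checks of your own. As a rough
sketch (``looks_like_service_url`` is a hypothetical name, not something the
library provides), a function that accepts only URIs carrying both a scheme
and an authority might look like this:

.. code-block:: python

    from rfc3986 import is_valid_uri


    def looks_like_service_url(value):
        """Hypothetical helper: valid URI with a scheme and an authority."""
        return is_valid_uri(value,
                            require_scheme=True,
                            require_authority=True)


    # Same sample URIs as above: the HTTP URI has an authority, the mailto
    # URI does not.
    assert looks_like_service_url('http://localhost:8774/v2/resource')
    assert not looks_like_service_url('mailto:user@example.com')

Keeping the requirements in one function means callers do not have to repeat
the same set of ``require_*`` flags at every call site.
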
If you have an instance of a ``URIReference``, you can pass the same arguments
to ``URIReference#is_valid``, e.g.,

.. code-block:: python

    from rfc3986 import uri_reference

    http = uri_reference('http://localhost:8774/v2/resource')
    assert http.is_valid(require_scheme=True,
                         require_authority=True,
                         require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    mailto = uri_reference('mailto:user@example.com')
    assert mailto.is_valid(require_authority=True) is False

Alternatives
------------

- `rfc3987 <https://pypi.python.org/pypi/rfc3987/1.3.4>`_

  This is a direct competitor to this library, with extra features,
  licensed under the GPL.

- `uritools <https://pypi.python.org/pypi/uritools/0.5.1>`_

  This can parse URIs in the manner of RFC 3986 but provides no validation and
  only recently added Python 3 support.

- Standard library's ``urlparse``/``urllib.parse``

  The functions in these libraries can only split a URI (valid or not) and
  provide no validation.

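To illustrate the point about the standard library, here is a small sketch
using only ``urllib.parse``. The input string is made up for illustration; it
contains a space and angle brackets, which RFC 3986 does not allow without
percent-encoding, yet it is split without complaint:

.. code-block:: python

    from urllib.parse import urlsplit

    # Made-up input: the space and the angle brackets in the path are not
    # legal in a URI unless they are percent-encoded.
    parts = urlsplit('http://example.com/some path/<oops>')

    # The string is split into components and no error is raised.
    assert parts.scheme == 'http'
    assert parts.netloc == 'example.com'
    assert parts.path == '/some path/<oops>'
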
Contributing
------------

This project follows and enforces the Python Software Foundation's `Code of
Conduct <https://www.python.org/psf/codeofconduct/>`_.

If you would like to contribute but do not have a bug or feature in mind, feel
free to email Ian and find out how you can help.

The git repository for this project is maintained at
https://github.com/python-hyper/rfc3986

.. _RFC 3986: http://tools.ietf.org/html/rfc3986
.. _Apache License Version 2.0: https://www.apache.org/licenses/LICENSE-2.0