This is a slightly modified version of https://pypi.org/project/rfc3986/.
The main reason for the modification is that I need a simple URI class.

The directory hierarchy was rearranged to make it work with ``setup.py``, and
a new ``uri.py`` class was added to the root of the project. A test file,
``uriTest.py``, was added to the ``tests`` folder. The ``doc`` directory was
removed because ``setup.py`` seems to insist on including it in the
distribution, which is not what we want.

Install it with::

    pip install https://tracinsy.ewi.tudelft.nl/pubtrac/Utilities/export/237/uri/dist/uri-1.0.0.tar.gz

The original readme from rfc3986 follows. You don't need it and can ignore it.

rfc3986
=======

A Python implementation of `RFC 3986`_ including validation and authority
parsing.

Installation
------------

Use pip to install ``rfc3986`` like so::

    pip install rfc3986

License
-------

`Apache License Version 2.0`_

Example Usage
-------------

The following are the two most common use cases envisioned for ``rfc3986``.

Replacing ``urlparse``
``````````````````````

To parse a URI and receive something very similar to the standard library's
``urllib.parse.urlparse``:

.. code-block:: python

    from rfc3986 import urlparse

    ssh = urlparse('ssh://user@git.openstack.org:29418/openstack/glance.git')
    print(ssh.scheme)  # => ssh
    print(ssh.userinfo)  # => user
    print(ssh.params)  # => None
    print(ssh.port)  # => 29418

To create a copy of it with new pieces you can use ``copy_with``:

.. code-block:: python

    new_ssh = ssh.copy_with(
        scheme='https',
        userinfo='',
        port=443,
        path='/openstack/glance'
    )
    print(new_ssh.scheme)  # => https
    print(new_ssh.userinfo)  # => None
    # etc.

Strictly Parsing a URI and Applying Validation
``````````````````````````````````````````````

To parse a URI into a convenient named tuple, you can simply:

.. code-block:: python

    from rfc3986 import uri_reference

    example = uri_reference('http://example.com')
    email = uri_reference('mailto:user@domain.com')
    ssh = uri_reference('ssh://user@git.openstack.org:29418/openstack/keystone.git')

With a parsed URI you can access data about the components:

.. code-block:: python

    print(example.scheme)  # => http
    print(email.path)  # => user@domain.com
    print(ssh.userinfo)  # => user
    print(ssh.host)  # => git.openstack.org
    print(ssh.port)  # => 29418

It can also parse URIs with unicode present:

.. code-block:: python

    uni = uri_reference(b'http://httpbin.org/get?utf8=\xe2\x98\x83')  # ☃
    print(uni.query)  # utf8=%E2%98%83

With a parsed URI you can also validate it:

.. code-block:: python

    import subprocess

    if ssh.is_valid():
        subprocess.call(['git', 'clone', ssh.unsplit()])

You can also take a parsed URI and normalize it:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM')
    print(mangled.scheme)  # => hTTp
    print(mangled.authority)  # => exAMPLe.COM

    normal = mangled.normalize()
    print(normal.scheme)  # => http
    print(normal.authority)  # => example.com

But these two URIs are (functionally) equivalent:

.. code-block:: python

    import webbrowser

    if normal == mangled:
        webbrowser.open(normal.unsplit())

Your paths, queries, and fragments are safe with us though:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')
    normal = mangled.normalize()
    assert normal == 'hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth'
    assert normal == 'http://example.com/Some/reallY/biZZare/pAth'
    assert normal != 'http://example.com/some/really/bizzare/path'

If you do not actually need a real reference object and just want to normalize
your URI:

.. code-block:: python

    from rfc3986 import normalize_uri

    assert (normalize_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth') ==
            'http://example.com/Some/reallY/biZZare/pAth')

You can also very simply validate a URI:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')

Requiring Components
~~~~~~~~~~~~~~~~~~~~

You can validate that a particular string is a valid URI and require
independent components:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('http://localhost:8774/v2/resource',
                        require_scheme=True,
                        require_authority=True,
                        require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    assert is_valid_uri('mailto:user@example.com', require_authority=True) is False

If you have an instance of a ``URIReference``, you can pass the same arguments
to ``URIReference.is_valid``, e.g.,

.. code-block:: python

    from rfc3986 import uri_reference

    http = uri_reference('http://localhost:8774/v2/resource')
    assert http.is_valid(require_scheme=True,
                         require_authority=True,
                         require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    mailto = uri_reference('mailto:user@example.com')
    assert mailto.is_valid(require_authority=True) is False

Alternatives
------------

- rfc3987

  This is a direct competitor to this library, with extra features,
  licensed under the GPL.

- uritools

  This can parse URIs in the manner of RFC 3986 but provides no validation and
  only recently added Python 3 support.

- Standard library's ``urlparse``/``urllib.parse``

  The functions in these libraries can only split a URI (valid or not) and
  provide no validation.

Contributing
------------

This project follows and enforces the Python Software Foundation's Code of
Conduct.

If you would like to contribute but do not have a bug or feature in mind, feel
free to email Ian and find out how you can help.

The git repository for this project is maintained at
https://github.com/python-hyper/rfc3986

.. _RFC 3986: http://tools.ietf.org/html/rfc3986
.. _Apache License Version 2.0: https://www.apache.org/licenses/LICENSE-2.0