Building URIs¶
Constructing URLs often seems simple. There are some problems with concatenating strings to build a URL:
Certain parts of the URL disallow certain characters
Formatting some parts of the URL is tricky and doing it manually isn’t fun
To make the experience better rfc3986
provides the
URIBuilder
class to generate valid
URIReference
instances. The
URIBuilder
class will handle ensuring that each
component is normalized and safe for real world use.
Example Usage¶
Note
All of the methods on a URIBuilder
are
chainable (except finalize()
).
Let’s build a basic URL with just a scheme and host. First we create an
instance of URIBuilder
. Then we call
add_scheme()
and
add_host()
with the scheme and host
we want to include in the URL. Then we convert our builder object into
a URIReference
and call
unsplit()
.
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'github.com'
... ).finalize().unsplit())
https://github.com
It is possible to update an existing URI by constructing a builder from an
instance of URIReference
or a textual representation:
>>> from rfc3986 import builder
>>> print(builder.URIBuilder.from_uri("http://github.com").add_scheme(
... 'https'
... ).finalize().unsplit())
https://github.com
Each time you invoke a method, you get a new instance of a
URIBuilder
class so you can build several different
URLs from one base instance.
>>> from rfc3986 import builder
>>> github_builder = builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'api.github.com'
... )
>>> print(github_builder.add_path(
... '/users/sigmavirus24'
... ).finalize().unsplit())
https://api.github.com/users/sigmavirus24
>>> print(github_builder.add_path(
... '/repos/sigmavirus24/rfc3986'
... ).finalize().unsplit())
https://api.github.com/repos/sigmavirus24/rfc3986
rfc3986
makes adding authentication credentials convenient. It takes care of
making the credentials URL safe. There are some characters someone might want
to include in a URL that are not safe for the authority component of a URL.
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'api.github.com'
... ).add_credentials(
... username='us3r',
... password='p@ssw0rd',
... ).finalize().unsplit())
https://us3r:p%40ssw0rd@api.github.com
Further, rfc3986
attempts to simplify the process of adding query parameters
to a URL. For example, if we were using Elasticsearch, we might do something
like:
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'search.example.com'
... ).add_path(
... '_search'
... ).add_query_from(
... [('q', 'repo:sigmavirus24/rfc3986'), ('sort', 'created_at:asc')]
... ).finalize().unsplit())
https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986&sort=created_at%3Aasc
Finally, we provide a way to add a fragment to a URL. Let’s build up a URL to view the section of the RFC that refers to fragments:
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'tools.ietf.org'
... ).add_path(
... '/html/rfc3986'
... ).add_fragment(
... 'section-3.5'
... ).finalize().unsplit())
https://tools.ietf.org/html/rfc3986#section-3.5