Module urlparse

Parse (absolute and relative) URLs.

See RFC 1808: "Relative Uniform Resource Locators", by R. Fielding, UC Irvine, June 1995.

Classes
BaseResult
Base class for the parsed result objects.
SplitResult
ParseResult
Functions
 
clear_cache()
Clear the parse cache.
 
urlparse(url, scheme='', allow_fragments=True)
Parse a URL into 6 components: <scheme>://<netloc>/<path>;<params>?<query>#<fragment>. Return a 6-tuple: (scheme, netloc, path, params, query, fragment).
 
_splitparams(url)
 
_splitnetloc(url, start=0)
 
urlsplit(url, scheme='', allow_fragments=True)
Parse a URL into 5 components: <scheme>://<netloc>/<path>?<query>#<fragment>. Return a 5-tuple: (scheme, netloc, path, query, fragment).
 
urlunparse((scheme, netloc, url, params, query, fragment))
Put a parsed URL back together again.
 
urlunsplit((scheme, netloc, url, query, fragment))
 
urljoin(base, url, allow_fragments=True)
Join a base URL and a possibly relative URL to form an absolute interpretation of the latter.
 
urldefrag(url)
Remove any existing fragment from the URL.
 
test()
Variables
  uses_relative = ['ftp', 'http', 'gopher', 'nntp', 'imap', 'wai...
  uses_netloc = ['ftp', 'http', 'gopher', 'nntp', 'telnet', 'ima...
  non_hierarchical = ['gopher', 'hdl', 'mailto', 'news', 'telnet...
  uses_params = ['ftp', 'hdl', 'prospero', 'http', 'imap', 'http...
  uses_query = ['http', 'wais', 'imap', 'https', 'shttp', 'mms',...
  uses_fragment = ['ftp', 'hdl', 'http', 'gopher', 'news', 'nntp...
  scheme_chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRST...
  MAX_CACHE_SIZE = 20
  _parse_cache = {}
  test_input = '\n http://a/b/c/d\n\n g:h = <UR...
Function Details

urlparse(url, scheme='', allow_fragments=True)

 

Parse a URL into 6 components: <scheme>://<netloc>/<path>;<params>?<query>#<fragment>. Return a 6-tuple: (scheme, netloc, path, params, query, fragment). The components are not broken up into smaller pieces (e.g. netloc is returned as a single string), and % escapes are not expanded.
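As a minimal sketch of this behaviour: the URL below is made up for illustration, and the fallback import from urllib.parse (where these functions live in Python 3) is an assumption about the reader's interpreter, not part of this module.

```python
try:
    from urlparse import urlparse      # Python 2: the module documented here
except ImportError:
    from urllib.parse import urlparse  # assumed Python 3 equivalent

# Hypothetical URL that exercises all six components.
parts = urlparse('http://user@www.example.com:8080/a%20b;type=a?x=1#top')
scheme, netloc, path, params, query, fragment = parts

# netloc stays a single string; the %20 escape in the path is left untouched.
print(scheme, netloc, path, params, query, fragment)
```

The result unpacks like an ordinary 6-tuple, so callers that need the user name or port would have to split netloc themselves.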

urlsplit(url, scheme='', allow_fragments=True)

 

Parse a URL into 5 components: <scheme>://<netloc>/<path>?<query>#<fragment>. Return a 5-tuple: (scheme, netloc, path, query, fragment). The components are not broken up into smaller pieces (e.g. netloc is returned as a single string), and % escapes are not expanded.
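The practical difference from urlparse is that no separate params element is split off. A short sketch, using a made-up URL and assuming the urllib.parse fallback applies on Python 3:

```python
try:
    from urlparse import urlsplit      # Python 2: the module documented here
except ImportError:
    from urllib.parse import urlsplit  # assumed Python 3 equivalent

# Hypothetical URL containing a ;params section.
split = urlsplit('http://www.example.com/a%20b;type=a?x=1#top')
scheme, netloc, path, query, fragment = split

# The ';type=a' part remains inside the path element.
print(path)
```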

urlunparse((scheme, netloc, url, params, query, fragment))

 

Put a parsed URL back together again. The result may be a slightly different but equivalent URL if the URL that was originally parsed had redundant delimiters, e.g. a ? with an empty query (the draft states that these are equivalent).
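The redundant-delimiter case can be sketched as follows. The URLs are illustrative only, and the urllib.parse fallback is an assumption for readers on Python 3.

```python
try:
    from urlparse import urlparse, urlunparse      # Python 2
except ImportError:
    from urllib.parse import urlparse, urlunparse  # assumed Python 3 equivalent

# A plain URL round-trips unchanged.
plain = 'http://example.com/path;p?q=1#frag'
plain_again = urlunparse(urlparse(plain))

# A trailing '?' with an empty query is a redundant delimiter.
original = 'http://example.com/path?'
rebuilt = urlunparse(urlparse(original))

# The rebuilt string differs, but it parses to the same six components.
same_components = tuple(urlparse(rebuilt)) == tuple(urlparse(original))
print(rebuilt, same_components)
```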

urldefrag(url)

 

Remove any existing fragment from the URL.

Returns a tuple of the defragmented URL and the fragment. If the URL contains no fragment, the second element is the empty string.
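A brief sketch of both cases, with made-up URLs and the same assumed urllib.parse fallback for Python 3:

```python
try:
    from urlparse import urldefrag      # Python 2
except ImportError:
    from urllib.parse import urldefrag  # assumed Python 3 equivalent

# URL with a fragment: the fragment is split off.
url, frag = urldefrag('http://example.com/doc.html#section-2')

# URL without a fragment: the second element is the empty string.
bare_url, bare_frag = urldefrag('http://example.com/doc.html')

print(url, frag, bare_url, repr(bare_frag))
```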


Variables Details

uses_relative

Value:
['ftp',
 'http',
 'gopher',
 'nntp',
 'imap',
 'wais',
 'file',
 'https',
...

uses_netloc

Value:
['ftp',
 'http',
 'gopher',
 'nntp',
 'telnet',
 'imap',
 'wais',
 'file',
...

non_hierarchical

Value:
['gopher',
 'hdl',
 'mailto',
 'news',
 'telnet',
 'wais',
 'imap',
 'snews',
...

uses_params

Value:
['ftp',
 'hdl',
 'prospero',
 'http',
 'imap',
 'https',
 'shttp',
 'rtsp',
...

uses_query

Value:
['http',
 'wais',
 'imap',
 'https',
 'shttp',
 'mms',
 'gopher',
 'rtsp',
...

uses_fragment

Value:
['ftp',
 'hdl',
 'http',
 'gopher',
 'news',
 'nntp',
 'wais',
 'https',
...

scheme_chars

Value:
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-.'

test_input

Value:
'''
      http://a/b/c/d

      g:h        = <URL:g:h>
      http:g     = <URL:http://a/b/c/g>
      http:      = <URL:http://a/b/c/d>
      g          = <URL:http://a/b/c/g>
      ./g        = <URL:http://a/b/c/g>
...
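
The visible part of test_input pairs relative references with the absolute URL expected when each is joined against the base http://a/b/c/d. The sketch below checks only the pairs shown above using urljoin; it does not reproduce the module's own test() function, and the urllib.parse fallback is an assumption for Python 3.

```python
try:
    from urlparse import urljoin      # Python 2
except ImportError:
    from urllib.parse import urljoin  # assumed Python 3 equivalent

BASE = 'http://a/b/c/d'

# (relative reference, expected absolute URL), copied from the listing above.
cases = [
    ('g:h', 'g:h'),
    ('http:g', 'http://a/b/c/g'),
    ('http:', 'http://a/b/c/d'),
    ('g', 'http://a/b/c/g'),
    ('./g', 'http://a/b/c/g'),
]

# Join each reference against the base and record the result.
results = [(rel, urljoin(BASE, rel)) for rel, _ in cases]
for (rel, joined), (_, expected) in zip(results, cases):
    print('%-8s -> %-18s expected %s' % (rel, joined, expected))
```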