Module epytext

Parser for epytext strings. Epytext is a lightweight markup whose primary intended application is Python documentation strings. This parser converts Epytext strings to a simple DOM-like representation (encoded as a tree of Element objects and strings). Epytext strings can contain the following structural blocks, named here as in the DTD given further down: para, literalblock, doctestblock, section, ulist, olist, and fieldlist.

Additionally, the following inline regions may be used within para blocks: code, math, index, italic, bold, uri, link, and symbol.

The returned DOM tree will conform to the following Document Type Description:

  <!ENTITY % colorized '(code | math | index | italic |
                         bold | uri | link | symbol)*'>

  <!ELEMENT epytext ((para | literalblock | doctestblock |
                     section | ulist | olist)*, fieldlist?)>

  <!ELEMENT para (#PCDATA | %colorized;)*>

  <!ELEMENT section (para | literalblock | doctestblock |
                     section | ulist | olist)+>

  <!ELEMENT fieldlist (field+)>
  <!ELEMENT field (tag, arg?, (para | literalblock | doctestblock |
                               ulist | olist)+)>
  <!ELEMENT tag (#PCDATA)>
  <!ELEMENT arg (#PCDATA)>
  
  <!ELEMENT literalblock (#PCDATA | %colorized;)*>
  <!ELEMENT doctestblock (#PCDATA)>

  <!ELEMENT ulist (li+)>
  <!ELEMENT olist (li+)>
  <!ELEMENT li (para | literalblock | doctestblock | ulist | olist)+>
  <!ATTLIST li bullet NMTOKEN #IMPLIED>
  <!ATTLIST olist start NMTOKEN #IMPLIED>

  <!ELEMENT uri     (name, target)>
  <!ELEMENT link    (name, target)>
  <!ELEMENT name    (#PCDATA | %colorized;)*>
  <!ELEMENT target  (#PCDATA)>
  
  <!ELEMENT code    (#PCDATA | %colorized;)*>
  <!ELEMENT math    (#PCDATA | %colorized;)*>
  <!ELEMENT italic  (#PCDATA | %colorized;)*>
  <!ELEMENT bold    (#PCDATA | %colorized;)*>
  <!ELEMENT indexed (#PCDATA | %colorized;)*>
  <!ATTLIST code style CDATA #IMPLIED>

  <!ELEMENT symbol (#PCDATA)>

Classes
  Element
      A very simple DOM-like representation for parsed epytext documents.
  Token
      Tokens are an intermediate data structure used while constructing the structuring DOM tree for a formatted docstring.
  TokenizationError
      An error generated while tokenizing a formatted documentation string.
  StructuringError
      An error generated while structuring a formatted documentation string.
  ColorizingError
      An error generated while colorizing a paragraph.
  ParsedEpytextDocstring

Functions
  Element          parse(str, errors=None)
                     Return a DOM tree encoding the contents of an epytext string.
                   _raise_graphs(tree, parent)
                   _pop_completed_blocks(token, stack, indent_stack)
                     Pop any completed blocks off the stack.
                   _add_para(doc, para_token, stack, indent_stack, errors)
                     Colorize the given paragraph, and add it to the DOM tree.
                   _add_section(doc, heading_token, stack, indent_stack, errors)
                     Add a new section to the DOM tree, with the given heading.
                   _add_list(doc, bullet_token, stack, indent_stack, errors)
                     Add a new list item or field to the DOM tree, with the given bullet or field tag.
  int              _tokenize_doctest(lines, start, block_indent, tokens, errors)
                     Construct a Token containing the doctest block starting at lines[start], and append it to tokens.
  int              _tokenize_literal(lines, start, block_indent, tokens, errors)
                     Construct a Token containing the literal block starting at lines[start], and append it to tokens.
  int              _tokenize_listart(lines, start, bullet_indent, tokens, errors)
                     Construct Tokens for the bullet and the first paragraph of the list item (or field) starting at lines[start], and append them to tokens.
  int              _tokenize_para(lines, start, para_indent, tokens, errors)
                     Construct a Token containing the paragraph starting at lines[start], and append it to tokens.
  list of Token    _tokenize(str, errors)
                     Split a given formatted docstring into an ordered list of Tokens, according to the epytext markup rules.
  Element          _colorize(doc, token, errors, tagName='para')
                     Given a string containing the contents of a paragraph, produce a DOM Element encoding that paragraph.
                   _colorize_graph(doc, graph, token, end, errors)
                     Eg: G{classtree}
                   _colorize_link(doc, link, token, end, errors)
  string           to_epytext(tree, indent=0, seclevel=0)
                     Convert a DOM document encoding epytext back to an epytext string.
  string           to_plaintext(tree, indent=0, seclevel=0)
                     Convert a DOM document encoding epytext to a string representation.
  string           to_debug(tree, indent=4, seclevel=0)
                     Convert a DOM document encoding epytext back to an epytext string, annotated with extra debugging information.
  string           to_rst(tree, indent=0, seclevel=0, wrap_startindex=0)
                     Convert a DOM document encoding epytext into a reStructuredText markup string.
  Element          pparse(str, show_warnings=1, show_errors=1, stream=sys.stderr)
                     Pretty-parse the string.
  Element          parse_as_literal(str)
                     Return a DOM document matching the epytext DTD, containing a single literal block.
  Element          parse_as_para(str)
                     Return a DOM document matching the epytext DTD, containing a single paragraph.
  ParsedDocstring  parse_docstring(docstring, errors, **options)
                     Parse the given docstring, which is formatted using epytext, and return a ParsedDocstring representation of its contents.

Variables
  _HEADING_CHARS = '=-~'
  _ESCAPES = {'lb': '{', 'rb': '}'}
  SYMBOLS = ['<-', '->', '^', 'v', 'alpha', 'beta', 'gamma', 'de...
A list of the escape symbols that are supported by epydoc.
  _SYMBOLS = {'->': 1, '<-': 1, '<=': 1, '>=': 1, 'Alpha': 1, 'B...
  __doc__ = __doc__.replace('<<<SYMBOLS>>>', symblist)
  _COLORIZING_TAGS = {'B': 'bold', 'C': 'code', 'E': 'escape', '...
  _LINK_COLORIZING_TAGS = ['link', 'uri']
  _BULLET_RE = re.compile(r'-( +|$)|(\d+\.)+( +|$)|@\w+( [^\{\}:...
  _LIST_BULLET_RE = re.compile(r'-( +|$)|(\d+\.)+( +|$)')
  _FIELD_BULLET_RE = re.compile(r'@\w+( [^\{\}:\n]+)?:')
  _BRACE_RE = re.compile(r'[\{\}]')
  _TARGET_RE = re.compile(r'^(.*?)\s*<(?:URI:|L:)?([^<>]+)>$')
  GRAPH_TYPES = ['classtree', 'packagetree', 'importgraph', 'cal...
  SYMBOL_TO_PLAINTEXT = {'crarr': '\\'}
  SCRWIDTH = 75
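
The _TARGET_RE pattern listed above separates the displayed name of a link from its target. The following is a minimal sketch of how such a pattern can be applied. The helper name split_name_and_target is hypothetical, and the fallback of using the whole content as both name and target when no <...> part is present is an assumption, not a statement about epydoc's internals.

```python
import re

# Pattern copied from the variable summary above.
_TARGET_RE = re.compile(r'^(.*?)\s*<(?:URI:|L:)?([^<>]+)>$')


def split_name_and_target(content):
    """Split link content into (name, target).

    Hypothetical helper: when no explicit <target> is present, the
    content is assumed to serve as both the name and the target.
    """
    match = _TARGET_RE.match(content)
    if match is None:
        return content, content
    name, target = match.group(1), match.group(2)
    # An empty name (content was just "<target>") falls back to the target.
    return (name or target), target
```

The optional URI: and L: prefixes inside the angle brackets are consumed by the pattern and do not appear in the returned target.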
Function Details

parse(str, errors=None)

Return a DOM tree encoding the contents of an epytext string. Any errors generated during parsing will be stored in errors.

Parameters:
  • str (string) - The epytext string to parse.
  • errors (list of ParseError) - A list where any errors generated during parsing will be stored. If no list is specified, then fatal errors will generate exceptions, and non-fatal errors will be ignored.
Returns: Element
a DOM tree encoding the contents of an epytext string.
Raises:
  • ParseError - If errors is None and an error is encountered while parsing.
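
The errors parameter follows a collect-or-raise contract: with a list, every problem is recorded; without one, fatal problems raise and non-fatal ones are dropped. The sketch below illustrates that contract only. ToyParseError and toy_parse are hypothetical stand-ins that check brace balance; they are not part of epydoc and do not reproduce its parser.

```python
class ToyParseError(Exception):
    """Hypothetical stand-in for a parse error with a fatal flag."""

    def __init__(self, message, fatal=True):
        Exception.__init__(self, message)
        self.fatal = fatal


def toy_parse(text, errors=None):
    """Check brace balance and report problems per the errors contract."""
    found = []
    depth = 0
    for ch in text:
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth < 0:
                found.append(ToyParseError('unbalanced "}"', fatal=True))
                depth = 0
    if depth > 0:
        found.append(ToyParseError('unbalanced "{"', fatal=True))
    if text != text.rstrip():
        found.append(ToyParseError('trailing whitespace', fatal=False))

    if errors is None:
        # No list supplied: raise the first fatal error, ignore the rest.
        for err in found:
            if err.fatal:
                raise err
    else:
        # A list was supplied: record everything and keep going.
        errors.extend(found)
    return text.strip()
```

Passing a list lets a caller inspect all problems after a single pass, which is the mode parse_docstring uses according to its parameter description.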

_pop_completed_blocks(token, stack, indent_stack)

Pop any completed blocks off the stack. This includes any blocks that we have dedented past, as well as any list item blocks that we've dedented to. The top element on the stack should only be a list if we're about to start a new list item (i.e., if the next token is a bullet).
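
The popping rule described above can be sketched as follows. This is a simplified illustration, assuming that stack holds block names, that indent_stack holds one integer indentation per block, and that list items are named 'li'; the real function works on Token and Element objects.

```python
def pop_completed(token_indent, stack, indent_stack):
    """Pop blocks closed by a token at token_indent (lists edited in place).

    Hypothetical simplification: stack is a list of block names and
    indent_stack a parallel list of integer indentations.
    """
    while len(stack) > 1:  # never pop the root block
        indent = indent_stack[-1]
        dedented_past = token_indent < indent
        dedented_to_item = token_indent == indent and stack[-1] == 'li'
        if not (dedented_past or dedented_to_item):
            break
        stack.pop()
        indent_stack.pop()
```

With stack = ['epytext', 'ulist', 'li', 'para'] and indent_stack = [0, 2, 2, 4], a token at indentation 2 pops the paragraph and the list item but leaves the list on top, ready for a new bullet.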

_add_list(doc, bullet_token, stack, indent_stack, errors)

Add a new list item or field to the DOM tree, with the given bullet or field tag. When necessary, create the associated list.

_tokenize_doctest(lines, start, block_indent, tokens, errors)

Construct a Token containing the doctest block starting at lines[start], and append it to tokens. block_indent should be the indentation of the doctest block. Any errors generated while tokenizing the doctest block will be appended to errors.

Parameters:
  • lines (list of string) - The list of lines to be tokenized
  • start (int) - The index into lines of the first line of the doctest block to be tokenized.
  • block_indent (int) - The indentation of lines[start]. This is the indentation of the doctest block.
  • errors (list of ParseError) - A list where any errors generated during parsing will be stored. If no list is specified, then errors will generate exceptions.
  • tokens (list of Token)
Returns: int
The line number of the first line following the doctest block.

_tokenize_literal(lines, start, block_indent, tokens, errors)

Construct a Token containing the literal block starting at lines[start], and append it to tokens. block_indent should be the indentation of the literal block. Any errors generated while tokenizing the literal block will be appended to errors.

Parameters:
  • lines (list of string) - The list of lines to be tokenized
  • start (int) - The index into lines of the first line of the literal block to be tokenized.
  • block_indent (int) - The indentation of lines[start]. This is the indentation of the literal block.
  • errors (list of ParseError) - A list of the errors generated by parsing. Any new errors generated while tokenizing this literal block will be appended to this list.
  • tokens (list of Token)
Returns: int
The line number of the first line following the literal block.

_tokenize_listart(lines, start, bullet_indent, tokens, errors)

Construct Tokens for the bullet and the first paragraph of the list item (or field) starting at lines[start], and append them to tokens. bullet_indent should be the indentation of the list item. Any errors generated while tokenizing will be appended to errors.

Parameters:
  • lines (list of string) - The list of lines to be tokenized
  • start (int) - The index into lines of the first line of the list item to be tokenized.
  • bullet_indent (int) - The indentation of lines[start]. This is the indentation of the list item.
  • errors (list of ParseError) - A list of the errors generated by parsing. Any new errors generated while tokenizing this list item will be appended to this list.
  • tokens (list of Token)
Returns: int
The line number of the first line following the list item's first paragraph.

_tokenize_para(lines, start, para_indent, tokens, errors)

Construct a Token containing the paragraph starting at lines[start], and append it to tokens. para_indent should be the indentation of the paragraph. Any errors generated while tokenizing the paragraph will be appended to errors.

Parameters:
  • lines (list of string) - The list of lines to be tokenized
  • start (int) - The index into lines of the first line of the paragraph to be tokenized.
  • para_indent (int) - The indentation of lines[start]. This is the indentation of the paragraph.
  • errors (list of ParseError) - A list of the errors generated by parsing. Any new errors generated while tokenizing this paragraph will be appended to this list.
  • tokens (list of Token)
Returns: int
The line number of the first line following the paragraph.
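
The _tokenize_* helpers share one calling convention: consume lines beginning at start, append a token, and return the index of the first unconsumed line. A minimal sketch of that convention for paragraphs is shown here. It assumes a paragraph ends at a blank line or at any change of indentation, and it represents tokens as plain tuples; the actual function also stops at bullets, literal-block markers, and so on.

```python
def toy_tokenize_para(lines, start, para_indent, tokens):
    """Collect the paragraph at lines[start]; return the next line index.

    Hypothetical simplification: tokens are (tag, startline, indent, text)
    tuples rather than Token objects.
    """
    linenum = start
    collected = []
    while linenum < len(lines):
        line = lines[linenum]
        stripped = line.strip()
        if not stripped:
            break  # a blank line ends the paragraph
        indent = len(line) - len(line.lstrip())
        if indent != para_indent:
            break  # a change of indentation ends the paragraph
        collected.append(stripped)
        linenum += 1
    tokens.append(('para', start, para_indent, ' '.join(collected)))
    return linenum
```

A driver loop in the style of _tokenize can then call the helper repeatedly, using each return value as the next start.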

_tokenize(str, errors)

Split a given formatted docstring into an ordered list of Tokens, according to the epytext markup rules.

Parameters:
  • str (string) - The epytext string
  • errors (list of ParseError) - A list where any errors generated during parsing will be stored. If no list is specified, then errors will generate exceptions.
Returns: list of Token
a list of the Tokens that make up the given string.

_colorize(doc, token, errors, tagName='para')

Given a string containing the contents of a paragraph, produce a DOM Element encoding that paragraph. Colorized regions are represented using DOM Elements, and text is represented using DOM Texts.

Parameters:
  • errors (list of string) - A list of errors. Any newly generated errors will be appended to this list.
  • tagName (string) - The element tag for the DOM Element that should be generated.
Returns: Element
a DOM Element encoding the given paragraph.
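
To make the idea of colorizing concrete, here is a small sketch that turns inline markup such as B{...} into a nested structure. It is not epydoc's algorithm: elements are represented as plain lists of the form [tag, child, ...], only four of the tags from _COLORIZING_TAGS are handled, and braces that do not follow a tag letter are assumed not to occur in the input.

```python
# A subset of the tag letters shown in _COLORIZING_TAGS.
TOY_TAGS = {'B': 'bold', 'C': 'code', 'I': 'italic', 'M': 'math'}


def toy_colorize(text, errors):
    """Return ['para', child, ...]; children are strings or nested lists."""
    root = ['para']
    stack = [root]
    start = 0  # start of the pending run of plain text
    for i, ch in enumerate(text):
        if ch == '{' and i > 0 and text[i - 1] in TOY_TAGS:
            # Flush the text before the tag letter, then open an element.
            if text[start:i - 1]:
                stack[-1].append(text[start:i - 1])
            elem = [TOY_TAGS[text[i - 1]]]
            stack[-1].append(elem)
            stack.append(elem)
            start = i + 1
        elif ch == '}':
            if len(stack) == 1:
                errors.append('unbalanced "}" at offset %d' % i)
            else:
                # Flush the element's trailing text and close it.
                if text[start:i]:
                    stack[-1].append(text[start:i])
                stack.pop()
                start = i + 1
    if len(stack) > 1:
        errors.append('unbalanced "{"')
    if text[start:]:
        stack[-1].append(text[start:])
    return root
```

For the input 'Use I{very} B{bold C{code}} text' the sketch yields a paragraph whose children alternate between strings and nested italic, bold, and code elements.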

_colorize_graph(doc, graph, token, end, errors)

Eg:

 G{classtree}
 G{classtree x, y, z}
 G{importgraph}

to_epytext(tree, indent=0, seclevel=0)

Convert a DOM document encoding epytext back to an epytext string. This is the inverse operation of parse. I.e., assuming there are no errors, the following is true:

  • parse(to_epytext(tree)) == tree

The inverse is also true, except that whitespace, line wrapping, and character escaping may be done differently.

  • to_epytext(parse(str)) == str (approximately)
Parameters:
  • tree (Element) - A DOM document encoding of an epytext string.
  • indent (int) - The indentation for the string representation of tree. Each line of the returned string will begin with indent space characters.
  • seclevel (int) - The section level that tree appears at. This is used to generate section headings.
Returns: string
The epytext string corresponding to tree.

to_plaintext(tree, indent=0, seclevel=0)

Convert a DOM document encoding epytext to a string representation. This representation is similar to the string generated by to_epytext, but to_plaintext removes inline markup, prints escaped characters in unescaped form, etc.

Parameters:
  • tree (Element) - A DOM document encoding of an epytext string.
  • indent (int) - The indentation for the string representation of tree. Each line of the returned string will begin with indent space characters.
  • seclevel (int) - The section level that tree appears at. This is used to generate section headings.
Returns: string
The plaintext string corresponding to tree.
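
Removing inline markup amounts to walking the tree and keeping only its strings. The sketch below illustrates this on a toy tree. ToyElement is a hypothetical stand-in for the module's Element class, and the walk ignores indentation, sections, lists, and symbol substitution, all of which the real function handles.

```python
class ToyElement:
    """Hypothetical stand-in for Element: a tag, children, and attributes."""

    def __init__(self, tag, *children, **attribs):
        self.tag = tag
        self.children = list(children)  # ToyElement objects and strings
        self.attribs = attribs


def toy_plaintext(node):
    """Concatenate the text under node, ending each paragraph with a newline."""
    if isinstance(node, str):
        return node
    text = ''.join(toy_plaintext(child) for child in node.children)
    if node.tag == 'para':
        return text + '\n'
    return text
```

Inline elements such as bold contribute only their text, so the markup disappears while the wording is preserved.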

to_debug(tree, indent=4, seclevel=0)

Convert a DOM document encoding epytext back to an epytext string, annotated with extra debugging information. This function is similar to to_epytext, but it adds explicit information about where different blocks begin, along the left margin.

Parameters:
  • tree (Element) - A DOM document encoding of an epytext string.
  • indent (int) - The indentation for the string representation of tree. Each line of the returned string will begin with indent space characters.
  • seclevel (int) - The section level that tree appears at. This is used to generate section headings.
Returns: string
The epytext string corresponding to tree.

to_rst(tree, indent=0, seclevel=0, wrap_startindex=0)

Convert a DOM document encoding epytext into a reStructuredText markup string. (Because rst is fairly loosely defined, it is possible that this function will produce incorrect output in some cases.)

Parameters:
  • tree (Element) - A DOM document encoding of an epytext string.
  • indent (int) - The indentation for the string representation of tree. Each line of the returned string will begin with indent space characters.
  • seclevel (int) - The section level that tree appears at. This is used to generate section headings.
Returns: string
The reStructuredText string corresponding to tree.

pparse(str, show_warnings=1, show_errors=1, stream=sys.stderr)

Pretty-parse the string. This parses the string, and catches any warnings or errors produced. Any warnings and errors are displayed, and the resulting DOM parse structure is returned.

Parameters:
  • str (string) - The string to parse.
  • show_warnings (boolean) - Whether or not to display non-fatal errors generated by parsing str.
  • show_errors (boolean) - Whether or not to display fatal errors generated by parsing str.
  • stream (stream) - The stream that warnings and errors should be written to.
Returns: Element
a DOM document encoding the contents of str.
Raises:
  • SyntaxError - If any fatal errors were encountered.

parse_as_literal(str)

Return a DOM document matching the epytext DTD, containing a single literal block. That literal block will include the contents of the given string. This method is typically used as a fall-back when the parser fails.

Parameters:
  • str (string) - The string which should be enclosed in a literal block.
Returns: Element
A DOM document containing str in a single literal block.

parse_as_para(str)

Return a DOM document matching the epytext DTD, containing a single paragraph. That paragraph will include the contents of the given string. This can be used to wrap some forms of automatically generated information (such as type names) in paragraphs.

Parameters:
  • str (string) - The string which should be enclosed in a paragraph.
Returns: Element
A DOM document containing str in a single paragraph.

parse_docstring(docstring, errors, **options)

Parse the given docstring, which is formatted using epytext, and return a ParsedDocstring representation of its contents.

Parameters:
  • docstring (string) - The docstring to parse
  • errors (list of ParseError) - A list where any errors generated during parsing will be stored.
  • options - Extra options. Unknown options are ignored. Currently, no extra options are defined.
Returns: ParsedDocstring

Variables Details

SYMBOLS

A list of the escape symbols that are supported by epydoc. Currently the following symbols are supported:
  • S{<-}=←;
  • S{->}=→;
  • S{^}=↑;
  • S{v}=↓;
  • S{alpha}=α;
  • S{beta}=β;
  • S{gamma}=γ;
  • S{delta}=δ;
  • S{epsilon}=ε;
  • S{zeta}=ζ;
  • S{eta}=η;
  • S{theta}=θ;
  • S{iota}=ι;
  • S{kappa}=κ;
  • S{lambda}=λ;
  • S{mu}=μ;
  • S{nu}=ν;
  • S{xi}=ξ;
  • S{omicron}=ο;
  • S{pi}=π;
  • S{rho}=ρ;
  • S{sigma}=σ;
  • S{tau}=τ;
  • S{upsilon}=υ;
  • S{phi}=φ;
  • S{chi}=χ;
  • S{psi}=ψ;
  • S{omega}=ω;
  • S{Alpha}=Α;
  • S{Beta}=Β;
  • S{Gamma}=Γ;
  • S{Delta}=Δ;
  • S{Epsilon}=Ε;
  • S{Zeta}=Ζ;
  • S{Eta}=Η;
  • S{Theta}=Θ;
  • S{Iota}=Ι;
  • S{Kappa}=Κ;
  • S{Lambda}=Λ;
  • S{Mu}=Μ;
  • S{Nu}=Ν;
  • S{Xi}=Ξ;
  • S{Omicron}=Ο;
  • S{Pi}=Π;
  • S{Rho}=Ρ;
  • S{Sigma}=Σ;
  • S{Tau}=Τ;
  • S{Upsilon}=Υ;
  • S{Phi}=Φ;
  • S{Chi}=Χ;
  • S{Psi}=Ψ;
  • S{Omega}=Ω;
  • S{larr}=←;
  • S{rarr}=→;
  • S{uarr}=↑;
  • S{darr}=↓;
  • S{harr}=↔;
  • S{crarr}=↵;
  • S{lArr}=⇐;
  • S{rArr}=⇒;
  • S{uArr}=⇑;
  • S{dArr}=⇓;
  • S{hArr}=⇔;
  • S{copy}=©;
  • S{times}=×;
  • S{forall}=∀;
  • S{exist}=∃;
  • S{part}=∂;
  • S{empty}=∅;
  • S{isin}=∈;
  • S{notin}=∉;
  • S{ni}=∋;
  • S{prod}=∏;
  • S{sum}=∑;
  • S{prop}=∝;
  • S{infin}=∞;
  • S{ang}=∠;
  • S{and}=∧;
  • S{or}=∨;
  • S{cap}=∩;
  • S{cup}=∪;
  • S{int}=∫;
  • S{there4}=∴;
  • S{sim}=∼;
  • S{cong}=≅;
  • S{asymp}=≈;
  • S{ne}=≠;
  • S{equiv}=≡;
  • S{le}=≤;
  • S{ge}=≥;
  • S{sub}=⊂;
  • S{sup}=⊃;
  • S{nsub}=⊄;
  • S{sube}=⊆;
  • S{supe}=⊇;
  • S{oplus}=⊕;
  • S{otimes}=⊗;
  • S{perp}=⊥;
  • S{infinity}=∞;
  • S{integral}=∫;
  • S{product}=∏;
  • S{>=}=≥;
  • S{<=}=≤
Value:
['<-',
 '->',
 '^',
 'v',
 'alpha',
 'beta',
 'gamma',
 'delta',
...
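
As an illustration of how S{...} escapes relate to the characters listed above, the following sketch substitutes a handful of them in a string. TOY_SYMBOLS and toy_expand_symbols are hypothetical; the mapping covers only six entries taken from the list, and unknown names are left untouched.

```python
import re

# A small subset of the symbols listed above, mapped to their characters.
TOY_SYMBOLS = {
    '<-': '\u2190',     # leftwards arrow
    '->': '\u2192',     # rightwards arrow
    'alpha': '\u03b1',  # Greek small letter alpha
    'beta': '\u03b2',   # Greek small letter beta
    '<=': '\u2264',     # less-than or equal to
    '>=': '\u2265',     # greater-than or equal to
}

_TOY_SYMBOL_RE = re.compile(r'S\{([^{}]+)\}')


def toy_expand_symbols(text):
    """Replace each known S{name} in text with its character."""
    def replace(match):
        # Unknown names are returned unchanged.
        return TOY_SYMBOLS.get(match.group(1), match.group(0))
    return _TOY_SYMBOL_RE.sub(replace, text)
```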

_SYMBOLS

Value:
{'->': 1,
 '<-': 1,
 '<=': 1,
 '>=': 1,
 'Alpha': 1,
 'Beta': 1,
 'Chi': 1,
 'Delta': 1,
...

_COLORIZING_TAGS

Value:
{'B': 'bold',
 'C': 'code',
 'E': 'escape',
 'G': 'graph',
 'I': 'italic',
 'L': 'link',
 'M': 'math',
 'S': 'symbol',
...

_BULLET_RE

Value:
re.compile(r'-( +|$)|(\d+\.)+( +|$)|@\w+( [^\{\}:\n]+)?:')
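
The three alternatives in this pattern correspond to unordered list bullets, ordered list bullets, and field tags. A short sketch of classifying a line with it follows. The pattern is copied from the value above; the helper classify_bullet and its return labels are hypothetical.

```python
import re

# Pattern copied from the value shown above.
_BULLET_RE = re.compile(r'-( +|$)|(\d+\.)+( +|$)|@\w+( [^\{\}:\n]+)?:')


def classify_bullet(line):
    """Return 'ulist', 'olist', 'field', or None for a line of text."""
    match = _BULLET_RE.match(line.lstrip())
    if match is None:
        return None
    bullet = match.group(0)
    if bullet.startswith('-'):
        return 'ulist'
    if bullet.startswith('@'):
        return 'field'
    return 'olist'
```

Because a dash must be followed by a space or the end of the line, text such as '-5 degrees' is not treated as a bullet.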

GRAPH_TYPES

Value:
['classtree', 'packagetree', 'importgraph', 'callgraph']