Module rfc822

RFC 2822 message manipulation.

Note: This is only a very rough sketch of a full RFC-822 parser; in particular
the tokenizing of addresses does not adhere to all the quoting rules.

Note: RFC 2822 is a long-awaited update to RFC 822.  This module should
conform to RFC 2822, and is thus misnamed (it's not worth renaming it).  Some
effort at RFC 2822 updates has been made, but a thorough audit has not been
performed.  Consider any RFC 2822 non-conformance to be a bug.

    RFC 2822: http://www.faqs.org/rfcs/rfc2822.html
    RFC 822 : http://www.faqs.org/rfcs/rfc822.html (obsolete)

Directions for use:

To create a Message object: first open a file, e.g.:

  fp = open(file, 'r')

You can use any other legal way of getting an open file object, e.g. use
sys.stdin or call os.popen().  Then pass the open file object to the Message()
constructor:

  m = Message(fp)

This class can work with any input object that supports a readline method.  If
the input object has seek and tell capability, the rewindbody method will
work; also illegal lines will be pushed back onto the input stream.  If the
input object lacks seek but has an `unread' method that can push back a line
of input, Message will use that to push back illegal lines.  Thus this class
can be used to parse messages coming from a buffered stream.

The optional `seekable' argument is provided as a workaround for certain stdio
libraries in which tell() discards buffered data before discovering that the
lseek() system call doesn't work.  For maximum portability, you should set the
seekable argument to zero to prevent that initial tell() when passing in
an unseekable object such as a file object created from a socket object.  If
it is 1 on entry -- which it is by default -- the tell() method of the open
file object is called once; if this raises an exception, seekable is reset to
0.  For other nonzero values of seekable, this test is not made.
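
The rule for the seekable argument can be written out as a small sketch.
probe_seekable is a hypothetical name, and the set of exceptions caught is an
assumption; the sketch only mirrors the behavior described in the paragraph
above.

```python
import io


def probe_seekable(fp, seekable=1):
    """Return the effective seekable flag for fp (hypothetical helper).

    Only the default value 1 triggers the tell() probe; 0 and any other
    nonzero value are returned unchanged without touching the stream.
    """
    if seekable == 1:
        try:
            fp.tell()
        except (AttributeError, IOError, OSError):
            seekable = 0             # tell() failed: treat as unseekable
    return seekable


class LineSource(object):
    """A minimal unseekable input: readline() only, no tell() or seek()."""

    def readline(self):
        return ''


flag_for_stringio = probe_seekable(io.StringIO(u"Subject: hi\n"))
flag_for_plain = probe_seekable(LineSource())
flag_forced = probe_seekable(LineSource(), seekable=2)
```

Passing seekable=0 up front skips the probe entirely, which is the portable
choice for socket-backed file objects.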

To get the text of a particular header there are several methods:

  str = m.getheader(name)
  str = m.getrawheader(name)

where name is the name of the header, e.g. 'Subject'.  The difference is that
getheader() strips the leading and trailing whitespace, while getrawheader()
doesn't.  Both functions retain embedded whitespace (including newlines)
exactly as they are specified in the header, and leave the case of the text
unchanged.
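
The difference between the two accessors can be illustrated on a plain list
of header lines.  get_raw and get_stripped are hypothetical stand-ins written
for this sketch; they are not the Message methods themselves and assume the
header lines have already been read.

```python
def get_raw(header_lines, name):
    """Return the text after 'name:' including continuation lines, or None."""
    prefix = name.lower() + ':'
    collected = []
    for line in header_lines:
        if collected and line[:1] in (' ', '\t'):
            collected.append(line)               # continuation line
        elif collected:
            break                                # next header begins
        elif line.lower().startswith(prefix):
            collected.append(line[len(prefix):]) # drop "name:" only
    return ''.join(collected) if collected else None


def get_stripped(header_lines, name):
    """Like get_raw(), but with leading and trailing whitespace removed."""
    raw = get_raw(header_lines, name)
    return None if raw is None else raw.strip()


lines = [
    "From: a@example.org\n",
    "Subject:  Hello\n",
    "\tworld \n",
    "To: b@example.org\n",
]
raw = get_raw(lines, 'subject')
stripped = get_stripped(lines, 'subject')
```

Here raw keeps the two leading spaces and the trailing newline, while
stripped drops them; the embedded newline and tab survive in both.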

For addresses and address lists there are functions

  realname, mailaddress = m.getaddr(name)
  list = m.getaddrlist(name)

where the latter returns a list of (realname, mailaddr) tuples.
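
The shape of these return values can be tried without a Message object.  The
sketch below assumes the standard-library email.utils module, which offers
parseaddr() and getaddresses() on Python 3 (where the rfc822 module is not
available); it operates on header values directly rather than on a header
name.

```python
from email.utils import getaddresses, parseaddr

# A single address yields one (realname, mailaddr) pair.
realname, mailaddr = parseaddr("Jane Doe <jane@example.org>")

# An address list yields a list of such pairs; a bare address has an
# empty realname.
pairs = getaddresses(["Jane Doe <jane@example.org>, bob@example.org"])
```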

There is also a method

  time = m.getdate(name)

which parses a Date-like field and returns a time-compatible tuple,
i.e. a tuple such as returned by time.localtime() or accepted by
time.mktime().
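
A short sketch of the resulting tuple, assuming email.utils.parsedate() from
the standard library as a stand-in for getdate() (it takes the header value
rather than the header name):

```python
import time
from email.utils import parsedate

date_tuple = parsedate("Sun, 06 Nov 1994 08:49:37 GMT")

# The first six fields are year, month, day, hour, minute, second.
year, month, day, hour, minute, second = date_tuple[:6]

# The tuple can be handed to time.mktime(), which interprets the fields
# as local time, so the result depends on the machine's timezone.
local_seconds = time.mktime(date_tuple)
```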

See the class definition for lower level access methods.

There are also some utility functions here.

Classes
Message
Represents a single RFC 2822-compliant message.
AddrlistClass
Address parser class by Ben Escoto.
AddressList
An AddressList encapsulates a list of parsed RFC 2822 addresses.
Functions
unquote(s)
Remove quotes from a string.
quote(s)
Add quotes around a string.
parseaddr(address)
Parse an address into a (realname, mailaddr) tuple.
dump_address_pair(pair)
Dump a (name, address) pair in a canonicalized form.
parsedate_tz(data)
Convert a date string to a time tuple.
parsedate(data)
Convert a time string to a time tuple.
mktime_tz(data)
Turn a 10-tuple as returned by parsedate_tz() into a UTC timestamp.
formatdate(timeval=None)
Returns time format preferred for Internet standards.
Variables
  _blanklines = ('\r\n', '\n')
  _monthnames = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul'...
  _daynames = ['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun']
  _timezones = {'ADT': -300, 'AST': -400, 'CDT': -500, 'CST': -6...

Imports: time


Function Details

parsedate_tz(data)

Convert a date string to a time tuple.

Accounts for military timezones.
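
How parsedate_tz() and mktime_tz() fit together can be sketched with the
same-named functions from email.utils, used here as an assumption because the
rfc822 module is not available on Python 3.  The tenth item of the tuple is
the zone offset in seconds.

```python
import calendar
from email.utils import mktime_tz, parsedate_tz

parsed = parsedate_tz("Sun, 06 Nov 1994 08:49:37 -0500")

# Tenth item: offset from UTC in seconds (-0500 is five hours west).
offset_seconds = parsed[9]

# mktime_tz() folds the offset in and returns a UTC timestamp.
utc_timestamp = mktime_tz(parsed)

# Cross-check: 08:49:37 at -0500 is 13:49:37 UTC on the same day.
expected = calendar.timegm((1994, 11, 6, 13, 49, 37, 0, 0, 0))
```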

formatdate(timeval=None)

Returns time format preferred for Internet standards.

Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123

According to RFC 1123, day and month names must always be in English. If not for that, this code could use strftime(). It can't because strftime() honors the locale and could generate non-English names.
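
A locale-independent formatter along these lines can be sketched with fixed
English name tables.  This is an illustration, not the module's source:
format_rfc1123, _DAY_NAMES, and _MONTH_NAMES are hypothetical names chosen
for this sketch.

```python
import time

# Fixed English names, indexed the same way time.gmtime() reports them:
# tm_wday runs 0 (Monday) to 6 (Sunday), tm_mon runs 1 to 12.
_DAY_NAMES = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
_MONTH_NAMES = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def format_rfc1123(timeval=None):
    """Format seconds since the epoch as an RFC 1123 date in GMT."""
    if timeval is None:
        timeval = time.time()
    t = time.gmtime(timeval)
    return "%s, %02d %s %04d %02d:%02d:%02d GMT" % (
        _DAY_NAMES[t.tm_wday], t.tm_mday, _MONTH_NAMES[t.tm_mon - 1],
        t.tm_year, t.tm_hour, t.tm_min, t.tm_sec)


stamp = format_rfc1123(784111777)   # 'Sun, 06 Nov 1994 08:49:37 GMT'
```

Because the names come from the tables rather than from strftime(), the
output does not change with the process locale.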


Variables Details

_monthnames

Value:
['jan',
 'feb',
 'mar',
 'apr',
 'may',
 'jun',
 'jul',
 'aug',
...

_timezones

Value:
{'ADT': -300,
 'AST': -400,
 'CDT': -500,
 'CST': -600,
 'EDT': -400,
 'EST': -500,
 'GMT': 0,
 'MDT': -600,
...