Module codecs
codecs -- Python Codec Registry, API and helpers.
Written by Marc-Andre Lemburg (mal@lemburg.com).
(c) Copyright CNRI, All Rights Reserved. NO WARRANTY.
CodecInfo
|
Codec
Defines the interface for stateless encoders/decoders.
|
IncrementalEncoder
An IncrementalEncoder encodes an input in multiple steps.
|
BufferedIncrementalEncoder
This subclass of IncrementalEncoder can be used as the base class
for an incremental encoder if the encoder must keep some of the
output in a buffer between calls to encode().
|
IncrementalDecoder
An IncrementalDecoder decodes an input in multiple steps.
|
BufferedIncrementalDecoder
This subclass of IncrementalDecoder can be used as the base class
for an incremental decoder if the decoder must be able to handle
incomplete byte sequences.
|
StreamWriter
|
StreamReader
|
StreamReaderWriter
StreamReaderWriter instances allow wrapping streams which work in
both read and write modes.
|
StreamRecoder
StreamRecoder instances provide a frontend - backend view of
encoding data.
|
open(filename, mode='rb', encoding=None, errors='strict', buffering=1)
Open an encoded file using the given mode and return a wrapped
version providing transparent encoding/decoding.
|
EncodedFile(file, data_encoding, file_encoding=None, errors='strict')
Return a wrapped version of file which provides transparent encoding
translation.
|
getencoder(encoding)
Look up the codec for the given encoding and return its encoder
function.
|
getdecoder(encoding)
Look up the codec for the given encoding and return its decoder
function.
|
getincrementalencoder(encoding)
Look up the codec for the given encoding and return its
IncrementalEncoder class or factory function.
|
getincrementaldecoder(encoding)
Look up the codec for the given encoding and return its
IncrementalDecoder class or factory function.
|
getreader(encoding)
Look up the codec for the given encoding and return its
StreamReader class or factory function.
|
getwriter(encoding)
Look up the codec for the given encoding and return its
StreamWriter class or factory function.
|
iterencode(iterator, encoding, errors='strict', **kwargs)
Encoding iterator.
|
iterdecode(iterator, encoding, errors='strict', **kwargs)
Decoding iterator.
|
make_identity_dict(rng) -> dict
Return a dictionary where elements of the rng sequence are mapped to
themselves.
|
xmlcharrefreplace_errors(...)
|
backslashreplace_errors(...)
|
lookup(encoding) -> (encoder, decoder, stream_reader, stream_writer)
Looks up a codec tuple in the Python codec registry and returns a
tuple of functions.
|
lookup_error(errors) -> handler
Return the error handler for the specified error handling name or
raise a LookupError if no handler exists under this name.
|
register(search_function)
Register a codec search function.
|
register_error(errors, handler)
Register the specified error handler under the name errors.
|
BOM_UTF8 = '\xef\xbb\xbf'
BOM_UTF16_LE = '\xff\xfe'
BOM_LE = '\xff\xfe'
BOM_UTF16_BE = '\xfe\xff'
BOM_BE = '\xfe\xff'
BOM_UTF32_LE = '\xff\xfe\x00\x00'
BOM_UTF32_BE = '\x00\x00\xfe\xff'
BOM_UTF16 = '\xff\xfe'
BOM = '\xff\xfe'
BOM_UTF32 = '\xff\xfe\x00\x00'
BOM32_LE = '\xff\xfe'
BOM32_BE = '\xfe\xff'
BOM64_LE = '\xff\xfe\x00\x00'
BOM64_BE = '\x00\x00\xfe\xff'
_false = 0
|
Imports:
__builtin__,
sys,
encodings,
ascii_decode,
ascii_encode,
charbuffer_encode,
charmap_build,
charmap_decode,
charmap_encode,
decode,
encode,
escape_decode,
escape_encode,
latin_1_decode,
latin_1_encode,
raw_unicode_escape_decode,
raw_unicode_escape_encode,
readbuffer_encode,
unicode_escape_decode,
unicode_escape_encode,
unicode_internal_decode,
unicode_internal_encode,
utf_16_be_decode,
utf_16_be_encode,
utf_16_decode,
utf_16_encode,
utf_16_ex_decode,
utf_16_le_decode,
utf_16_le_encode,
utf_7_decode,
utf_7_encode,
utf_8_decode,
utf_8_encode
open(filename, mode='rb', encoding=None, errors='strict', buffering=1)
Open an encoded file using the given mode and return a wrapped version
providing transparent encoding/decoding.
Note: The wrapped version will only accept the object format defined
by the codecs, i.e. Unicode objects for most builtin codecs. Output is
also codec dependent and will usually be Unicode as well.
Files are always opened in binary mode, even if no binary mode was
specified. This is done to avoid data loss due to encodings using 8-bit
values. The default file mode is 'rb' meaning to open the file in binary
read mode.
encoding specifies the encoding which is to be used for the file.
errors may be given to define the error handling. It defaults to
'strict' which causes ValueErrors to be raised in case an encoding error
occurs.
buffering has the same meaning as for the builtin open() API. It
defaults to line buffered.
The returned wrapped file object provides an extra attribute .encoding
which allows querying the used encoding. This attribute is only available
if an encoding was specified as parameter.
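
The behaviour described above can be illustrated with a short round trip. This is a minimal sketch, assuming a Python 3 interpreter (where text is str and file contents are bytes); the temporary path and the sample text are made up for illustration.

```python
import codecs
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text written through the wrapper is encoded to UTF-8 on the way out.
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write("Grüße\n")

# Reading through the wrapper decodes the bytes back to text.
with codecs.open(path, "r", encoding="utf-8") as f:
    text = f.read()
    used_encoding = f.encoding  # the extra attribute mentioned above

# The file itself holds the encoded bytes; it was opened in binary mode.
with open(path, "rb") as f:
    raw = f.read()
```

Here raw contains the UTF-8 byte sequence, while text is the decoded string that was originally written.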
|
EncodedFile(file, data_encoding, file_encoding=None, errors='strict')
Return a wrapped version of file which provides transparent encoding
translation.
Strings written to the wrapped file are interpreted according to the
given data_encoding and then written to the original file as string using
file_encoding. The intermediate encoding will usually be Unicode but
depends on the specified codecs.
Strings are read from the file using file_encoding and then passed
back to the caller as string using data_encoding.
If file_encoding is not given, it defaults to data_encoding.
errors may be given to define the error handling. It defaults to
'strict' which causes ValueErrors to be raised in case an encoding error
occurs.
The returned wrapped file object provides two extra attributes
.data_encoding and .file_encoding which reflect the given parameters of
the same name. The attributes can be used for introspection by Python
programs.
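
As a rough illustration of the translation described above, the following sketch wraps an in-memory stream. It assumes Python 3, where the caller side of the wrapper works with bytes in data_encoding; the choice of UTF-8 and Latin-1 is arbitrary.

```python
import codecs
import io

# Caller side speaks UTF-8, the underlying file stores Latin-1.
backing = io.BytesIO()
wrapped = codecs.EncodedFile(backing, data_encoding="utf-8",
                             file_encoding="latin-1")

# Bytes written are decoded as UTF-8, then re-encoded as Latin-1.
wrapped.write("café".encode("utf-8"))
stored = backing.getvalue()

# Reading reverses the translation: Latin-1 on disk, UTF-8 to the caller.
reader = codecs.EncodedFile(io.BytesIO(stored), data_encoding="utf-8",
                            file_encoding="latin-1")
returned = reader.read()

# The two introspection attributes mirror the constructor arguments.
encodings = (wrapped.data_encoding, wrapped.file_encoding)
```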
|
getencoder(encoding)
Look up the codec for the given encoding and return its encoder
function.
Raises a LookupError in case the encoding cannot be found.
|
getdecoder(encoding)
Look up the codec for the given encoding and return its decoder
function.
Raises a LookupError in case the encoding cannot be found.
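
A small sketch of how the returned functions might be used, assuming Python 3. The stateless encoder and decoder each return a pair: the converted object and the amount of input consumed. The encoding name "no-such-encoding" is deliberately invalid.

```python
import codecs

encode = codecs.getencoder("utf-8")
decode = codecs.getdecoder("utf-8")

# Each call returns (result, length of input consumed).
data, chars_consumed = encode("héllo")
text, bytes_consumed = decode(data)

# An unknown encoding name raises LookupError.
try:
    codecs.getencoder("no-such-encoding")
    lookup_failed = False
except LookupError:
    lookup_failed = True
```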
|
getincrementalencoder(encoding)
Look up the codec for the given encoding and return its
IncrementalEncoder class or factory function.
Raises a LookupError in case the encoding cannot be found or the
codec doesn't provide an incremental encoder.
|
getincrementaldecoder(encoding)
Look up the codec for the given encoding and return its
IncrementalDecoder class or factory function.
Raises a LookupError in case the encoding cannot be found or the
codec doesn't provide an incremental decoder.
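
The value of the incremental classes shows up when a multi-byte sequence is split across calls. The sketch below assumes Python 3 and uses UTF-8 purely as an example encoding.

```python
import codecs

decoder_factory = codecs.getincrementaldecoder("utf-8")
decoder = decoder_factory(errors="strict")

# "é" is two bytes in UTF-8; feed them one at a time.
encoded = "é".encode("utf-8")
first = decoder.decode(encoded[:1])               # incomplete, kept in the buffer
second = decoder.decode(encoded[1:], final=True)  # sequence completed

encoder_factory = codecs.getincrementalencoder("utf-8")
encoder = encoder_factory()
pieces = [encoder.encode("a"), encoder.encode("é", final=True)]
```

The first call returns an empty string because the decoder holds the incomplete byte until the rest of the sequence arrives.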
|
getreader(encoding)
Look up the codec for the given encoding and return its StreamReader
class or factory function.
Raises a LookupError in case the encoding cannot be found.
|
getwriter(encoding)
Look up the codec for the given encoding and return its StreamWriter
class or factory function.
Raises a LookupError in case the encoding cannot be found.
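
A minimal sketch of wrapping a byte stream with the classes returned by getreader() and getwriter(), assuming Python 3 and an in-memory io.BytesIO in place of a real file.

```python
import codecs
import io

# The writer class wraps a binary stream and encodes text written to it.
sink = io.BytesIO()
writer = codecs.getwriter("utf-16-le")(sink)
writer.write("hi")
stored = sink.getvalue()

# The reader class wraps a binary stream and decodes what it reads.
reader = codecs.getreader("utf-16-le")(io.BytesIO(stored))
text = reader.read()
```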
|
iterencode(iterator, encoding, errors='strict', **kwargs)
Encoding iterator.
Encodes the input strings from the iterator using an
IncrementalEncoder.
errors and kwargs are passed through to the IncrementalEncoder
constructor.
|
iterdecode(iterator, encoding, errors='strict', **kwargs)
Decoding iterator.
Decodes the input strings from the iterator using an
IncrementalDecoder.
errors and kwargs are passed through to the IncrementalDecoder
constructor.
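
Both iterators can be sketched with small in-memory sequences. This assumes Python 3; the chunk boundaries are chosen so that one UTF-8 sequence is split across two input chunks.

```python
import codecs

# Encode a sequence of text chunks lazily.
text_chunks = ["ab", "cé"]
encoded_chunks = list(codecs.iterencode(text_chunks, "utf-8"))

# Decode byte chunks where "é" (b"\xc3\xa9") straddles a chunk boundary.
byte_chunks = [b"c\xc3", b"\xa9d"]
decoded = "".join(codecs.iterdecode(byte_chunks, "utf-8"))
```

Because an incremental decoder is used, the split sequence is reassembled rather than reported as an error.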
|
make_encoding_map(decoding_map)
Creates an encoding map from a decoding map.
If a target mapping in the decoding map occurs multiple times, then
that target is mapped to None (undefined mapping), causing an exception
when encountered by the charmap codec during translation.
One example where this happens is cp875.py which decodes multiple
characters to \u001a.
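
The inversion rule can be seen with a tiny decoding map. The map below is a made-up fragment for illustration, not an excerpt from cp875.py.

```python
import codecs

# Hypothetical decoding map: two byte values decode to the same target 0x1A.
decoding_map = {
    0x41: 0x0391,
    0x42: 0x0392,
    0x3F: 0x1A,
    0x7F: 0x1A,
}

encoding_map = codecs.make_encoding_map(decoding_map)

# Unique targets are inverted; the duplicated target becomes None.
unique_entry = encoding_map[0x0391]
ambiguous_entry = encoding_map[0x1A]
```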
|
register(search_function)
Register a codec search function. Search functions are expected to
take one argument, the encoding name in all lower case letters, and
return a tuple of functions (encoder, decoder, stream_reader,
stream_writer).
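
A search function might look like the sketch below, written for Python 3. The encoding name "demolatin" and the function demo_search are hypothetical; instead of building the four functions by hand, the sketch reuses the CodecInfo object (a tuple subclass) that lookup() returns for Latin-1.

```python
import codecs

def demo_search(name):
    # Called with the normalized encoding name; return None for
    # names this function does not handle.
    if name == "demolatin":
        return codecs.lookup("latin-1")
    return None

codecs.register(demo_search)

# The hypothetical name now resolves through the registry.
encoded = "é".encode("demolatin")
decoded = b"\xe9".decode("demolatin")
```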
|
register_error(errors, handler)
Register the specified error handler under the name errors. handler
must be a callable object that will be called with an exception instance
containing information about the location of the encoding/decoding error,
and must return a (replacement, new position) tuple.
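
A handler following this contract could be sketched as below, assuming Python 3. The handler name "demo_underscore" and the function underscore_errors are made up for illustration; the handler replaces each unencodable character with an underscore and resumes after the failing span.

```python
import codecs

def underscore_errors(exc):
    # exc.start and exc.end delimit the part of the input that failed.
    if isinstance(exc, UnicodeEncodeError):
        replacement = "_" * (exc.end - exc.start)
        return (replacement, exc.end)  # (replacement, new position)
    raise exc

codecs.register_error("demo_underscore", underscore_errors)

result = "naïve café".encode("ascii", errors="demo_underscore")

# The same handler can be retrieved by name with lookup_error().
registered = codecs.lookup_error("demo_underscore")
```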
|