Current File : //usr/lib/python2.7/site-packages/kitchen/text/misc.pyc

---------------------------------------------
Miscellaneous functions for manipulating text
---------------------------------------------

Collection of text functions that don't fit in another category.

guess_encoding(byte_string, disable_chardet=False)

    Try to guess the encoding of a byte :class:`str`

    :arg byte_string: byte :class:`str` to guess the encoding of
    :kwarg disable_chardet: If this is True, we never attempt to use
        :mod:`chardet` to guess the encoding.  This is useful if you need to
        have reproducibility whether :mod:`chardet` is installed or not.
        Default: :data:`False`.
    :raises TypeError: if :attr:`byte_string` is not a byte :class:`str` type
    :returns: string containing a guess at the encoding of
        :attr:`byte_string`.  This is appropriate to pass as the encoding
        argument when encoding and decoding unicode strings.

    We start by attempting to decode the byte :class:`str` as :term:`UTF-8`.
    If this succeeds we tell the world it's :term:`UTF-8` text.  If it doesn't
    and :mod:`chardet` is installed on the system and :attr:`disable_chardet`
    is False this function will use it to try detecting the encoding of
    :attr:`byte_string`.  If it is not installed or :mod:`chardet` cannot
    determine the encoding with a high enough confidence then we rather
    arbitrarily claim that it is ``latin-1``.  Since every byte value maps to
    a ``latin-1`` character, decoding from ``latin-1`` to :class:`unicode`
    will not raise :exc:`UnicodeError`, although the output might be mangled.
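
The fallback chain described above can be sketched in Python 3 terms. This is an illustrative reimplementation, not the kitchen source: the name `guess_encoding_sketch`, the `detector` parameter, and the `threshold` default of 0.6 are assumptions made for the example. `detector` stands in for a function such as `chardet.detect`, which returns a dict with `encoding` and `confidence` keys; passing `None` plays the role of `disable_chardet=True`.

```python
def guess_encoding_sketch(byte_string, detector=None, threshold=0.6):
    """Return a best-guess encoding name for a bytes object.

    detector is a hypothetical stand-in for chardet.detect; threshold is
    an assumed confidence cutoff.
    """
    if not isinstance(byte_string, bytes):
        raise TypeError('byte_string must be a bytes object')

    # Step 1: if the bytes decode cleanly as UTF-8, report UTF-8.
    try:
        byte_string.decode('utf-8', 'strict')
        return 'utf-8'
    except UnicodeDecodeError:
        pass

    # Step 2: ask the optional detector and trust it only above the cutoff.
    if detector is not None:
        info = detector(byte_string)
        if info.get('encoding') and info.get('confidence', 0) >= threshold:
            return info['encoding']

    # Step 3: fall back to latin-1, which can decode any byte sequence.
    return 'latin-1'
```

Injecting the detector keeps the sketch free of third-party imports and makes the result reproducible whether or not a detection library is installed.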

str_eq(str1, str2, encoding='utf-8', errors='replace')

    Compare two strings, converting to byte :class:`str` if one is
    :class:`unicode`

    :arg str1: First string to compare
    :arg str2: Second string to compare
    :kwarg encoding: If we need to convert one string into a byte :class:`str`
        to compare, the encoding to use.  Default is :term:`utf-8`.
    :kwarg errors: What to do if we encounter errors when encoding the string.
        See the :func:`kitchen.text.converters.to_bytes` documentation for
        possible values.  The default is ``replace``.

    This function prevents :exc:`UnicodeError` (python-2.4 or less) and
    :exc:`UnicodeWarning` (python 2.5 and higher) when we compare
    a :class:`unicode` string to a byte :class:`str`.  The errors normally
    arise because the conversion is done to :term:`ASCII`.  This function
    lets you convert to :term:`utf-8` or another encoding instead.

    .. note::

        When we need to convert one of the strings from :class:`unicode` in
        order to compare them we convert the :class:`unicode` string into
        a byte :class:`str`.  That means that strings can compare differently
        if you use different encodings for each.

    Note that ``str1 == str2`` is faster than this function if you can accept
    the following limitations:

    * Limited to python-2.5+ (otherwise a :exc:`UnicodeDecodeError` may be
      thrown)
    * Will generate a :exc:`UnicodeWarning` if non-:term:`ASCII` byte
      :class:`str` is compared to :class:`unicode` string.
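
A rough Python 3 analogue of the comparison rule is shown here. It is a sketch under stated assumptions, not the kitchen implementation: `str_eq_sketch` is a hypothetical name, and Python 3's `str` and `bytes` stand in for Python 2's `unicode` and byte `str`. When the two arguments differ in type, the text side is encoded and the comparison is made on bytes.

```python
def str_eq_sketch(str1, str2, encoding='utf-8', errors='replace'):
    """Compare two strings, encoding the text one if the types differ."""
    if isinstance(str1, str) and isinstance(str2, bytes):
        # Encode the text side so both operands are bytes.
        str1 = str1.encode(encoding, errors)
    elif isinstance(str1, bytes) and isinstance(str2, str):
        str2 = str2.encode(encoding, errors)
    return str1 == str2
```

As the note in the docstring warns, the result depends on the encoding: the same pair of strings can compare equal under one encoding and unequal under another.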

process_control_chars(string, strategy='replace')

    Look for and transform :term:`control characters` in a string

    :arg string: string to search for and transform :term:`control characters`
        within
    :kwarg strategy: XML does not allow :term:`ASCII` :term:`control
        characters`.  When we encounter those we need to know what to do.
        Valid options are:

        :replace: (default) Replace the :term:`control characters`
            with ``"?"``
        :ignore: Remove the characters altogether from the output
        :strict: Raise a :exc:`~kitchen.text.exceptions.ControlCharError` when
            we encounter a control character
    :raises TypeError: if :attr:`string` is not a unicode string.
    :raises ValueError: if the strategy is not one of replace, ignore, or
        strict.
    :raises kitchen.text.exceptions.ControlCharError: if the strategy is
        ``strict`` and a :term:`control character` is present in the
        :attr:`string`
    :returns: :class:`unicode` string with no :term:`control characters` in
        it.
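
The three strategies can be illustrated with a translation table in Python 3. This is a minimal sketch rather than the library's code: `process_control_chars_sketch`, `CONTROL_CODES`, and the local `ControlCharError` class are stand-ins introduced for the example, and the set of control codes is an assumption based on XML 1.0, which permits tab (9), line feed (10), and carriage return (13) but no other code point below 32.

```python
# Assumed set of disallowed code points: everything below 32 except
# tab (9), line feed (10), and carriage return (13).
CONTROL_CODES = frozenset(set(range(0, 9)) | {11, 12} | set(range(14, 32)))


class ControlCharError(Exception):
    """Local stand-in for kitchen.text.exceptions.ControlCharError."""


def process_control_chars_sketch(string, strategy='replace'):
    if not isinstance(string, str):
        raise TypeError('string must be a text (unicode) string')
    if strategy == 'ignore':
        # Mapping a code point to None deletes it during translate().
        table = dict.fromkeys(CONTROL_CODES)
    elif strategy == 'replace':
        table = dict.fromkeys(CONTROL_CODES, '?')
    elif strategy == 'strict':
        if any(ord(char) in CONTROL_CODES for char in string):
            raise ControlCharError('ASCII control code present in string input')
        return string
    else:
        raise ValueError('strategy must be one of ignore, replace, or strict')
    return string.translate(table)
```

For example, `process_control_chars_sketch('a\x07b')` replaces the bell character with `?`, while `strategy='ignore'` drops it.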

html_entities_unescape(string)

    Substitute unicode characters for HTML entities

    :arg string: :class:`unicode` string to substitute out html entities
    :raises TypeError: if something other than a :class:`unicode` string is
        given
    :rtype: :class:`unicode` string
    :returns: The plain text without html entities
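
The substitution can be approximated in Python 3 with a regular expression and the standard `html.entities` table. This is a hedged sketch, not the original function: `html_entities_unescape_sketch` is a hypothetical name, the pattern is copied from the one visible in the compiled module, and removing matched tags entirely is an assumption about the intended behaviour. Unknown or malformed entities are left untouched.

```python
import re
from html.entities import name2codepoint

# Matches either a whole tag or a named/numeric character reference.
_ENTITY_RE = re.compile(r'(?s)<[^>]*>|&#?\w+;')


def html_entities_unescape_sketch(string):
    if not isinstance(string, str):
        raise TypeError('string must be a text (unicode) string')

    def fixup(match):
        text = match.group(0)
        if text.startswith('<'):
            return ''  # drop markup tags
        if text.startswith('&#'):
            try:
                if text.startswith('&#x'):
                    return chr(int(text[3:-1], 16))  # hexadecimal reference
                return chr(int(text[2:-1]))          # decimal reference
            except (ValueError, OverflowError):
                return text
        codepoint = name2codepoint.get(text[1:-1])
        if codepoint is not None:
            return chr(codepoint)
        return text  # unknown entity: leave as-is

    return _ENTITY_RE.sub(fixup, string)
```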

byte_string_valid_xml(byte_string, encoding='utf-8')

    Check that a byte :class:`str` would be valid in xml

    :arg byte_string: Byte :class:`str` to check
    :arg encoding: Encoding of the xml file.  Default: :term:`UTF-8`
    :returns: :data:`True` if the string is valid.  :data:`False` if it would
        be invalid in the xml file

    In some cases you'll have a whole bunch of byte strings and rather than
    transforming them to :class:`unicode` and back to byte :class:`str` for
    output to xml, you will just want to make sure they work with the xml file
    you're constructing.  This function will help you do that.  Example::

        ARRAY_OF_MOSTLY_UTF8_STRINGS = [...]
        processed_array = []
        for string in ARRAY_OF_MOSTLY_UTF8_STRINGS:
            if byte_string_valid_xml(string, 'utf-8'):
                processed_array.append(string)
            else:
                processed_array.append(guess_bytes_to_xml(string, encoding='utf-8'))
        output_xml(processed_array)
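
The check itself amounts to two tests: the bytes must decode in the given encoding, and the decoded text must contain no control characters that XML forbids. A minimal Python 3 sketch follows; `byte_string_valid_xml_sketch` and `_CONTROL_CHARS` are names chosen for this example, and the forbidden set is an assumption taken from XML 1.0 (everything below code point 32 except tab, line feed, and carriage return).

```python
# Assumed forbidden characters: code points below 32 other than 9, 10, 13.
_CONTROL_CHARS = frozenset(
    chr(code) for code in range(32) if code not in (9, 10, 13)
)


def byte_string_valid_xml_sketch(byte_string, encoding='utf-8'):
    """Return True if byte_string decodes and holds no forbidden characters."""
    if not isinstance(byte_string, bytes):
        return False
    try:
        text = byte_string.decode(encoding)
    except UnicodeError:
        return False
    # Any overlap with the forbidden set makes the string invalid.
    return not (frozenset(text) & _CONTROL_CHARS)
```

Used as a filter, this lets strings that are already safe pass through unchanged so that only the failures need conversion.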

byte_string_valid_encoding(byte_string, encoding='utf-8')

    Detect if a byte :class:`str` is valid in a specific encoding

    :arg byte_string: Byte :class:`str` to test for bytes not valid in this
        encoding
    :kwarg encoding: encoding to test against.  Defaults to :term:`UTF-8`.
    :returns: :data:`True` if every byte sequence is valid in the given
        encoding.  :data:`False` if an invalid character is detected.

    .. note::

        This function checks whether the byte :class:`str` is valid in the
        specified encoding.  It **does not** detect whether the byte
        :class:`str` actually was encoded in that encoding.  If you want that
        sort of functionality, you probably want to use
        :func:`~kitchen.text.misc.guess_encoding` instead.
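
The distinction drawn in the note is easy to see in a short Python 3 sketch. The function name `byte_string_valid_encoding_sketch` is hypothetical and the code is an illustration of the idea, not the library's implementation: validity is tested simply by attempting a strict decode.

```python
def byte_string_valid_encoding_sketch(byte_string, encoding='utf-8'):
    """Return True if byte_string decodes without error in encoding."""
    try:
        byte_string.decode(encoding)
    except UnicodeError:
        return False
    return True


utf8_bytes = 'caf\u00e9'.encode('utf-8')  # b'caf\xc3\xa9'

# Valid as UTF-8, which is what it really is ...
valid_as_utf8 = byte_string_valid_encoding_sketch(utf8_bytes, 'utf-8')
# ... but also "valid" as latin-1, even though it was never latin-1 text.
valid_as_latin1 = byte_string_valid_encoding_sketch(utf8_bytes, 'latin-1')
```

Both results are true, which is why a passing check says nothing about which encoding produced the bytes.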

__all__ = ('byte_string_valid_encoding', 'byte_string_valid_xml',
           'guess_encoding', 'html_entities_unescape',
           'process_control_chars', 'str_eq')
