Directory : /usr/lib64/python2.7/site-packages/lxml/html/
Current File : //usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyo
�
�qPc@s�dZddlmZddlmZddlmZddlm	Z	m
Z
mZy
eZ
Wnek
r{eefZ
nXyddlmZWn!ek
r�ddlmZnXyddlmZWn!ek
r�ddlmZnXdefd	��YZydd
lmZWnek
r)n Xdefd��YZe�Zd
�Zedd�Zeedd�Zeedd�Z edd�Z!edd�Z"d�Z#e�Z$dS(s?
An interface to html5lib that mimics the lxml.html interface.
i����(t
HTMLParser(tTreeBuilder(tetree(t_contains_block_level_tagtXHTML_NAMESPACEtElement(turlopen(turlparseRcBseZdZed�ZRS(s*An html5lib HTML parser with lxml as tree.cKs tj|d|dt|�dS(Ntstrictttree(t_HTMLParsert__init__R(tselfRtkwargs((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyRs(t__name__t
__module__t__doc__tFalseR(((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyRs(tXHTMLParserRcBseZdZed�ZRS(s+An html5lib XHTML Parser with lxml as tree.cKs tj|d|dt|�dS(NRR	(t_XHTMLParserRR(RRR
((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyR(s(RRRRR(((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyR%scCs6|j|�}|dk	r|S|jdt|f�S(Ns{%s}%s(tfindtNoneR(R	ttagtelem((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyt	_find_tag.scCsLt|t�std��n|dkr3t}n|j|d|�j�S(s%Parse a whole document into a string.sstring requiredt
useChardetN(t
isinstancet_stringst	TypeErrorRthtml_parsertparsetgetroot(thtmlt
guess_charsettparser((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pytdocument_fromstring5s
	cCs�t|t�std��n|dkr3t}n|j|dd|�}|r�t|dt�r�|r�|dj�r�tjd|d��n|d=q�n|S(s�Parses several HTML elements, returning a list of elements.

    The first item in the list may be a string.  If no_leading_text is true,
    then it will be an error if there is leading text, and it will always be
    a list of only elements.

    If `guess_charset` is `True` and the text was not unicode but a
    bytestring, the `chardet` library will perform charset guessing on the
    string.
    sstring requiredtdivRisThere is leading text: %rN(	RRRRRt
parseFragmenttstripRtParserError(R tno_leading_textR!R"tchildren((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pytfragments_fromstring@s		
cCs;t|t�std��nt|�}t|d|d|d|�}|r�t|t�sgd}nt|�}|r�t|dt�r�|d|_|d=n|j|�n|S|s�tj	d��nt
|�dkr�tj	d	��n|d}|jr.|jj�r.tj	d
|j��nd|_|S(sXParses a single HTML element; it is an error if there is more than
    one element, or if anything but whitespace precedes or follows the
    element.

    If create_parent is true (or is a tag name) then a parent node
    will be created to encapsulate the HTML in a single element.  In
    this case, leading or trailing text is allowed.
    sstring requiredR!R"R(R$isNo elements foundisMultiple elements foundsElement followed by text: %rN(RRRtboolR*RttexttextendRR'tlenttailR&R(R t
create_parentR!R"taccept_leading_texttelementstnew_roottresult((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pytfragment_fromstring\s2

	

	cCst|t�std��nt|d|d|�}|d j�j�}|jd�sj|jd�rn|St|d�}t|�r�|St|d�}t|�d	kr�|j	s�|j	j
�r�|d
js�|d
jj
�r�|dSt|�r
d|_
n	d
|_
|S(s�Parse the html, returning a single element/document.

    This tries to minimally parse the chunk of text, without knowing if it
    is a fragment or a document.

    base_url will set the document's base_url attribute (and the tree's docinfo.URL)
    sstring requiredR"R!i2s<htmls	<!doctypetheadtbodyii����iR$tspan(RRRR#tlstriptlowert
startswithRR.R,R&R/RR(R R!R"tdoctstartR6R7((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyt
fromstring�s$	,"	cCsj|dkrt}nt|t�s-|}n*t|�rHt|�}nt|d�}|j|d|�S(s�Parse a filename, URL, or file-like object into an HTML document
    tree.  Note: this returns a tree, not an element.  Use
    ``parse(...).getroot()`` to get the document root.
    trbRN(RRRRt_looks_like_urlRtopenR(tfilename_url_or_fileR!R"tfp((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyR�s		cCst|�d}|dkS(Nit(R(tstrtscheme((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyR@�sN(%Rthtml5libRR
t html5lib.treebuilders.etree_lxmlRtlxmlRt	lxml.htmlRRRt
basestringRt	NameErrortbytesREturllib2RtImportErrorturllib.requestRturllib.parseRRtxhtml_parserRtTrueRR#RR*R5R>RR@R(((s;/usr/lib64/python2.7/site-packages/lxml/html/html5parser.pyt<module>sB




		(*