html = etree.HTML(str/bytes)

The argument can be either str or bytes; the return value is an etree._Element.

Calling etree.parse('hello.html') takes a file path as its argument and returns an etree._ElementTree.
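A minimal sketch to check the two return types. The markup is made up for illustration, and an in-memory io.BytesIO object stands in for the 'hello.html' path (etree.parse also accepts file-like objects), so nothing has to exist on disk.

```python
import io

from lxml import etree

# Hypothetical sample markup, not taken from the original page.
markup = b"<html><head><title>hello</title></head><body><p>hi</p></body></html>"

# etree.HTML parses a str or bytes value and returns the root element.
root = etree.HTML(markup)
print(type(root))

# etree.parse reads from a path or file-like object and returns a tree wrapper.
tree = etree.parse(io.BytesIO(markup))
print(type(tree))

# The root element of the tree is available through getroot().
tree_root = tree.getroot()
print(tree_root.tag)
```

The practical difference is that an _ElementTree wraps the whole document, while an _Element is a single node; both expose xpath().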

etree.tostring(html,encoding='unicode')

Without the encoding argument it returns bytes; with encoding='unicode' it returns str.
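A short sketch of the two calls side by side. The input string is a made-up example.

```python
from lxml import etree

# Hypothetical input; etree.HTML wraps the fragment in <html><body>.
root = etree.HTML("<p>hello</p>")

# Default: serialized output is bytes.
as_bytes = etree.tostring(root)
print(type(as_bytes))

# encoding='unicode': serialized output is str.
as_str = etree.tostring(root, encoding="unicode")
print(type(as_str))
print(as_str)
```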

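After reading the file with etree.parse(), xpath queries did not match anything. The root tag was <html xmlns="http://www.w3.org/1999/xhtml">; removing the xmlns attribute made them work. etree.parse() uses the XML parser by default, so every element ends up in the XHTML namespace and an unprefixed name such as //title no longer matches.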

However, opening the file in binary mode, passing the content to etree.HTML, and then using xpath works.
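The behaviour can be reproduced roughly as follows. The document content and the prefix name "x" are assumptions for illustration; etree.fromstring is used here in place of etree.parse so that no file is needed, and it uses the same default XML parser.

```python
from lxml import etree

# Hypothetical XHTML document carrying the default namespace.
doc = (
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b"<head><title>hello</title></head><body><p>hi</p></body></html>"
)

# XML parser: elements live in the XHTML namespace.
xml_root = etree.fromstring(doc)
plain = xml_root.xpath("//title/text()")  # unprefixed name, no match

# Binding the namespace to a prefix makes the same query match.
ns = {"x": "http://www.w3.org/1999/xhtml"}
prefixed = xml_root.xpath("//x:title/text()", namespaces=ns)

# HTML parser: namespaces are not applied, so the plain query matches.
html_root = etree.HTML(doc)
html_result = html_root.xpath("//title/text()")

print(plain, prefixed, html_result)
```

If the file should still be read by path, passing an HTML parser explicitly, as in etree.parse('hello.html', etree.HTMLParser()), is another option that avoids editing the xmlns attribute.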

……

Opening the file in text mode and then passing the content to etree.HTML fails:

Traceback (most recent call last):
  File "d:\我的文档\py\test\tieba\qu.py", line 53, in <module>
    html=etree.HTML(html2)
  File "src\lxml\etree.pyx", line 3178, in lxml.etree.HTML (src\lxml\etree.c:80497)
  File "src\lxml\parser.pxi", line 1866, in lxml.etree._parseMemoryDocument (src\lxml\etree.c:121177)
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
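The error message suggests the decoded text began with an XML declaration that names an encoding. That is an assumption about the original file, which is not shown here. A minimal sketch of the failure and of the bytes-based workaround, using made-up content:

```python
from lxml import etree

# Hypothetical content: a str that starts with an encoding declaration,
# as it would after reading such a file in text mode.
text = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "<html><head><title>hello</title></head><body></body></html>"
)

# Passing the str directly is what triggers the ValueError shown above.
try:
    etree.HTML(text)
except ValueError as exc:
    print("str input rejected:", exc)

# Workaround: hand lxml bytes, either by encoding the str
# or by opening the file with mode 'rb' in the first place.
root = etree.HTML(text.encode("utf-8"))
title = root.xpath("//title/text()")
print(title)
```

With bytes input lxml decodes the content itself based on the declaration, which is why reading the file in binary mode avoids the error.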