Parsing HTML using Python

爱一瞬间的悲伤 2020-11-22 00:35

I'm looking for an HTML parser module for Python that can help me get the tags in the form of Python lists/dictionaries/objects.

If I have a document of the form:

7 Answers
  •  迷失自我
    2020-11-22 01:17

    I recommend lxml for parsing HTML. See "Parsing HTML" (on the lxml site).

    In my experience, Beautiful Soup mishandles some complex HTML. I believe that is because Beautiful Soup is not a parser but rather a very good string analyzer.
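
    As a rough illustration of the lxml approach, here is a minimal sketch that turns each tag into a Python dictionary. It assumes the third-party lxml package is installed. The sample markup and the helper name tags_as_dicts are made up for this example and do not come from the question.

```python
# Requires the third-party lxml package (pip install lxml).
from lxml import html

# Hypothetical sample document, not taken from the original question.
SAMPLE = """
<html>
  <body>
    <h1 class="title">Heading</h1>
    <a href="https://example.com/a">First link</a>
    <a href="https://example.com/b">Second link</a>
  </body>
</html>
"""


def tags_as_dicts(markup):
    """Return every element as a dict of tag name, attributes, and own text."""
    root = html.fromstring(markup)
    result = []
    for element in root.iter():
        # Comments and processing instructions have a non-string .tag; skip them.
        if not isinstance(element.tag, str):
            continue
        result.append({
            "tag": element.tag,
            "attrs": dict(element.attrib),
            "text": (element.text or "").strip(),
        })
    return result


tags = tags_as_dicts(SAMPLE)

# Collect the href of every <a> element from the flat list of dicts.
links = [t["attrs"]["href"] for t in tags if t["tag"] == "a"]
print(links)
```

    The flat list of dicts is only one possible shape. If you need the nesting, you can instead keep working with the element objects that html.fromstring returns, since each one exposes .tag, .attrib, .text, and its children by iteration.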
