Parsing HTML using Python

爱一瞬间的悲伤 2020-11-22 00:35

I'm looking for an HTML parser module for Python that can help me get the tags in the form of Python lists, dictionaries, or objects.

If I have a document of the form:

7 answers
  •  野趣味 (original poster)
    2020-11-22 01:12

    I guess what you're looking for is pyquery:

    pyquery: a jQuery-like library for Python.

    An example of what you want might look like this:

    from pyquery import PyQuery

    # Placeholder document; replace with your own HTML code.
    html = '<div id="id" class="class">Hello</div>'
    pq = PyQuery(html)
    tag = pq('div#id')  # or: tag = pq('div.class')
    print(tag.text())
    

    It uses the same CSS selectors that Firefox's or Chrome's "Inspect Element" tool shows. For example:

    The selector of the inspected element is 'div#mw-head.noprint', so in pyquery you just need to pass this selector:

    pq('div#mw-head.noprint')
    
