Parsing an HTML table using Python - HTMLParser or lxml

我寻月下人不归 2021-02-14 04:09

I have an HTML page which consists of a table, and I want to fetch all the values in the td and tr elements of that table.
I have tried working with BeautifulSoup, but now I wanted to work on

2 answers

故里飘歌 (original poster)

2021-02-14 04:42
I can't add comments, but this might be helpful for someone else:

I had some bold and italic text within the table cells, so c.text returned None. I used c.text_content() instead, like this:
```
>>> from lxml.html import parse
>>> page = parse("test.html")
>>> rows = page.xpath("body/table")[0].findall("tr")
>>> data = list()
>>> for row in rows:
...     data.append([c.text_content() for c in row])
... 
```
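To make the difference between c.text and c.text_content() concrete, here is a minimal self-contained sketch. It assumes lxml is installed, and the inline HTML string is made up for illustration (it is not the test.html from the snippet above). It collects each cell both ways so the two results can be compared side by side:
```python
from lxml.html import fromstring

# Hypothetical sample markup: some cells contain <b>/<i> children.
html = (
    "<html><body><table>"
    "<tr><td><b>bold</b> cell</td><td>plain</td></tr>"
    "<tr><td>a <i>b</i> c</td><td></td></tr>"
    "</table></body></html>"
)

doc = fromstring(html)
table = doc.xpath("//table")[0]

text_only = []  # what .text gives: only the text before the first child element
data = []       # what .text_content() gives: all text inside the cell

for row in table.xpath(".//tr"):
    cells = row.xpath("td|th")
    text_only.append([c.text for c in cells])
    data.append([c.text_content() for c in cells])

print(text_only)
print(data)
```
In this sketch the first cell starts directly with a <b> element, so its .text is None, while text_content() joins the text of the cell and all its descendants. Using ".//tr" rather than a direct child lookup is a design choice here so the rows are still found if the table wraps them in a tbody.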