I have an HTML page which consists of a table, and I want to fetch all the values in the td and tr elements of that table.
I have tried working with BeautifulSoup, but now I wanted to work on
I can't add comments but it might be helpful for someone else:
I had some bold and italic text within the table cells, so c.text returned None. I used c.text_content() instead, like:
>>> from lxml.html import parse
>>> page = parse("test.html")
>>> rows = page.xpath("body/table")[0].findall("tr")
>>> data = list()
>>> for row in rows:
...     data.append([c.text_content() for c in row])
...
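To make the difference concrete, here is a small self-contained sketch. It parses an inline string with lxml.html.fromstring instead of reading test.html; the SAMPLE markup and the helper names cell_text and cell_text_content are made up for illustration and are not part of lxml. It assumes the cells are td elements that are direct children of tr, with no tbody in the source.

```python
from lxml import html

# Hypothetical sample markup standing in for test.html.
SAMPLE = """
<html><body>
<table>
  <tr><td>plain</td><td><b>bold</b> and <i>italic</i></td></tr>
  <tr><td><i>only italic</i></td><td>42</td></tr>
</table>
</body></html>
"""


def _rows(markup):
    """Return the tr elements of the first table in the markup."""
    root = html.fromstring(markup)
    return root.xpath("//table")[0].findall("tr")


def cell_text(markup):
    """Collect .text for each cell: only the text before the first child tag."""
    return [[c.text for c in row.findall("td")] for row in _rows(markup)]


def cell_text_content(markup):
    """Collect .text_content() for each cell: all nested text joined together."""
    return [[c.text_content().strip() for c in row.findall("td")]
            for row in _rows(markup)]


if __name__ == "__main__":
    print(cell_text(SAMPLE))
    print(cell_text_content(SAMPLE))
```

With this sample, .text is None for every cell that starts with a b or i tag, because the text belongs to the child element rather than to the td itself, while text_content() walks the whole cell. If your table also has th header cells, you would need to include them in the findall step.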