When parsing HTML, why do I need item.text sometimes and item.text_content() at other times?


花落未央 2021-01-17 19:23

Still learning lxml. I discovered that sometimes I cannot get to the text of an item from a tree using item.text. If I use item.text_content() I am good to go. I am not s

2 answers

有刺的猬 (original poster)

2021-01-17 20:02

You may be confusing the different and incompatible interfaces that lxml implements. Every lxml.etree element has a .text attribute, which holds only the text that comes before the element's first child. Elements from lxml.html additionally implement the text_content() method, which returns the text of the element together with that of all its descendants. (Elements from BeautifulSoup, which lxml can also work with, have a .string attribute... sometimes: only nodes with a single child which is a string.)

Yeah, it is inherently confusing that lxml chooses both to implement its own interfaces and emulate or include other libraries, but it can be convenient...;-).
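To make the difference concrete, here is a minimal sketch, assuming lxml is installed. The HTML snippets and the variable names (`fragment`, `wrapper`, `bold`) are made up for illustration and do not come from the question.

```python
from lxml import html

# A paragraph whose text is split around a child element.
fragment = html.fromstring("<p>Hello <b>big</b> world</p>")

# .text only holds the text before the first child element.
direct_text = fragment.text            # 'Hello '

# text_content() joins the text of the element and all its descendants.
full_text = fragment.text_content()    # 'Hello big world'

# Text that follows a child lives in that child's .tail, not in the parent.
bold = fragment.find("b")
bold_text = bold.text                  # 'big'
bold_tail = bold.tail                  # ' world'

# An element with no text before its first child has .text set to None,
# which is the case where item.text appears to "not work".
wrapper = html.fromstring("<div><span>only child</span></div>")
wrapper_text = wrapper.text            # None
wrapper_full = wrapper.text_content()  # 'only child'

print(repr(direct_text), repr(full_text), repr(bold_tail))
print(repr(wrapper_text), repr(wrapper_full))
```

So item.text is enough when the element holds plain text only, and text_content() is the safer choice when the text may be wrapped in nested tags.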
