Question
I'm trying to get all the text from within this HTML tag, which I store in the variable tag:
<td rowspan="2" style="text-align: center;"><a href="/wiki/Glenn_Miller" title="Glenn Miller">Glenn Miller</a> & His Orchestra</td>
The result should be "Glenn Miller & His Orchestra".
But printing tag.find(text=True) returns this: "Glenn Miller".
How do I get the rest of the text within the td element?
Answer 1:
tag.find(text=True) returns only the first matching text node, which here is the text inside the a element. Use .get_text() instead, which concatenates every text node under the tag:
>>> from bs4 import BeautifulSoup
>>> data = '<td rowspan="2" style="text-align: center;"><a href="/wiki/Glenn_Miller" title="Glenn Miller">Glenn Miller</a> & His Orchestra</td>'
>>> soup = BeautifulSoup(data, "html.parser")
>>> tag = soup.td
>>> tag.get_text()
'Glenn Miller & His Orchestra'
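To see why the two calls differ, it can help to look at the individual text nodes. The following is a minimal sketch, not part of the original answer. It assumes Beautiful Soup 4.4.0 or later, where the string argument is the current name for the older text argument used in the question. The variable names first, parts, and full are illustrative only.

```python
from bs4 import BeautifulSoup

data = ('<td rowspan="2" style="text-align: center;">'
        '<a href="/wiki/Glenn_Miller" title="Glenn Miller">Glenn Miller</a>'
        ' & His Orchestra</td>')
tag = BeautifulSoup(data, "html.parser").td

# find() stops at the first text node, which sits inside the <a> element.
first = tag.find(string=True)

# find_all() returns every text node under the tag, in document order.
parts = tag.find_all(string=True)

# get_text() concatenates those same text nodes into one string.
full = tag.get_text()

print(repr(first))
print(parts)
print(full)
```

If the separate pieces are needed rather than one string, iterating over find_all(string=True) or over tag.stripped_strings is an alternative to get_text().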
Source: https://stackoverflow.com/questions/37336326/how-do-i-get-all-text-from-within-this-tag