Question
Is it possible to set markup as tag content (akin to setting innerHTML in JavaScript)?
For the sake of example, let's say I want to add 10 <a> elements to a <div>, but have them separated by commas:
soup = BeautifulSoup(<<some document here>>)
a_tags = ["<a>1</a>", "<a>2</a>", ...] # list of strings
div = soup.new_tag("div")
a_str = ",".join(a_tags)
Using div.append(a_str) escapes < and > into &lt; and &gt;, so I end up with
<div>&lt;a&gt;1&lt;/a&gt; ... </div>
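The escaping described above can be reproduced with a short sketch. It assumes bs4 is installed and uses the built-in html.parser; the variable name escaped is only for illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")
div = soup.new_tag("div")

# A plain str is stored as a text node, so its angle brackets
# are escaped when the tree is serialized.
div.append("<a>1</a>,<a>2</a>")

escaped = str(div)
print(escaped)
```

The div ends up with a single text child and no <a> elements, which is why the markup has to be parsed before it is appended.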
BeautifulSoup(a_str) wraps this in <html>, and getting the tree back out of it seems like an inelegant hack.
What should I do?
Answer 1:
You need to create a BeautifulSoup object out of your HTML string containing the links:
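from bs4 import BeautifulSoup
# Empty document; naming the parser avoids a "no parser specified" warning
soup = BeautifulSoup("", "html.parser")
div = soup.new_tag('div')
a_tags = ["<a>1</a>", "<a>2</a>", "<a>3</a>", "<a>4</a>", "<a>5</a>"]
a_str = ",".join(a_tags)
# html.parser parses the fragment without adding <html>/<body> wrappers
div.append(BeautifulSoup(a_str, 'html.parser'))
soup.append(div)
print(soup)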
Prints:
<div><a>1</a>,<a>2</a>,<a>3</a>,<a>4</a>,<a>5</a></div>
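If this is needed in more than one place, the same idea can be wrapped in a small helper. This is a minimal sketch: append_markup is a hypothetical name, not part of bs4, and it moves the parsed nodes over one by one rather than appending the whole BeautifulSoup object.

```python
from bs4 import BeautifulSoup

def append_markup(tag, markup):
    """Parse `markup` as an HTML fragment and append its nodes to `tag`."""
    fragment = BeautifulSoup(markup, "html.parser")
    # Copy the child list first: appending a node detaches it from
    # `fragment`, which would otherwise change the list being iterated.
    for node in list(fragment.contents):
        tag.append(node)
    return tag

soup = BeautifulSoup("", "html.parser")
div = soup.new_tag("div")
soup.append(div)

append_markup(div, ",".join("<a>%d</a>" % n for n in range(1, 4)))
result = str(soup)
print(result)  # <div><a>1</a>,<a>2</a>,<a>3</a></div>
```

Because the helper only uses Tag.append with individual nodes, it behaves much like assigning innerHTML in append mode: existing children of the tag are kept and the parsed nodes are added after them.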
Alternative solution:
For each link, create a Tag and append it to div. Also append a comma after each link except the last:
from bs4 import BeautifulSoup
# Empty document; naming the parser avoids a "no parser specified" warning
soup = BeautifulSoup("", "html.parser")
div = soup.new_tag('div')
last = 5
for x in range(1, last + 1):
    link = soup.new_tag('a')
    link.string = str(x)
    div.append(link)
    # do not append a comma after the last element
    if x != last:
        div.append(",")
soup.append(div)
print(soup)
Prints:
<div><a>1</a>,<a>2</a>,<a>3</a>,<a>4</a>,<a>5</a></div>
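The loop above can be generalized so the "last element" check does not depend on a hard-coded number. The sketch below puts the separator before every link except the first, which gives the same output; append_separated and its parameters are hypothetical names chosen for illustration.

```python
from bs4 import BeautifulSoup

def append_separated(soup, parent, labels, separator=","):
    """Append one <a> per label to `parent`, with `separator` between them."""
    for index, label in enumerate(labels):
        # Insert the separator before every link except the first,
        # so no trailing separator is ever produced.
        if index > 0:
            parent.append(separator)
        link = soup.new_tag("a")
        link.string = str(label)
        parent.append(link)
    return parent

soup = BeautifulSoup("", "html.parser")
div = soup.new_tag("div")
soup.append(div)

append_separated(soup, div, range(1, 6))
result = str(soup)
print(result)  # <div><a>1</a>,<a>2</a>,<a>3</a>,<a>4</a>,<a>5</a></div>
```

Building the tags directly also means the link text is escaped by BeautifulSoup, which matters if the labels come from untrusted input.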
Source: https://stackoverflow.com/questions/26984933/append-markup-string-to-a-tag-in-beautifulsoup