Append markup string to a tag in BeautifulSoup

Posted by 孤者浪人 on 2020-05-15 03:51:29

Question


Is it possible to set markup as tag content (akin to setting innerHTML in JavaScript)?

For the sake of example, let's say I want to add 10 <a> elements to a <div>, but have them separated with a comma:

soup = BeautifulSoup(<<some document here>>)

a_tags = ["<a>1</a>", "<a>2</a>", ...] # list of strings
div = soup.new_tag("div")
a_str = ",".join(a_tags)

Using div.append(a_str) escapes < and > into &lt; and &gt;, so I end up with

<div>&lt;a&gt;1&lt;/a&gt;, ... </div>
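
The escaping is easy to reproduce. The following is a minimal sketch, assuming bs4 is installed and using the built-in 'html.parser'; the shortened two-link list and the variable name `escaped` are only for illustration:

```python
from bs4 import BeautifulSoup

# Empty document used only as a tag factory (illustrative setup).
soup = BeautifulSoup("", "html.parser")
div = soup.new_tag("div")

a_str = ",".join(["<a>1</a>", "<a>2</a>"])

# A plain str is appended as a text node, not parsed as markup,
# so "<" and ">" are escaped when the tree is serialized.
div.append(a_str)

escaped = str(div)
print(escaped)
```

The div ends up with a single text child and no <a> elements, which is why the markup shows up escaped.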

BeautifulSoup(a_str) wraps the result in <html>, and pulling the subtree back out of it feels like an inelegant hack.

What to do?


Answer 1:


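You need to create a BeautifulSoup object out of the HTML string containing the links and append that object to the div. With the built-in 'html.parser', the fragment is not wrapped in <html> or <body>: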

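from bs4 import BeautifulSoup

# Empty document; naming the parser keeps the result independent of which parsers are installed
soup = BeautifulSoup("", 'html.parser')
div = soup.new_tag('div')

a_tags = ["<a>1</a>", "<a>2</a>", "<a>3</a>", "<a>4</a>", "<a>5</a>"]
a_str = ",".join(a_tags)

# Parse the string into nodes first; appending the raw string would escape it
div.append(BeautifulSoup(a_str, 'html.parser'))

soup.append(div)
print(soup)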

Prints:

<div><a>1</a>,<a>2</a>,<a>3</a>,<a>4</a>,<a>5</a></div>
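
If this is needed in more than one place, the same idea can be wrapped in a small function that behaves roughly like assigning innerHTML. This is only a sketch: `set_inner_markup` is a hypothetical name, not part of the bs4 API, and the `<div>old</div>` document is made up for the example.

```python
from bs4 import BeautifulSoup


def set_inner_markup(tag, markup):
    """Replace the children of `tag` with the nodes parsed from `markup`.

    Hypothetical helper, not part of the bs4 API.
    """
    fragment = BeautifulSoup(markup, "html.parser")
    tag.clear()  # drop any existing children
    # Copy the child list first: append() moves each node out of `fragment`.
    for child in list(fragment.contents):
        tag.append(child)
    return tag


soup = BeautifulSoup("<div>old</div>", "html.parser")
div = set_inner_markup(soup.div, "<a>1</a>,<a>2</a>,<a>3</a>")
result = str(soup)
print(result)
```

Moving the parsed children one at a time avoids nesting one BeautifulSoup object inside another, at the cost of a few extra lines.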

Alternative solution:

For each link, create a Tag and append it to the div. Also append a comma after each link except the last:

from bs4 import BeautifulSoup

soup = BeautifulSoup("", 'html.parser')
div = soup.new_tag('div')

for x in range(1, 6):
    link = soup.new_tag('a')
    link.string = str(x)
    div.append(link)

    # do not append a comma after the last element (x == 5)
    if x != 5:
        div.append(",")

soup.append(div)

print(soup)

Prints:

<div><a>1</a>,<a>2</a>,<a>3</a>,<a>4</a>,<a>5</a></div>
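
When the link texts come from a list rather than a fixed range, hard-coding the last value gets fragile. One possible variation, sketched here with a made-up `labels` list, is to put the separator before every link except the first:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")
div = soup.new_tag("div")

labels = ["1", "2", "3"]  # illustrative data, not from the original question

for index, label in enumerate(labels):
    # Put the separator before every link except the first.
    if index > 0:
        div.append(",")
    link = soup.new_tag("a")
    link.string = label
    div.append(link)

soup.append(div)
joined = str(soup)
print(joined)
```

This way the loop never needs to know how many items there are.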


Source: https://stackoverflow.com/questions/26984933/append-markup-string-to-a-tag-in-beautifulsoup
