Beautiful Soup 4: How to replace a tag with text and another tag?

Submitted by 三世轮回 on 2021-01-21 03:58:06

Question


I want to replace a tag with another tag and put the contents of the old tag before the new one. For example, I want to change this:

<html>
<body>
<p>This is the <span id="1">first</span> paragraph</p>
<p>This is the <span id="2">second</span> paragraph</p>
</body>
</html>

into this:

<html>
<body>
<p>This is the first<sup>1</sup> paragraph</p>
<p>This is the second<sup>2</sup> paragraph</p>
</body>
</html>

I can easily find all the spans with find_all(), read the number from the id attribute, and replace one tag with another using replace_with(). But how do I replace a tag with text plus a new tag, or insert text before the replaced tag?


Answer 1:


The idea is to find every span tag that has an id attribute (the span[id] CSS selector), use insert_after() to insert a sup tag after it, and then call unwrap() to replace the span with its contents:

from bs4 import BeautifulSoup

data = """
<html>
<body>
<p>This is the <span id="1">first</span> paragraph</p>
<p>This is the <span id="2">second</span> paragraph</p>
</body>
</html>
"""

soup = BeautifulSoup(data, 'html.parser')
for span in soup.select('span[id]'):
    # insert sup tag after the span
    sup = soup.new_tag('sup')
    sup.string = span['id']
    span.insert_after(sup)

    # replace the span tag with its contents
    span.unwrap()

print(soup)

Prints:

<html>
<body>
<p>This is the first<sup>1</sup> paragraph</p>
<p>This is the second<sup>2</sup> paragraph</p>
</body>
</html>
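
The question also asks about inserting text before a replaced tag. A minimal sketch of that variant, assuming a reasonably recent Beautiful Soup 4 and the built-in html.parser: insert the span's text in front of it with insert_before(), then swap the span for the new sup tag with replace_with(). The names span_to_sup and SAMPLE are hypothetical, chosen only for this illustration, and this is not the method from the answer above.

```python
from bs4 import BeautifulSoup

# SAMPLE is an illustrative copy of the markup from the question.
SAMPLE = """
<html>
<body>
<p>This is the <span id="1">first</span> paragraph</p>
<p>This is the <span id="2">second</span> paragraph</p>
</body>
</html>
"""


def span_to_sup(html):
    """Hypothetical helper: turn <span id="N">text</span> into text<sup>N</sup>."""
    soup = BeautifulSoup(html, 'html.parser')
    # find_all() returns a list, so editing the tree inside the loop is safe.
    for span in soup.find_all('span', id=True):
        sup = soup.new_tag('sup')
        sup.string = span['id']
        # Put the span's text in front of the span...
        span.insert_before(span.get_text())
        # ...then replace the span itself with the new sup tag.
        span.replace_with(sup)
    return str(soup)


print(span_to_sup(SAMPLE))
```

For the sample markup this should give the same two paragraphs as the unwrap() approach. One difference to keep in mind: get_text() flattens any nested markup inside the span, whereas unwrap() keeps child tags intact.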


Source: https://stackoverflow.com/questions/27006463/beautiful-soup-4-how-to-replace-a-tag-with-text-and-another-tag
