How to change tag name with BeautifulSoup?

独厮守ぢ 2020-12-03 16:53

I am using Python + BeautifulSoup to parse an HTML document.

Now I need to replace all <h2> elements in an HTML document, wit

3 Answers
  • 2020-12-03 17:43

    It's just:

    tag.name = 'new_name'
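
For readers on a current setup, here is a minimal sketch of the same assignment using Beautiful Soup 4 (the `bs4` package) on Python 3. That is an assumption about the environment, not what the answers in this thread use; the sample HTML and the `html.parser` choice are illustrative only.

```python
from bs4 import BeautifulSoup

# Illustrative markup; any document containing the tag works the same way.
html = "<div><h2 class='someclass'>some title</h2></div>"
soup = BeautifulSoup(html, "html.parser")

# Assigning to .name renames the element in place.
# Attributes and children are left untouched.
tag = soup.find("h2")
tag.name = "h1"

result = str(soup)
print(result)
```

Only the element name changes; note that bs4 writes attribute values back out with double quotes.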
    
  • 2020-12-03 17:46

    I don't know how you're accessing tag but the following works for me:

    import BeautifulSoup
    
    if __name__ == "__main__":
        data = """
    <html>
    <h2 class='someclass'>some title</h2>
    <ul>
       <li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
       <li>Aliquam tincidunt mauris eu risus.</li>
       <li>Vestibulum auctor dapibus neque.</li>
    </ul>
    </html>
    
        """
        soup = BeautifulSoup.BeautifulSoup(data)
        h2 = soup.find('h2')
        h2.name = 'h1'
        print soup
    

    The output of the print soup statement is:

    <html>
    <h1 class='someclass'>some title</h1>
    <ul>
    <li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
    <li>Aliquam tincidunt mauris eu risus.</li>
    <li>Vestibulum auctor dapibus neque.</li>
    </ul>
    </html>
    

    As you can see, h2 became h1. And nothing else in the document changed. I am using Python 2.6 and BeautifulSoup 3.2.0.

    If you have more than one h2 and you want to change them all, you could simply do:

    soup = BeautifulSoup.BeautifulSoup(your_data)
    while True: 
        h2 = soup.find('h2')
        if not h2:
            break
        h2.name = 'h1'
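
The loop above could also be written as a single pass over all matches. The following is a sketch assuming Beautiful Soup 4 on Python 3 rather than the BeautifulSoup 3 used in this answer; `rename_tags` is a hypothetical helper name, and the `class_` filter is only there to show how to restrict the rename to some of the elements.

```python
from bs4 import BeautifulSoup


def rename_tags(soup, old_name, new_name, **filters):
    """Rename every matching tag in place and return how many were changed.

    Hypothetical helper; extra keyword arguments are passed to find_all().
    """
    # find_all() returns a list-like snapshot, so renaming while
    # iterating over it is safe.
    matches = soup.find_all(old_name, **filters)
    for tag in matches:
        tag.name = new_name
    return len(matches)


html = """<h2 class="someclass">first</h2>
<h2 class="other">second</h2>
<h2 class="someclass">third</h2>"""
soup = BeautifulSoup(html, "html.parser")

# Only the h2 elements with class "someclass" are renamed.
changed = rename_tags(soup, "h2", "h1", class_="someclass")
print(changed)
print(soup)
```

Calling `rename_tags(soup, "h2", "h1")` without a filter would rename every h2, which matches what the `while` loop does.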
    
  • 2020-12-03 17:59

    From the BeautifulSoup docs:

    from BeautifulSoup import BeautifulSoup, Tag
    soup = BeautifulSoup('<h2 class="someclass">TEXTHERE</h2>')
    tag = Tag(soup, "h1", [("class", "someclass")])
    tag.insert(0, "TEXTHERE")
    soup.h2.replaceWith(tag)
    print soup
    # <h1 class="someclass">TEXTHERE</h1>
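
A rough equivalent of this replace-the-element approach in Beautiful Soup 4 on Python 3 might look like the sketch below. It assumes the `bs4` package, where `new_tag()` and `replace_with()` take the place of the `Tag(...)` constructor and `replaceWith()` shown above.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<h2 class="someclass">TEXTHERE</h2>', "html.parser")

old = soup.h2
# Build the replacement element and carry over attributes and text.
new = soup.new_tag("h1")
new.attrs = dict(old.attrs)
new.string = old.get_text()

# Swap the old element for the new one in the tree.
old.replace_with(new)

result = str(soup)
print(result)
```

Copying only the text drops any nested markup inside the old element, so assigning to `.name` directly is usually the simpler choice when nothing but the element name needs to change.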
    