How to tell BeautifulSoup to extract the content of a specific tag as text? (without touching it)

长情又很酷 2021-01-03 06:48

I need to parse an HTML document that contains "code" tags.

I'm getting the code blocks like this:

    soup = BeautifulSoup(str(content))
    code_blocks = soup.findAll('code')


        
1 Answer
  • 2021-01-03 07:27

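    Add the code tag to the QUOTE_TAGS dictionary. BeautifulSoup 3 treats the content of any tag listed there as literal text instead of parsing it as markup.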

    from BeautifulSoup import BeautifulSoup
    
    content = "<code class='csharp'>List<Person> persons = new List<Person>();</code>"
    
    BeautifulSoup.QUOTE_TAGS['code'] = None
    soup = BeautifulSoup(str(content))
    code_blocks = soup.findAll('code')
    

    Output:

    [<code class="csharp"> List<Person> persons = new List<Person>(); </code>]
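
    QUOTE_TAGS belongs to the legacy BeautifulSoup 3 package (the `from BeautifulSoup import ...` import above). Where that package is not available, the same goal of reading the content of code tags as untouched text can be sketched with the standard library alone. The helper names `extract_code_blocks` and `escape_code_blocks` below are hypothetical, and the regular expression assumes that code elements are not nested and that their attributes contain no ">" character. This is an illustration under those assumptions, not a general HTML parser.

```python
import html
import re

# Matches an opening <code ...> tag, its raw content, and the closing </code>.
# Assumption: <code> elements are not nested and attributes contain no '>'.
CODE_BLOCK = re.compile(r"(<code\b[^>]*>)(.*?)(</code>)", re.DOTALL | re.IGNORECASE)


def extract_code_blocks(markup):
    """Return the raw, unparsed text found inside each <code>...</code> pair."""
    return [match.group(2) for match in CODE_BLOCK.finditer(markup)]


def escape_code_blocks(markup):
    """Escape &, < and > inside <code> elements so that any HTML parser
    applied afterwards sees plain text rather than nested tags."""
    def _escape(match):
        opening, body, closing = match.groups()
        return opening + html.escape(body, quote=False) + closing

    return CODE_BLOCK.sub(_escape, markup)


content = "<code class='csharp'>List<Person> persons = new List<Person>();</code>"

print(extract_code_blocks(content))
# ['List<Person> persons = new List<Person>();']

print(escape_code_blocks(content))
# <code class='csharp'>List&lt;Person&gt; persons = new List&lt;Person&gt;();</code>
```

    `extract_code_blocks` is enough when only the code text is needed. `escape_code_blocks` is meant as a pre-processing step: the escaped markup can then be handed to a parser, which will no longer mistake `<Person>` for a tag.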
    