I have a soup in Python like this:
Title:
Info
&l
Use BeautifulSoup's unwrap() for this. It removes the tag itself but leaves its contents in place.
import bs4

# htm1 is the HTML string to clean
soup1 = bs4.BeautifulSoup(htm1, 'html.parser')
for match in soup1.find_all('span'):
    match.unwrap()  # remove the <span> tag but keep its contents
print(soup1)
I wrote this function, in case it helps:
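def deleteBalise(string):
    # "balise" is French for "tag". Two passes: each pass removes the
    # first tag in the string (typically the opening tag, then the closing tag).
    for i in range(2):
        # identifying <
        rankBegin = 0
        for carac in string:
            if carac == '<':
                break
            rankBegin += 1
        # identifying >
        rankEnd = 0
        for carac in string:
            if carac == '>':
                break
            rankEnd += 1
        # cut out everything from '<' through '>' inclusive
        stringToReplace = string[rankBegin:rankEnd+1]
        string = string.replace(stringToReplace, '')
    return string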
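If you would rather stay with plain string handling, the same idea can be sketched with the standard re module. This is only an illustration: strip_tag and the sample string are hypothetical names of mine, and a regular expression like this is not a robust HTML parser (it will trip on attribute values that contain '>', for example).

```python
import re

def strip_tag(text, tag):
    """Remove every opening and closing `tag` from text, keeping the inner text."""
    # Matches <tag>, <tag attr="...">, and </tag>; \b stops 'span' matching 'spanner'.
    pattern = re.compile(r'</?' + re.escape(tag) + r'\b[^>]*>', re.IGNORECASE)
    return pattern.sub('', text)

# Hypothetical sample markup, not the HTML from the question.
sample = '<p>Title: <span class="x">Info</span></p>'
cleaned = strip_tag(sample, 'span')
print(cleaned)
```

Unlike the two-pass loop, this removes every occurrence of the named tag and leaves other tags alone.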
You can also use replace_with to remove span tags. Note that replacing a tag with '' discards its contents as well:
from bs4 import BeautifulSoup

# html is the HTML string to clean
soup = BeautifulSoup(html, 'html.parser')
for span_tag in soup.find_all('span'):
    span_tag.replace_with('')
print(soup)
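To see how replace_with('') and unwrap() differ, here is a small side-by-side sketch. The sample markup and the variable names are made up for illustration, and it assumes bs4 is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = '<p>Title: <span>Info</span></p>'

# replace_with('') swaps each <span> for an empty string, so its text is lost.
removed = BeautifulSoup(html, 'html.parser')
for span in removed.find_all('span'):
    span.replace_with('')

# unwrap() removes only the tag and leaves its children in the tree.
unwrapped = BeautifulSoup(html, 'html.parser')
for span in unwrapped.find_all('span'):
    span.unwrap()

removed_html = str(removed)
unwrapped_html = str(unwrapped)
print(removed_html)
print(unwrapped_html)
```

So which one to pick depends on whether the text inside the span should survive.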