Replace HTML links with text

Submitted by 烈酒焚心 on 2019-12-11 01:18:47

Question


How can I replace links (a tags) in HTML with their text, using Python?

For example, given this input:

 <p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>

I want the result to keep the p tag and remove only the a tags:

<p>
Hello link text1 and link text2 ! 
</p>

Answer 1:


You could do this with a simple regex and the re.sub function:

import re

text = '<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'
# Matches "<a" or "</a" followed by anything up to the next ">"
pattern = r'<(a|/a).*?>'

result = re.sub(pattern, "", text)

print(result)
# <p> Hello link text1 and link text2 ! </p>

This code replaces every <a ...> and </a> tag with an empty string. Note that the pattern also matches any other tag whose name starts with "a", such as <abbr>.
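
A slightly stricter pattern can avoid touching tags that merely start with the letter a. The following is a minimal sketch, not part of the original answer: the name strip_a_tags is hypothetical, and it assumes attribute values never contain a literal ">" character, which a regex cannot handle reliably.

```python
import re

# "<a" or "</a", then either ">" directly or whitespace plus attributes.
# Requiring whitespace after the name keeps <abbr>, <article>, etc. intact.
A_TAG = re.compile(r'</?a(?:\s[^>]*)?>', re.IGNORECASE)


def strip_a_tags(html):
    """Remove <a ...> and </a> tags, keeping the text between them."""
    return A_TAG.sub('', html)
```

Calling strip_a_tags on the input from the question should leave the p tag and the link text in place, and an <abbr> element in the same string would be left untouched.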




Answer 2:


Looks like a perfect case for BeautifulSoup's unwrap() method:

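from bs4 import BeautifulSoup

data = '''<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'''
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(data, 'html.parser')
p_tag = soup.find('p')
# unwrap() removes the tag itself but keeps its contents in place
for a_tag in p_tag.find_all('a'):
    a_tag.unwrap()
print(p_tag)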

This gives:

<p> Hello link text1 and link text2 ! </p>



Answer 3:


You can use an HTML parser library for this, such as BeautifulSoup, among others. I am not sure of the details, but you may find something useful here
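
One parser-based option that needs no third-party package is the standard library's html.parser module. The sketch below is an assumption about what such a solution might look like, not the resource this answer pointed to. The names LinkStripper and strip_links are hypothetical. It re-emits the markup as it goes and skips only the a tags; comments and doctype declarations are dropped in this simplified version.

```python
from html.parser import HTMLParser


class LinkStripper(HTMLParser):
    """Re-emit HTML, leaving out <a> start and end tags (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            # get_starttag_text() returns the start tag as it appeared in the input
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, unchanged
        if tag != "a":
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "a":
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        self.parts.append("&#%s;" % name)


def strip_links(html):
    """Return html with every a tag removed and its text kept."""
    parser = LinkStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Because this works on parsed tags rather than raw characters, it is not confused by tag names that merely begin with "a".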



Source: https://stackoverflow.com/questions/24157298/replace-html-links-with-text
