Python Regex can't find substring but it should

情深已故 2021-01-25 20:49

I am trying to parse HTML with BeautifulSoup to extract the webpage title. Sometimes this does not work because the website is badly written, for example when it contains a bad end tag. W

2 answers

无人共我 (original poster)

2021-01-25 21:09

If you want to grab the text between the <title> and </title> tags, you should use this regexp:

    import re

    pattern = "<title>([^<]+)"
    re.findall(pattern, html_string)
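The pattern in this answer only matches a lowercase `<title>` opening tag with no attributes, and it returns the text with any surrounding whitespace. A slightly more tolerant variant might look like the sketch below. The sample `html_string`, the `TITLE_RE` name, and the `extract_title` helper are all hypothetical and introduced only for illustration; the input simulates the malformed closing tag described in the question.

```python
import re

# Hypothetical sample input: the closing tag is malformed ("<\title>"),
# similar to the bad end tag mentioned in the question.
html_string = "<html><head><TITLE lang='en'>\n  Example page <\\title></head></html>"

# Accept any letter case and optional attributes on the opening tag,
# then capture everything up to the next "<".
TITLE_RE = re.compile(r"<title\b[^>]*>([^<]+)", re.IGNORECASE)


def extract_title(html):
    """Return the stripped title text, or None if no non-empty title is found."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


print(extract_title(html_string))
```

Because the capture stops at the first `<`, the result does not depend on the closing tag being well formed. This is still a regex heuristic rather than a real HTML parser, so it can be fooled by, for instance, a `<title>` string inside a comment or script.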
