How to obtain the content between a tag and its ending in HTML using Python's Beautiful Soup?

广开言路 2021-01-24 02:07

I have an HTML line as follows:

Is this model too thin for Yves Saint Laurent?

I would lik

2 answers

离开以前 (original poster)

2021-01-24 02:21
Instead of regular expressions, you should use an HTML parser such as BeautifulSoup. For more complicated cases, you can also use the etree library with XPath.
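
As a rough illustration of the parser route, here is a minimal sketch with BeautifulSoup. The `<a>` tag and its `href` are assumptions made for the example, since the markup of the line in the question is not shown; swap in whatever tag your HTML actually uses.

```python
from bs4 import BeautifulSoup  # third-party package: pip install beautifulsoup4

# Hypothetical markup: the <a> tag and href are placeholders, not the question's HTML.
html_string = '<a href="#">Is this model too thin for Yves Saint Laurent?</a>'

# "html.parser" is the parser that ships with Python, so no extra dependency is needed.
soup = BeautifulSoup(html_string, "html.parser")

tag = soup.find("a")  # first <a> element, or None if there is none
text = tag.get_text(strip=True) if tag is not None else None
print(text)
```

`get_text()` returns only the text inside the element, so attributes and nested tags do not need to be handled by hand.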

Still, if you want to use regex:

A regular expression is a domain-specific language that makes string parsing and processing much easier. Although some people may disagree, regular expressions often provide more elegant solutions than looping over a string ever could.
```
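import re

# The tags of the original HTML line are not shown; <a>...</a> is a placeholder assumption.
html_string = '<a>Is this model too thin for Yves Saint Laurent? </a>'
# (?<=>) requires a preceding '>' and (?=<) a following '<'; neither is part of the match.
regex = re.compile(r'(?<=>).*(?=<)')
result = regex.findall(html_string)[0]  # text between the tags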
```
This regex uses a look-behind, `(?<=>)`, and a look-ahead, `(?=<)`, so the surrounding `>` and `<` must be present but are not included in the match. Learning regular expressions takes a considerable amount of time, so I recommend going through a good tutorial or book on regex.
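
One caveat worth seeing in code: `.*` is greedy, so on a line with more than one tag the pattern above swallows everything between the first `>` and the last `<`. The sketch below uses an invented two-tag string (an assumption, not the question's HTML) to compare it with a stricter pattern that refuses to cross angle brackets.

```python
import re

# Hypothetical input with two tags, invented for illustration.
html_string = "<b>first</b> and <i>second</i>"

# Greedy: runs from the first '>' to the last '<', crossing tag boundaries.
greedy = re.findall(r"(?<=>).*(?=<)", html_string)

# Stricter: only characters that are not angle brackets, followed by a closing tag.
per_tag = re.findall(r"(?<=>)[^<>]+(?=</)", html_string)

print(greedy)   # ['first</b> and <i>second']
print(per_tag)  # ['first', 'second']
```

Even the stricter pattern breaks on nested tags, which is one more reason to prefer a parser.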