Scraping data from the tag names in Python

前端未结


Hi, I am trying to scrape user data from a website. I need the user IDs, which are available in the tag names themselves. I am trying to scrape the UID using Python Selenium and beautiful s

2 Answers

渐次进展

2021-01-25 07:35

You can use the .get method to read a tag's attributes, such as its id.

For the case in your question:

soup.get('id')

Note that .get reads an attribute of a single tag, so if the page contains many tags with an id, select the specific tag with find or find_all first and call .get on the result.
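As a rough sketch of that approach, the snippet below narrows the search with find_all and then calls .get('id') on each tag. The sample markup, the div tag, and the id values are made up for illustration; the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page's tags and id values may differ.
sample_html = """
<div id="UID_abc123-SRC_1">first user</div>
<div id="UID_xyz789-SRC_2">second user</div>
<div>no id here</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Narrow down to the tags of interest, then read each tag's id attribute.
# Tag.get returns None when the attribute is missing, so skip those tags.
ids = [tag.get("id") for tag in soup.find_all("div") if tag.get("id")]
print(ids)
```

Using .get rather than tag["id"] avoids a KeyError on tags that have no id attribute.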

余生分开走

2021-01-25 07:40
Assuming the id attribute value is always in the format UID_ followed by one or more alphanumeric characters followed by -SRC_ followed by one or more digits:
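```
import re
from bs4 import BeautifulSoup

# `html` is assumed to hold the page source (for example, Selenium's driver.page_source).
soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly

# Group 1 captures the UID part of the id attribute.
pattern = re.compile(r"UID_(\w+)-SRC_\d+")

# Find the first div whose id matches the pattern; avoid shadowing the built-in id().
element_id = soup.find("div", id=pattern)["id"]

uid = pattern.match(element_id).group(1)
print(uid)
```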
Here we use BeautifulSoup to search for an id attribute value that matches a specific regular expression. The pattern contains a capturing group (\w+) that lets us extract the UID value.
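To see the capturing group in isolation, here is a small sketch that applies the same pattern to a few id strings using only the re module. The sample id values are invented for illustration, and the use of fullmatch (so that the whole attribute value must fit the format) is an assumption rather than part of the answer above.

```python
import re

pattern = re.compile(r"UID_(\w+)-SRC_\d+")

# Hypothetical id values, made up for illustration.
sample_ids = ["UID_abc123-SRC_1", "UID_xyz789-SRC_22", "sidebar"]

uids = []
for value in sample_ids:
    # fullmatch requires the entire string to fit the pattern.
    match = pattern.fullmatch(value)
    if match:
        # group(1) is the text captured by (\w+), i.e. the UID.
        uids.append(match.group(1))

print(uids)
```

Values that do not fit the format, such as "sidebar", are skipped instead of raising an AttributeError on a None match.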