Scraping data from tag names in Python

伪装坚强ぢ 2021-01-25 06:40

Hi, I am trying to scrape user data from a website. I need the user IDs, which are available in the tag names themselves. I am trying to scrape the UID using Python, Selenium, and beautiful s

2 answers
  •  余生分开走
    2021-01-25 07:40

    Assuming the id attribute value is always in the format UID_ followed by one or more alphanumeric characters followed by -SRC_ followed by one or more digits:

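    import re
    from bs4 import BeautifulSoup
    
    # html is assumed to hold the page source (e.g. driver.page_source from Selenium)
    soup = BeautifulSoup(html, "html.parser")
    
    # Capture the UID between "UID_" and "-SRC_<digits>"
    pattern = re.compile(r"UID_(\w+)-SRC_\d+")
    element_id = soup.find("div", id=pattern)["id"]
    
    uid = pattern.match(element_id).group(1)
    print(uid)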
    

    Here we use BeautifulSoup to find a div whose id attribute matches a regular expression. The pattern contains a capturing group, (\w+), which lets us extract the UID value.
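
If the page contains several users, the same idea can probably be extended with find_all. The following is only a minimal sketch under the same assumption about the id format; the function name extract_uids and the sample markup are hypothetical and not taken from the question.

```python
import re
from bs4 import BeautifulSoup

# Same assumed id format as above: UID_<word characters>-SRC_<digits>
UID_PATTERN = re.compile(r"UID_(\w+)-SRC_\d+")


def extract_uids(html):
    """Return the UID part of every matching div id, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    uids = []
    for div in soup.find_all("div", id=UID_PATTERN):
        # BeautifulSoup matches the pattern anywhere in the id,
        # so re-check that the id actually starts with "UID_".
        match = UID_PATTERN.match(div["id"])
        if match:
            uids.append(match.group(1))
    return uids


# Hypothetical markup, for illustration only
sample_html = """
<div id="UID_abc123-SRC_1">first user</div>
<div id="UID_xyz789-SRC_22">second user</div>
<div id="sidebar">not a user</div>
"""

print(extract_uids(sample_html))
```

When the page is loaded with Selenium, the driver's page_source attribute could be passed in as html.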
