Matching partial ids in BeautifulSoup

一整个雨季 2020-11-29 07:21

I'm using BeautifulSoup. I have to find any reference to the <div> tags with an id like post-#, where # is a number.

For example: post-45, post-334.

4 Answers
  • 2020-11-29 07:52

    Since the question asks to match "post-" followed by a number, it's better to be precise:

    import re
    [...]
    soupHandler.findAll('div', id=re.compile(r"^post-\d+"))
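
To see how the patterns suggested in this thread differ, here is a minimal sketch that uses only the re module. It assumes the compiled pattern is applied with search() semantics (so an unanchored pattern can match anywhere in the id). The ids "post-", "repost-7" and "post-45-draft" are made-up edge cases, not from the question.

```python
import re

# Sample id values; the last three are hypothetical edge cases.
ids = ["post-45", "post-334", "post-", "repost-7", "post-45-draft"]

patterns = {
    "unanchored": re.compile(r"post-"),
    "prefix": re.compile(r"^post-"),
    "prefix+digits": re.compile(r"^post-\d+"),
    "exact": re.compile(r"^post-\d+$"),
}

# Assumption: matching uses search(), so only ^ and $ pin the pattern
# to the start and end of the id.
matches = {
    name: [i for i in ids if pattern.search(i)]
    for name, pattern in patterns.items()
}

for name, matched in matches.items():
    print(f"{name:14} {matched}")
```

Only the last pattern rejects everything except "post-" followed by digits and nothing else. Whether you need that strictness depends on which other ids appear on the page.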
    
  • 2020-11-29 08:01
    soupHandler.findAll('div', id=re.compile(r"^post-\d+$"))
    

    looks right to me.

  • 2020-11-29 08:01

    This works for me:

    from bs4 import BeautifulSoup
    import re
    
    html = '<div id="post-45">...</div> <div id="post-334">...</div>'
    soupHandler = BeautifulSoup(html, 'html.parser')
    
    for match in soupHandler.find_all('div', id=re.compile("post-")):
        print(match.get('id'))
    
    >>> 
    post-45
    post-334
    
  • 2020-11-29 08:03

    You can pass a function to findAll:

    >>> print(soupHandler.findAll('div', id=lambda x: x and x.startswith('post-')))
    [<div id="post-45">...</div>, <div id="post-334">...</div>]
    

    Or a regular expression:

    >>> print(soupHandler.findAll('div', id=re.compile('^post-')))
    [<div id="post-45">...</div>, <div id="post-334">...</div>]
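
For a version you can paste and run under Python 3, here is a rough sketch that combines the two ideas: a function filter that uses a regular expression internally. The helper name is_post_id and the extra "repost-7" and id-less divs are made up for illustration. It assumes bs4 is installed and that the filter may be called with None for tags that have no id.

```python
import re
from bs4 import BeautifulSoup

html = (
    '<div id="post-45">...</div>'
    '<div id="post-334">...</div>'
    '<div id="repost-7">...</div>'  # hypothetical id that should not match
    '<div>no id</div>'              # hypothetical tag without an id
)
soup = BeautifulSoup(html, "html.parser")


def is_post_id(value):
    # Guard against None (tag without an id), then require the whole
    # value to be "post-" followed by one or more digits.
    return value is not None and re.fullmatch(r"post-\d+", value) is not None


post_ids = [div["id"] for div in soup.find_all("div", id=is_post_id)]
# Pull out the numeric part after the first hyphen.
post_numbers = [int(i.split("-", 1)[1]) for i in post_ids]

print(post_ids)
print(post_numbers)
```

A named function is a bit more verbose than the lambda, but it leaves room for extra checks if the id format turns out to be less regular than expected.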
    