BeautifulSoup order of occurrence of Tags

深忆病人 2021-01-24 03:42

Consider the following situation:

tag1 = soup.find(**data_attrs)
tag2 = soup.find(**delim_attrs)

Is there a way to find out which tag occurred first in the document?

1 answer
  • 2021-01-24 04:27

    No, BeautifulSoup tags don't record their position in the page. You'd have to collect all tags in document order and look up the index of each of your two tags in that list.

    Using the standard sample BeautifulSoup tree:

    >>> tag1 = soup.find(id='link1')
    >>> tag2 = soup.find(id='link2')
    >>> tag1, tag2
    (<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>)
    >>> all_tags = soup.find_all(True)
    >>> all_tags.index(tag1)
    6
    >>> all_tags.index(tag2)
    7
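
One caveat with list.index(): it compares tags with ==, and bs4 treats two tags with the same name, attributes, and contents as equal, so an identical tag earlier in the page could be reported instead. Below is a minimal sketch that compares by identity; the markup and the helper name document_position are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; any parsed document works the same way.
html = """
<div>
  <a class="sister" id="link1">Elsie</a>
  <span class="delim">|</span>
  <a class="sister" id="link2">Lacie</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")


def document_position(soup, tag):
    """Return the index of `tag` among all tags of `soup`, in document order.

    Uses `is` rather than `==` so that a structurally identical tag
    elsewhere in the tree is never mistaken for `tag`.
    """
    for position, candidate in enumerate(soup.find_all(True)):
        if candidate is tag:
            return position
    raise ValueError("tag is not part of this soup")


tag1 = soup.find(id="link2")
tag2 = soup.find(class_="delim")

# Whichever tag has the smaller index occurs first in the document.
first = tag1 if document_position(soup, tag1) < document_position(soup, tag2) else tag2
print(first)
```

Each lookup walks the whole tree again, so for many lookups it is probably cheaper to call find_all(True) once and reuse the list.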
    

    I'd use soup.find_all() with a function that matches both kinds of tag instead; that way you get a single list of the matching tags in document order and can see their relative order directly:

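    tag_match = lambda el: (
        getattr(el, 'name', None) in ('tagname1', 'tagname2') and
        el.attrs.get('attributename') == 'something' and
        # 'class' may be missing, so default to an empty list
        'classname' in el.attrs.get('class', [])
    )
    # find_all() (not find()) returns every match, in document order
    tags = soup.find_all(tag_match)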
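
As a concrete illustration of the function filter, here is a small runnable sketch. The markup, the class names data and delim, and the function name is_data_or_delim are assumptions standing in for whatever data_attrs and delim_attrs select in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with "data" tags separated by "delim" tags.
html = """
<p>
  <span class="delim">--</span>
  <a class="data" id="first">one</a>
  <span class="delim">--</span>
  <a class="data" id="second">two</a>
</p>
"""
soup = BeautifulSoup(html, "html.parser")


def is_data_or_delim(el):
    """Match either kind of tag in a single pass over the tree."""
    classes = el.get("class") or []  # 'class' is a list, or absent
    is_data = el.name == "a" and "data" in classes
    is_delim = el.name == "span" and "delim" in classes
    return is_data or is_delim


# The result list is in document order, so relative order is just list order.
ordered = soup.find_all(is_data_or_delim)
kinds = [el["class"][0] for el in ordered]
print(kinds)
```

Because both kinds of tag end up in one list, checking what precedes or follows a given tag becomes plain list indexing.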
    

    Alternatively, use the .next_siblings iterator to loop over the elements that follow a tag within the same parent and check whether the delimiter comes next.
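
The sibling approach could look roughly like this. It is only a sketch: the markup, the ids, and the helper name comes_later_as_sibling are invented for the example, and it only works when both tags share the same parent.

```python
from bs4 import BeautifulSoup

# Hypothetical markup in which both tags are children of the same <p>.
html = '<p><a id="data">value</a> some text <span id="delim">|</span><b>after</b></p>'
soup = BeautifulSoup(html, "html.parser")

data_tag = soup.find(id="data")
delim_tag = soup.find(id="delim")


def comes_later_as_sibling(start, target):
    """Return True if `target` appears among the later siblings of `start`."""
    # .next_siblings yields text nodes as well as tags; identity comparison
    # skips over the text nodes without special handling.
    return any(sibling is target for sibling in start.next_siblings)


print(comes_later_as_sibling(data_tag, delim_tag))
print(comes_later_as_sibling(delim_tag, data_tag))
```

If the two tags do not share a parent, .next_elements walks the rest of the document instead of only the siblings and can be used in the same way.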
