How to make Beautiful Soup (bs4) match one, and only one, CSS class

说谎 2020-12-10 20:28

I am using the following code to match all div elements that have the CSS class "ad_item".

soup.find_all('div', class_="ad_item")

problem that I have i

7 answers
  • 2020-12-10 21:01

    You can pass a lambda function to the find and find_all methods.

    soup.find_all(lambda x:
        x.name == 'div' and
        'ad_item' in x.get('class', []) and
        'ad_ex_item' not in x['class']
    )
    

    Using x.get('class', []) avoids a KeyError for div tags that have no class attribute.
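A small sketch of that difference, assuming bs4 is installed and using made-up markup (the HTML string and variable names below are illustrative, not from the question):

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second div has no class attribute at all.
html = '<div class="ad_item">a</div><div>b</div>'
soup = BeautifulSoup(html, "html.parser")
with_class, without_class = soup.find_all("div")

# .get() returns the supplied default when the attribute is missing.
classes_present = with_class.get("class", [])
classes_missing = without_class.get("class", [])

# Subscript access raises KeyError when the attribute is missing.
try:
    without_class["class"]
    raised = False
except KeyError:
    raised = True
```

In the lambda above, the later x['class'] lookup is only reached after the x.get(...) test has passed, so it should not raise for tags without a class.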

    If you need to exclude more than one class, you can replace the last condition with:

        not any(c in x['class'] for c in {'ad_ex_item', 'another_class'})
    

    And if you want to exclude only the elements that carry all of the listed classes at once, you can use:

       not all(c in x['class'] for c in {'ad_ex_item', 'another_class'})
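To see how the two conditions differ, here is a minimal sketch on made-up markup. The HTML, the id values, and the helper is_ad_item are assumptions for illustration only. The last query is an alternative that is not part of the answer above: comparing the whole class list, which should keep only tags whose single class is ad_item.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document; ids are only there to label the results.
html = """
<div class="ad_item" id="plain"></div>
<div class="ad_item ad_ex_item" id="one"></div>
<div class="ad_item another_class" id="other"></div>
<div class="ad_item ad_ex_item another_class" id="both"></div>
<div id="no_class"></div>
"""
soup = BeautifulSoup(html, "html.parser")
excluded = {"ad_ex_item", "another_class"}

def is_ad_item(tag):
    # Illustrative helper: a div whose class list contains "ad_item".
    return tag.name == "div" and "ad_item" in tag.get("class", [])

# not any(...): drop a div that carries at least one excluded class.
none_of = soup.find_all(
    lambda t: is_ad_item(t) and not any(c in t["class"] for c in excluded)
)

# not all(...): drop only a div that carries every excluded class.
not_all_of = soup.find_all(
    lambda t: is_ad_item(t) and not all(c in t["class"] for c in excluded)
)

# Alternative (not from the answer): require the class list to be exactly ["ad_item"].
exact = soup.find_all(lambda t: t.name == "div" and t.get("class") == ["ad_item"])

none_ids = [t["id"] for t in none_of]
not_all_ids = [t["id"] for t in not_all_of]
exact_ids = [t["id"] for t in exact]
```

With this sample, the not any(...) form keeps only the plain div, while the not all(...) form also keeps the divs that have just one of the two excluded classes.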
    