beautifulsoup, html5lib: module object has no attribute _base

逝去的感伤 2020-12-04 18:57

After updating my packages, I get this new error:

class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
AttributeError: 'module' object ha...


        
8 answers
  • 2020-12-04 19:15

    This is an issue with the upstream package html5lib: https://bugs.launchpad.net/beautifulsoup/+bug/1603299. To fix it, force a downgrade to an older version:

    pip install --upgrade html5lib==1.0b8

  • 2020-12-04 19:16

    I upgraded both beautifulsoup4 and html5lib, and that resolved the issue.

    pip install --upgrade beautifulsoup4
    pip install --upgrade html5lib
    
  • 2020-12-04 19:26

    Switching versions did not work for me. In the end, based on this issue, I edited the relevant file at ~/.local/lib/python3.7/site-packages/bs4/builder/_html5lib.py for my purposes.
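
    The answer above does not say what was changed in the file. As a rough sketch only: if the `_base` submodule is no longer available under that name in the installed html5lib (the alternative name `base` tried first here is an assumption), a tolerant lookup could try each name in turn. `import_first_available` is a hypothetical helper, not part of bs4 or html5lib.

```python
import importlib


def import_first_available(package, candidates=("base", "_base")):
    """Import and return the first submodule of `package` that exists.

    Hypothetical helper: tries each candidate name in order and returns
    the imported module, or raises ImportError if none can be imported.
    """
    for name in candidates:
        try:
            return importlib.import_module(f"{package}.{name}")
        except ImportError:
            # This name is not importable; try the next candidate.
            continue
    raise ImportError(f"none of {candidates!r} found in {package!r}")


if __name__ == "__main__":
    try:
        module = import_first_available("html5lib.treebuilders")
        print("TreeBuilder base module:", module.__name__)
    except ImportError as exc:
        print("html5lib treebuilder base not importable:", exc)
```

    Running it before editing anything shows which name, if any, the installed html5lib provides. Editing files under site-packages is overwritten on the next reinstall, so this is a diagnostic aid rather than a permanent fix.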

  • 2020-12-04 19:30

    Just install html5lib with the command below. If you install it the normal way, you will have to run your spider with Python 2.

    sudo pip3 install html5lib==0.9999999
    
  • 2020-12-04 19:32

    Edit (Nov 2017): it seems this no longer works.

    I finally figured it out. A search engine turned up nothing, but the problem is referenced on beautifulsoup's issue tracker: https://bugs.launchpad.net/beautifulsoup/+bug/1603299

    It works again with html5lib v0.9999999 (7 nines), pinned like this:

    "html5lib<=0.9999999"
    
  • 2020-12-04 19:38

    I ran into the same problem. I don't know what you were trying to do, but it happened to me when I tried to read an HTML file in pandas using pd.read_html().

    The problem was fixed by upgrading all of beautifulsoup4, html5lib, and lxml, like this:

    pip install --upgrade beautifulsoup4
    pip install --upgrade html5lib
    pip install --upgrade lxml
    

    Then restart your Python environment. After that, it worked.
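
    Since the answers in this thread disagree about which versions work, it can help to check what is actually installed after upgrading and restarting. A minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); `installed_versions` is a hypothetical helper name:

```python
from importlib import metadata


def installed_versions(names=("beautifulsoup4", "html5lib", "lxml")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

    Run it with the same interpreter that raises the error, because pip and python can point at different environments.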
