How to re-install lxml?

一个人的身影 2020-12-09 16:45

Python version and Device used

  • Python 2.7.5
  • Mac OS X 10.7.5
  • BeautifulSoup 4.2.1.

I'm following the BeautifulSoup tu

4 answers
  • 2020-12-09 17:10

    I am using BeautifulSoup 4.3.2 and OS X 10.6.8. I also have a problem with improperly installed lxml. Here are some things that I found out:

    First of all, check this related question: Removed MacPorts, now Python is broken

    Now, in order to check which builders for BeautifulSoup 4 are installed, try

    >>> import bs4
    >>> bs4.builder.builder_registry.builders
    

    If you don't see your favorite builder, then it is not installed, and you will see an error as above ("Couldn't find a tree builder...").
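
To see what each registered builder offers, the class names can be listed together with their feature strings. This is only a sketch, assuming Python 3 and that every builder class exposes a `features` list, as the stock bs4 builders do; `describe_builders` is a helper name made up for this example.

```python
def describe_builders(builders):
    """Return (class name, features) pairs for each builder class."""
    # getattr with a default keeps this safe for classes without `features`.
    return [(b.__name__, list(getattr(b, "features", []))) for b in builders]


if __name__ == "__main__":
    try:
        from bs4.builder import builder_registry
    except ImportError as exc:
        print("bs4 is not importable:", exc)
    else:
        for name, features in describe_builders(builder_registry.builders):
            print(name, features)
```

If no line mentions "lxml" in its features, BeautifulSoup cannot use lxml in this interpreter, whatever `pip` reports.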

    Also, being able to import lxml doesn't mean that everything is fine.

    Try

    >>> import lxml
    >>> import lxml.etree
    

    To understand what's going on, go to the bs4 installation and unpack the egg (tar -xvzf). Look at the bs4.builder package: inside it you should see files such as _lxml.py and _html5lib.py. So you can also try

    >>> import bs4.builder._htmlparser
    >>> import bs4.builder._lxml
    >>> import bs4.builder._html5lib
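
The same check can be run in one go, keeping the full traceback for each module that fails. This is a rough sketch assuming Python 3; `probe` is a hypothetical helper, not part of bs4.

```python
import importlib
import traceback


def probe(module_names):
    """Try to import each module; map its name to None or a traceback string."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:
            # Keep the whole traceback: with a broken lxml build, the useful
            # detail (missing library, wrong architecture) is in its last lines.
            results[name] = traceback.format_exc()
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    modules = [
        "bs4.builder._htmlparser",
        "bs4.builder._lxml",
        "bs4.builder._html5lib",
    ]
    for name, error in probe(modules).items():
        print(name, "OK" if error is None else "FAILED\n" + error)
```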
    

    If there is a problem, you will see why a particular module cannot be loaded. Notice how, at the end of builder/__init__.py, it loads all those modules and ignores whatever failed to load:

    # Builders are registered in reverse order of priority, so that custom
    # builder registrations will take precedence. In general, we want lxml
    # to take precedence over html5lib, because it's faster. And we only
    # want to use HTMLParser as a last result.
    from . import _htmlparser
    register_treebuilders_from(_htmlparser)
    try:
        from . import _html5lib
        register_treebuilders_from(_html5lib)
    except ImportError:
        # They don't have html5lib installed.
        pass
    try:
        from . import _lxml
        register_treebuilders_from(_lxml)
    except ImportError:
        # They don't have lxml installed.
        pass
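
The effect of that pattern can be reproduced without bs4. The sketch below is an illustration, not the library's code: it imports arbitrary module names in order, silently skips those that raise ImportError, and puts later successes first, which mirrors the "reverse order of priority" comment above. `register_available` is a hypothetical name.

```python
import importlib


def register_available(candidates):
    """Import candidates lowest priority first; skip the ones that fail."""
    registered, skipped = [], {}
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # Swallowed, as in the try/except blocks above; recorded here
            # only so the reason can be inspected afterwards.
            skipped[name] = str(exc)
        else:
            # Insert at the front so that later registrations win.
            registered.insert(0, name)
    return registered, skipped


if __name__ == "__main__":
    order, failures = register_available(["html.parser", "html5lib", "lxml"])
    print("priority order:", order)
    print("silently skipped:", failures)
```

Because the failure is swallowed, a broken lxml shows up only later, as a missing tree builder, and not as an import error.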
    
  • 2020-12-09 17:18

    FWIW, I ran into a similar problem (python 3.6, os x 10.12.6) and was able to solve it simply by doing (first command is just to signify that I was working in a conda virtualenv):

    $ source activate ml-general
    $ pip uninstall lxml
    $ pip install lxml
    

    I tried more complicated things first, because BeautifulSoup was working correctly with an identical command through Jupyter+iPython, but not through PyCharm's terminal in the same virtualenv. Simply reinstalling lxml as above solved the problem.
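
When the same command behaves differently in two front ends, it can help to confirm that both run the same interpreter and see the same lxml. A minimal sketch, assuming Python 3; `lxml_report` is a made-up helper, and running it in each environment lets the outputs be compared.

```python
import sys


def lxml_report():
    """Report which interpreter is running and which lxml it can load."""
    report = {"executable": sys.executable, "lxml": None, "error": None}
    try:
        from lxml import etree
    except ImportError as exc:
        report["error"] = str(exc)
    else:
        # LXML_VERSION is a tuple of integers, e.g. (4, 9, 3, 0).
        report["lxml"] = ".".join(str(part) for part in etree.LXML_VERSION)
    return report


if __name__ == "__main__":
    print(lxml_report())
```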

  • 2020-12-09 17:25

    If you are using Python 2.7 on Ubuntu/Debian, this worked for me:

    $ sudo apt-get build-dep python-lxml
    $ sudo pip install lxml 
    

    Test it like:

    mona@pascal:~/computer_vision/image_retrieval$ python
    Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
    [GCC 4.8.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import lxml
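
A bare `import lxml` only shows that the package is found. To check that BeautifulSoup can actually use it, one option is to parse a trivial document with the "lxml" feature. This is a sketch assuming Python 3 and a bs4 version that exports `FeatureNotFound`; `parser_available` is a hypothetical helper.

```python
def parser_available(feature="lxml"):
    """Return True if BeautifulSoup can build a tree with the given parser."""
    try:
        from bs4 import BeautifulSoup, FeatureNotFound
    except ImportError:
        return False  # bs4 itself is missing or broken
    try:
        BeautifulSoup("<p>ok</p>", feature)
    except FeatureNotFound:
        return False  # no registered tree builder offers this feature
    return True


if __name__ == "__main__":
    for feature in ("lxml", "html5lib", "html.parser"):
        print(feature, parser_available(feature))
```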
    
  • 2020-12-09 17:28

    On Debian/Ubuntu, use apt-get: sudo apt-get install python3-lxml

    On Mac OS X, a MacPorts port of lxml is available. Try something like: sudo port install py27-lxml

    http://lxml.de/installation.html may be helpful.
