Pandas: read_html

日久生厌 2021-02-18 13:09

I'm trying to extract the list of US states from a Wikipedia URL, and I'm using Python pandas to do it.

import pandas as pd
import html5lib
f_states = pd.read_html('https://si


        
5 Answers
  • 2021-02-18 14:02

    Tested with Python 3.4 on a Mac.

    Create a new pyvenv virtual environment, then install the packages:

    pip install pandas
    pip install lxml
    pip install html5lib
    pip install BeautifulSoup4
    

    Then run your example and it should work:

    import pandas as pd
    import html5lib
    f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
    
  • 2021-02-18 14:02

    Also consider installing your required packages with conda, available at https://www.continuum.io/downloads. Instead of running pip install, you would run conda install for each package:

    $ conda install html5lib 
    
  • 2021-02-18 14:05

    You need to install lxml using pip.

    pip install lxml
    

    This worked for me.
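
Before installing anything, it can help to check which of the parser libraries mentioned in this thread are already importable in the environment you are actually running. The sketch below uses only the standard library; the helper name `missing_parsers` and the tuple `PARSER_MODULES` are made up for illustration and are not part of pandas.

```python
import importlib.util

# Import names of the optional parser libraries mentioned in this thread.
# Note: the BeautifulSoup4 package is imported as "bs4".
PARSER_MODULES = ("lxml", "html5lib", "bs4")


def missing_parsers(modules=PARSER_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_parsers()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All parser libraries are importable.")
```

Running this with the same interpreter (or notebook kernel) that runs your pandas code shows whether the pip install landed in the environment you expected.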

  • 2021-02-18 14:06

    For this, you only need to install pandas and lxml:

    pip install pandas
    pip install lxml
    

    Then import pandas and run your program:

    import pandas as pd
    f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
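
Once the import error is gone, note that `pd.read_html` returns a list of DataFrames (one per table found), so you still have to pick the table you want. Here is a minimal offline sketch of that step. It assumes a small hand-written HTML snippet in place of the Wikipedia page, and the helper name `first_table` is hypothetical; the real page may contain several tables in a different order.

```python
from io import StringIO

# A tiny stand-in for the Wikipedia page, so the example runs offline.
# This is a two-row sample, not the real page content.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>Abbreviation</th></tr>
  <tr><td>Alabama</td><td>AL</td></tr>
  <tr><td>Alaska</td><td>AK</td></tr>
</table>
"""


def first_table(html):
    """Parse `html` and return the first table as a DataFrame.

    Returns None if pandas or a required parser library is not installed.
    """
    try:
        import pandas as pd
        # read_html always returns a list of DataFrames, one per table.
        tables = pd.read_html(StringIO(html))
    except ImportError:
        return None
    return tables[0]


states = first_table(SAMPLE_HTML)
if states is not None:
    print(states["State"].tolist())
```

With the real URL you would index into `f_states` the same way, after checking which element of the list holds the states table.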
    
  • 2021-02-18 14:09

    If your environment is an Anaconda Jupyter notebook, you need a different set of install commands:

    conda install lxml
    conda install html5lib
    conda install BeautifulSoup4
    

    Then run the Python code in the Jupyter notebook:

    import pandas as pd
    f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
    