Scrape tables with Python

没有蜡笔的小新 2021-01-14 23:23

I am trying to scrape tables and convert them into data tables in Python, but I have had little luck with US election data. This is the HTML of the data I want to scrape.

2 answers
  •  旧巷少年郎
    2021-01-14 23:58

    After some time I managed to scrape all the data from this website. The main problem was that the page content is rendered by JavaScript, so I could not scrape it with BeautifulSoup alone. I used selenium + beautifulsoup4: Selenium loads the page in a real browser and returns the rendered HTML, which BeautifulSoup then parses.

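    import time

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Path to the local chromedriver binary; adjust for your machine.
    chrome_path = r"C:\Users\Desktop\chromedriver_win32\chromedriver.exe"
    # Selenium 4 style; older releases accepted the path as the first positional argument.
    driver = webdriver.Chrome(service=Service(chrome_path))
    driver.get('http://www.politico.com/2016-election/primary/results/map/president/arizona/')
    time.sleep(80)  # the page loads very slowly; give the JavaScript time to render
    # Scroll to the bottom so content that only loads on scroll is rendered too.
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)
    html = driver.page_source
    driver.quit()  # the browser is no longer needed once the HTML is captured

    soup = BeautifulSoup(html, 'html.parser')
    for table in soup.find_all('table', {'class': 'results-table'}):
        for tr in table.find_all('tr'):
            # One list of cell texts per table row.
            popular = list(tr.stripped_strings)
            print(popular)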
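
Since the question asks for data tables rather than printed lists, one possible next step is to map each row onto column names and write the result out as CSV. This is only a sketch using the standard library: the column names in `COLUMNS` and the rows in `sample_rows` are hypothetical placeholders, because the thread does not show the real table's columns or contents.

```python
import csv
import io

# Hypothetical column names; the real table's columns are not shown in the thread.
COLUMNS = ["candidate", "percent", "votes"]


def rows_to_records(rows, columns=COLUMNS):
    """Map each scraped row (a list of strings) onto the given column names.

    Rows whose length differs from len(columns) are skipped, since header,
    footer, or empty rows often have a different shape.
    """
    return [dict(zip(columns, row)) for row in rows if len(row) == len(columns)]


def records_to_csv(records, columns=COLUMNS):
    """Serialize the records to CSV text, header line first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Placeholder rows standing in for the lists produced by the scraping loop.
sample_rows = [
    ["Candidate A", "60.0%", "1,200"],
    [],  # an empty <tr> yields an empty list and is skipped
    ["Candidate B", "40.0%", "800"],
]
records = rows_to_records(sample_rows)
print(records_to_csv(records))
```

In the scraping loop, appending each row list to a `rows` list instead of printing it would give `rows_to_records` its input. If pandas is available, the same list of dicts can also be passed to `pandas.DataFrame(records)`.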
    

    Because it is a dynamic web page, I needed to simulate some user actions with Selenium, such as scrolling down the page. I used time.sleep(80) so the page could finish loading; it loads really slowly, which is why the wait is so long. Hope it helps someone.
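
A fixed sleep always costs the full waiting time, even when the table appears sooner. A possible alternative, sketched below, is to poll the page source until the table's class name shows up. This is a hand-rolled helper, not the answer's method: the function name `wait_for_page_source` and its parameters are hypothetical, and it only assumes an object with a `page_source` attribute, as a Selenium driver has. Selenium also ships its own `WebDriverWait` class for this purpose.

```python
import time


def wait_for_page_source(driver, marker, timeout=80.0, poll=1.0):
    """Poll driver.page_source until `marker` appears in it.

    Returns the page source as soon as the marker is present and raises
    TimeoutError if it has not appeared after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{marker!r} not found within {timeout} s")
        time.sleep(poll)


# Intended use with a live Selenium driver, replacing the long sleep:
#   html = wait_for_page_source(driver, 'results-table')
```

The default timeout keeps the 80 seconds from the answer as an upper bound, so the worst case is no slower than the fixed sleep.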
