Web scraping with Selenium

独厮守ぢ 2021-01-07 08:55

I'm trying to scrape this website for the list of company names, code, industry, sector, mkt cap, etc. in the table with Selenium. I'm new to it and have writt

2 Answers
  • 2021-01-07 09:29

    This is entirely doable. The easiest approach is probably a find_elements call (note the plural) that grabs all of the <tr> elements. It returns a list; on each item in that list you can then call find_element (singular), this time locating each cell by class.
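
    A minimal sketch of that two-step lookup, assuming Selenium 4's find_elements(by, value) signature. The function name and the default column tuple are hypothetical; the companyName class is borrowed from the selector in the other answer, and any further column classes would have to be read off the real page. The two locator strings are the literal values of By.TAG_NAME and By.CLASS_NAME. The inner lookup uses the plural call as well, so rows without the cell (headers, spacers) are skipped instead of raising:

```python
TAG_NAME = "tag name"      # same value as selenium's By.TAG_NAME
CLASS_NAME = "class name"  # same value as selenium's By.CLASS_NAME


def rows_to_records(driver, columns=("companyName",)):
    """Return one dict per <tr> that contains a cell for every class in `columns`."""
    records = []
    for row in driver.find_elements(TAG_NAME, "tr"):
        record = {}
        for cls in columns:
            cells = row.find_elements(CLASS_NAME, cls)
            if not cells:
                break  # header or spacer row: a requested cell is missing
            record[cls] = cells[0].text.strip()
        else:
            # Reached only when no column was missing.
            records.append(record)
    return records
```

    Called as rows_to_records(driver), it would return a list such as one dict per table row, keyed by class name.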

    You may also be running into a timing issue: the data you are looking for loads very slowly, so you probably need to wait for it. The best way to do that is to keep checking for the data until it appears, and only then try to read it. A find_elements call (plural again) does not throw an exception when nothing matches; it just returns an empty list. That makes it a decent way to poll for the data to appear.
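
    The polling idea can be sketched as below. The helper name, the 30-second timeout and the 0.5-second poll interval are all assumptions, not values measured against the site; the only Selenium behaviour relied on is that find_elements returns an empty list when nothing matches:

```python
import time


def wait_for_elements(driver, by, value, timeout=30.0, poll=0.5):
    """Call find_elements repeatedly until it returns a non-empty list.

    Raises TimeoutError if nothing has matched once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = driver.find_elements(by, value)
        if found:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no elements matched {value!r} within {timeout}s")
        time.sleep(poll)
```

    A call would look like wait_for_elements(driver, By.TAG_NAME, "tr"), with By imported from selenium.webdriver.common.by.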

  • 2021-01-07 09:39

    The results are in an iframe - switch to it and then get the .page_source:

    from selenium.webdriver.common.by import By

    iframe = driver.find_element(By.CSS_SELECTOR, "#mainContent iframe")
    driver.switch_to.frame(iframe)
    

    I would also add a wait for the table to be loaded:

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    wait = WebDriverWait(driver, 10)
    
    # locate and switch to the iframe
    iframe = driver.find_element(By.CSS_SELECTOR, "#mainContent iframe")
    driver.switch_to.frame(iframe)
    
    # wait for the table to load
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.companyName')))
    
    print(driver.page_source)
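
    Once the page source is available, the company names still have to be pulled out of the HTML. Below is a rough sketch using only the standard library's html.parser; the class and function names are hypothetical, and the only thing taken from the page is the companyName class used in the wait above. A dedicated HTML library would do the same in fewer lines:

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextCollector(HTMLParser):
    """Collect the text of every element that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0   # > 0 while inside a matching element
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1          # nested tag inside a match
        elif self.css_class in classes:
            self.depth = 1           # entering a new match
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.texts[-1] += data


def extract_by_class(html, css_class):
    """Return the stripped text of each element with class `css_class`."""
    parser = ClassTextCollector(css_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.texts]
```

    For example, extract_by_class(driver.page_source, "companyName") would return the company names as a list of strings, provided the driver has already switched into the iframe.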
    