I'm trying to scrape this website for the list of company names, code, industry, sector, mkt cap, etc
in the table with Selenium. I'm new to it and have writt
This is totally doable. The easiest approach is probably a find_elements call (note the plural) that grabs all of the <tr> elements. It returns a list, and you can then call find_element (singular) on each row in that list, this time locating each cell by class.
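Roughly, the row-by-row idea could look like the sketch below. It is only an illustration under assumptions: scrape_rows and field_classes are names made up here, and apart from companyName the class names you pass in are guesses that need checking against the real page. It uses the plural call on each row as well, so that header rows lacking a class are skipped instead of raising NoSuchElementException.

```python
# Locator strategy strings. These are the same values as selenium's
# By.TAG_NAME and By.CLASS_NAME, spelled out so the sketch has no imports.
TAG_NAME = "tag name"
CLASS_NAME = "class name"


def scrape_rows(root, field_classes):
    """Return one dict per <tr> under `root` that has every requested field.

    `root` is a driver or element exposing find_elements(by, value).
    `field_classes` maps an output key to a CSS class name (hypothetical
    names; inspect the page to find the real ones).
    """
    records = []
    for row in root.find_elements(TAG_NAME, "tr"):
        record = {}
        for field, css_class in field_classes.items():
            cells = row.find_elements(CLASS_NAME, css_class)
            if not cells:
                break  # header or spacer row: a field is missing, skip it
            record[field] = cells[0].text.strip()
        else:
            # Loop finished without break, so every field was found.
            records.append(record)
    return records
```

With a real driver you would call something like scrape_rows(driver, {"name": "companyName"}) once the table has loaded.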
You may also be running into a timing issue. I noticed that the data you are looking for loads very slowly, so you probably need to wait for it. The best way to do that is to keep checking for it until it appears, and only then read it. A find_elements call (plural again) does not throw an exception when nothing matches; it just returns an empty list, which makes it a decent way to poll for the data to appear.
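A minimal sketch of that polling loop, assuming you pass in a zero-argument callable that wraps your find_elements call. The helper name wait_for_elements and the timeout and poll defaults are made up for illustration, not part of Selenium.

```python
import time


def wait_for_elements(find, timeout=30.0, poll=0.5):
    """Call `find()` repeatedly until it returns a non-empty list.

    Returns the first non-empty result, or an empty list if `timeout`
    seconds pass without any match.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = find()
        if found:
            return found
        if time.monotonic() >= deadline:
            return []  # gave up; the caller decides how to handle this
        time.sleep(poll)
```

With a real driver the call might be wait_for_elements(lambda: driver.find_elements("tag name", "tr")), and an empty result means the table never showed up within the timeout.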
The results are in an iframe. Switch to it and then get the .page_source:
from selenium.webdriver.common.by import By

iframe = driver.find_element(By.CSS_SELECTOR, "#mainContent iframe")
driver.switch_to.frame(iframe)
I would also add a wait for the table to be loaded:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
# locate and switch to the iframe
iframe = driver.find_element(By.CSS_SELECTOR, "#mainContent iframe")
driver.switch_to.frame(iframe)
# wait for the table to load
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.companyName')))
print(driver.page_source)