Python, Selenium, and Beautiful Soup for URL

無奈伤痛 2021-01-27 02:21

I am trying to write a script using Selenium that opens pastebin, runs a search, and prints the result URLs as text. I need only the visible result URLs and nothing else.



        
1 answer
  • 2021-01-27 03:07

    You don't actually need BeautifulSoup. Selenium itself is very good at locating elements:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.keys import Keys
    
    
    browser = webdriver.Firefox()
    browser.get('http://www.pastebin.com')
    
    # locate the search box by its name attribute (find_element_by_* was removed in Selenium 4)
    search = browser.find_element(By.NAME, 'q')
    search.send_keys("test")
    search.send_keys(Keys.RETURN)
    
    # wait for results to appear
    wait = WebDriverWait(browser, 10)
    results = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.gsc-resultsbox-visible")))
    
    # grab results
    for link in results.find_elements(By.CSS_SELECTOR, "a.gs-title"):
        print(link.get_attribute("href"))
    
    # quit() closes every window and ends the driver session
    browser.quit()
    

    Prints:

    http://pastebin.com/VYQTSbzY
    http://pastebin.com/VYQTSbzY
    http://pastebin.com/VAAQCjkj
    ...
    http://pastebin.com/fVUejyRK
    http://pastebin.com/fVUejyRK
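
The output above lists some URLs twice, presumably because each result contains more than one `a.gs-title` link. If only unique URLs are wanted, a minimal sketch like the following could drop the repeats while keeping their order. `unique_in_order` is a hypothetical helper name, and the sample list simply reuses URLs from the output above.

```python
def unique_in_order(urls):
    """Return urls with duplicates removed, keeping first-seen order."""
    # dict keys keep insertion order (Python 3.7+), so fromkeys dedupes in order
    return list(dict.fromkeys(urls))


# sample data copied from the printed output; in the script these would be
# the href values collected in the loop
hrefs = [
    "http://pastebin.com/VYQTSbzY",
    "http://pastebin.com/VYQTSbzY",
    "http://pastebin.com/VAAQCjkj",
]

for url in unique_in_order(hrefs):
    print(url)
```

In the script itself, the hrefs could be collected into a list inside the loop and passed through this helper before printing.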
    

    Note the use of an Explicit Wait, which makes the script wait for the search results to appear before reading them.
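
To make the idea of an explicit wait concrete, here is a rough pure-Python sketch of the polling behind it: call a condition repeatedly until it returns something truthy or a timeout expires. This is an illustration only, not Selenium's implementation; `wait_until`, its parameters, and the `ready` condition are hypothetical names. `WebDriverWait.until` behaves in a similar way and raises `TimeoutException` when the time runs out.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Example: a stand-in condition that becomes true on the third check
attempts = []


def ready():
    attempts.append(1)
    return len(attempts) >= 3


wait_until(ready, timeout=2.0, poll=0.01)
print(len(attempts))
```

The advantage over a fixed `time.sleep` is that the script continues as soon as the condition holds and fails loudly if it never does.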
