Web scraping with selenium and python - xpath with contains text

こ雲淡風輕ζ 提交于 2021-01-28 02:00:22

问题


I will try to make it really short. I am trying to click on a product that came out of a search from a website. Basically there is a list of matching products, and I want to click on the first one which contains the product name I searched in its title. I will post the link of the website so you can inspect its DOM structure: https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and In this case, many contain my query string, and I would simply like to click on the first one.

Here is the snippet of code I wrote for this:

def click_on_first_matching_product(self):
        first_product = WebDriverWait(self.driver, 6).until(
            EC.visibility_of_all_elements_located((By.XPATH, f"//a[@class='df-card__main']/div/div[@class=df-card__title] and contains(text(), '{self.product_code}')"))
        )[0]
        first_product.click()

The problem is that 6 seconds go by and it cant find an element that satisfies the xPath condition i wrote, but I cant figure out how to make it work. I am trying to get a search result a element and check if the title it has down its structure contains the query string I searched. Can I have some help and an explanation please? I am quite new to selenium and XPaths...

Can I please also have a link to a reliable selenium documentation? I am having some hard times trying to find a good one. Maybe one that also explains how to make conditions for xPaths please.


回答1:


You need to consider a couple of things. Your use-case would be either to click on the first search result or to click on the item with respect to the card title. In case of clicking on a definite WebElement inducing WebDriverWait for visibility_of_all_elements_located() will be too expensive.


To click on the item with respect to the card title you have to induce WebDriverWait for the element_to_be_clickable() and you can use the following xpath based Locator Strategies:

  • Using the text CE285A Toner Compatibile Per Hp LaserJet P1102 directly:

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[text()='CE285A Toner Compatibile Per Hp LaserJet P1102']"))).click()
    
  • Using a variable for the text through format():

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    text = "CE285A Toner Compatibile Per Hp LaserJet P1102"
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[text()='{}']".format(text)))).click()
    
  • Using a variable for the text through %s:

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    text = "CE285A Toner Compatibile Per Hp LaserJet P1102"
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[text()='%s']"% str(text)))).click()
    

To click on first search product you have to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:

  • CSS_SELECTOR:

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.df-card>a"))).click()
    
  • XPATH:

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='df-card']/a"))).click()
    

Note : You have to add the following imports :

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC



回答2:


Your xpath seems incorrect.Try following xpath to click on product.

driver.get("https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and")
def click_on_first_matching_product(product_code):
    first_product = WebDriverWait(driver, 6).until(EC.visibility_of_all_elements_located((By.XPATH,"//div[@class='df-card__title' and contains(text(), '{}')]".format(product_code))))[0]
    first_product.click()
click_on_first_matching_product("CE285A")


来源:https://stackoverflow.com/questions/62553465/web-scraping-with-selenium-and-python-xpath-with-contains-text

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!