Selenium WebDriver get_attribute returns truncated value of href attribute when value has entities

Submitted by 吃可爱长大的小学妹 on 2020-06-12 06:47:25

Question


I am trying to read the href attribute of an anchor tag on a page in my application using Selenium WebDriver (Python), but the value returned has part of it stripped off.

Here is the HTML snippet -

<a class="nla-row-text" href="/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0" data-reactid="790">

Here is the code I am using -

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains

driver = webdriver.Firefox()
driver.get("xxxx")

url_from_attr = driver.find_element(By.XPATH,"(//div[@class='nla-children mfr']/div/div/a)[1]").get_attribute("href")

url_from_attr_raw = "%r"%url_from_attr

print(" URL from attribute -->> " + url_from_attr)
print(" Raw string -->> " + url_from_attr_raw)

The output I am getting is -

/shopping/brands?search=kamera&page=0

instead of -

/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0 OR
/shopping/brands?search=kamera&nm=Canon&page=0

Is this because of the entity representation in the URL? The part between the two entities seems to be what gets stripped. Any help or pointers would be great.


Answer 1:


As per the given HTML, there may be an issue with the locator strategy you have used. The XPath ends in a positional index, (...)[1], and is passed to find_element; relying on position alone is error-prone because the first match can change as the page renders. (A Python-style index such as [0] applies to the list returned by find_elements.) For this use case, a more specific expression would be:

url_from_attr = driver.find_element(By.XPATH,"//div[@class='nla-children mfr']/div/div/a[@class='nla-row-text']").get_attribute("href")

The locator can be shortened further as follows:

url_from_attr = driver.find_element(By.XPATH,"//div[@class='nla-children mfr']//a[@class='nla-row-text']").get_attribute("href")

Update A

As per your comment, since you still need indexing, the locator can be used with find_elements and the first item of the returned list:

url_from_attr = driver.find_elements(By.XPATH,"//div[@class='nla-children mfr']//a[@class='nla-row-text']")[0].get_attribute("href")
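
One subtlety with positional indexes: in XPath, a predicate such as a[1] means "the first a within its own parent", which is not the same as "the first match on the page". The sketch below illustrates the difference with the standard library's ElementTree, which implements only a subset of XPath and is used here purely as a stand-in for the browser's XPath engine. The markup and the /first and /second hrefs are hypothetical, shaped like the snippet in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two rows, each with its own nla-row-text link.
MARKUP = """
<div class="nla-children mfr">
  <div><div><a class="nla-row-text" href="/first">A</a></div></div>
  <div><div><a class="nla-row-text" href="/second">B</a></div></div>
</div>
""".strip()

root = ET.fromstring(MARKUP)

# Positional predicate: every <a> that is the first <a> inside its parent.
# Both links qualify, because each one sits in a different parent <div>.
per_parent = [a.get("href") for a in root.findall(".//a[1]")]

# Python index on the full result list: the first match in document order.
all_links = root.findall(".//a[@class='nla-row-text']")
first_overall = all_links[0].get("href")

print(per_parent)
print(first_overall)
```

With Selenium, the second form corresponds to indexing the list returned by find_elements with [0], or to wrapping the whole XPath in parentheses before the [1], as the locator in the question does.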

get_attribute(attribute_name)

As per the source of the Python API:

    def get_attribute(self, name):
        """Gets the given attribute or property of the element.

        This method will first try to return the value of a property with the
        given name. If a property with that name doesn't exist, it returns the
        value of the attribute with the same name. If there's no attribute with
        that name, ``None`` is returned.

        Values which are considered truthy, that is equals "true" or "false",
        are returned as booleans.  All other non-``None`` values are returned
        as strings.  For attributes or properties which do not exist, ``None``
        is returned.

        :Args:
            - name - Name of the attribute/property to retrieve.

        Example::

            # Check if the "active" CSS class is applied to an element.
            is_active = "active" in target_element.get_attribute("class")

        """

        attributeValue = ''
        if self._w3c:
            attributeValue = self.parent.execute_script(
                "return (%s).apply(null, arguments);" % getAttribute_js,
                self, name)
        else:
            resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
            attributeValue = resp.get('value')
            if attributeValue is not None:
                if name != 'value' and attributeValue.lower() in ('true', 'false'):
                    attributeValue = attributeValue.lower()
        return attributeValue
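
Since get_attribute tries the property first, it can help to work out what value this markup should produce before blaming the entities. The sketch below uses only the Python standard library, not Selenium: it decodes the &amp; entities the way an HTML parser would and resolves the result against a page URL, which is roughly what an anchor's href property holds. BASE_URL is a hypothetical stand-in for the elided "xxxx" address in the question.

```python
from html import unescape
from urllib.parse import parse_qs, urljoin, urlsplit

# The href exactly as it is written in the HTML source from the question.
raw_href = "/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0"

# An HTML parser decodes &amp; to & when it builds the attribute value.
decoded = unescape(raw_href)

# Assumption: BASE_URL is a made-up page address used only for illustration.
BASE_URL = "https://example.test/shopping/"

# The href property of an anchor is the attribute resolved against the page URL.
resolved = urljoin(BASE_URL, decoded)

# Split the query string to confirm that no parameter was lost on the way.
params = parse_qs(urlsplit(resolved).query)

print(decoded)
print(resolved)
print(params)
```

All three parameters survive the decoding, so under these assumptions a result that lacks nm=Canon is unlikely to be caused by the entities alone.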

Update B

As you mentioned in your comment, the URL value returned by the method is not present anywhere on the page, which suggests that you are reading the href attribute too early, before the element has finished rendering. There are two possible approaches:

  • Traverse the DOM tree and construct a locator that uniquely identifies the element, then use WebDriverWait with the expected_conditions clause element_to_be_clickable before extracting the href attribute.

  • For debugging purposes, you can add time.sleep(10) to give the element time to render in the HTML DOM, and then try to extract the href attribute.



Source: https://stackoverflow.com/questions/48923396/selenium-webdriver-get-attribute-returns-truncated-value-of-href-attribute-when
