Question
I am trying to get the href attribute value from an anchor tag on a page in my application using Selenium WebDriver (Python), and the value returned has part of the URL stripped off.
Here is the HTML snippet -
<a class="nla-row-text" href="/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0" data-reactid="790">
Here is the code I am using -
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
driver = webdriver.Firefox()
driver.get("xxxx")
url_from_attr = driver.find_element(By.XPATH,"(//div[@class='nla-children mfr']/div/div/a)[1]").get_attribute("href")
url_from_attr_raw = "%r"%url_from_attr
print(" URL from attribute -->> " + url_from_attr)
print(" Raw string -->> " + url_from_attr_raw)
The output I am getting is -
/shopping/brands?search=kamera&page=0
instead of -
/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0 OR
/shopping/brands?search=kamera&nm=Canon&page=0
Is this because of the entity representation in the URL? The part between the two entities is what gets stripped. Any help or pointers would be appreciated.
Answer 1:
As per the given HTML, there is an issue with the Locator Strategy you have tried. You have used a positional index [1] along with find_element. The index is valid XPath, but it ties the locator to document order, which makes it brittle. If you need to pick an element by position, it is clearer to index into the list returned by find_elements. In this use case a more specific expression would be:
url_from_attr = driver.find_element(By.XPATH,"//div[@class='nla-children mfr']/div/div/a[@class='nla-row-text']").get_attribute("href")
The locator can be simplified further as follows:
url_from_attr = driver.find_element(By.XPATH,"//div[@class='nla-children mfr']//a[@class='nla-row-text']").get_attribute("href")
Update A
As per your comment, since you still need to use indexing, call find_elements and index into the returned list (Python lists are zero-based, so [0] is the first match):
url_from_attr = driver.find_elements(By.XPATH,"//div[@class='nla-children mfr']//a[@class='nla-row-text']")[0].get_attribute("href")
get_attribute(attribute_name)
As per the Python API source:
def get_attribute(self, name):
    """Gets the given attribute or property of the element.

    This method will first try to return the value of a property with the
    given name. If a property with that name doesn't exist, it returns the
    value of the attribute with the same name. If there's no attribute with
    that name, ``None`` is returned.

    Values which are considered truthy, that is equals "true" or "false",
    are returned as booleans. All other non-``None`` values are returned
    as strings. For attributes or properties which do not exist, ``None``
    is returned.

    :Args:
        - name - Name of the attribute/property to retrieve.

    Example::

        # Check if the "active" CSS class is applied to an element.
        is_active = "active" in target_element.get_attribute("class")
    """
    attributeValue = ''
    if self._w3c:
        attributeValue = self.parent.execute_script(
            "return (%s).apply(null, arguments);" % getAttribute_js,
            self, name)
    else:
        resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
        attributeValue = resp.get('value')
        if attributeValue is not None:
            if name != 'value' and attributeValue.lower() in ('true', 'false'):
                attributeValue = attributeValue.lower()
    return attributeValue
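On the entity question from the original post: an HTML parser decodes &amp; to & before an attribute value is exposed, so the entity encoding on its own would not be expected to drop the nm=Canon parameter. The following is a minimal offline sketch of that decoding step using the standard library's html.parser rather than Selenium. The class name HrefCollector is hypothetical, and the snippet assumes the page source carries the href with &amp; entities.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; character references
        # in the values have already been decoded by the parser.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


# Assumed form of the markup as it appears in the page source.
snippet = (
    '<a class="nla-row-text" '
    'href="/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0" '
    'data-reactid="790">'
)

collector = HrefCollector()
collector.feed(snippet)
collector.close()

decoded_href = collector.hrefs[0]
print(decoded_href)
```

If the decoded value still contains nm=Canon here but not in the browser, the difference is more likely to come from the page changing at runtime than from the entities themselves.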
Update B
As you mentioned in your comment, the URL value returned by the method is not present anywhere on the page, which suggests that you are trying to access the href attribute too early. There are two possible approaches:
Traverse the DOM tree, construct a locator that uniquely identifies the element, induce WebDriverWait with expected_conditions set to element_to_be_clickable, and then extract the href attribute.
For debugging purposes, you can add time.sleep(10) so that the element has time to render properly in the HTML DOM, and then try to extract the href attribute.
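The difference between an explicit wait and a fixed sleep can be sketched without a browser as a plain polling loop. This is not Selenium's WebDriverWait, only an illustration of the same retry-until-timeout idea. The helper name wait_for_value and the simulated reads are hypothetical; with a real driver, read_value could be something like lambda: element.get_attribute("href").

```python
import time


def wait_for_value(read_value, is_ready, timeout=10.0, poll_interval=0.5):
    """Poll read_value() until is_ready(value) is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        value = read_value()
        if is_ready(value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("value not ready after %.1f s: %r" % (timeout, value))
        time.sleep(poll_interval)


# Simulated element: the href is missing on the first two reads and
# available on the third (a stand-in for a page that is still rendering).
simulated_reads = iter(
    [None, None, "/shopping/brands?search=kamera&nm=Canon&page=0"]
)

href = wait_for_value(
    read_value=lambda: next(simulated_reads),
    is_ready=lambda value: value is not None,
    timeout=2.0,
    poll_interval=0.01,
)
print(href)
```

Unlike time.sleep(10), the loop returns as soon as the condition holds and raises an error if it never does, which is why an explicit wait is usually preferable outside of quick debugging.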
Source: https://stackoverflow.com/questions/48923396/selenium-webdriver-get-attribute-returns-truncated-value-of-href-attribute-when