Question
I'm trying to parse currencies from this bank website. In code:
import requests
import time
import logging
from retrying import retry
from lxml import html

logging.basicConfig(filename='info.log', format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')


@retry(wait_fixed=5000)
def fetch_data_from_nb_ved_ru():
    try:
        page = requests.get('http://www.nbu.com/exchange_rates')
        #print page.text
        tree = (html.fromstring(page.text))
        #fetched_ved_usd_buy = tree.xpath('//div[@class="exchangeRates"]/table/tbody/tr[5]/td[5]')
        fetched_ved_usd_buy = tree.xpath('/html/body/div[1]/div//div[7]/div/div/div[1]//text()')
        print fetched_ved_usd_buy
        fetched_ved_usd_sell = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[6]/td[6]/text()')).strip()
        fetched_ved_eur_buy = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[5]/text()')).strip()
        fetched_ved_eur_sell = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[6]/text()')).strip()
        fetched_cb_eur = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[7]/td[4]/text()')).strip()
        fetched_cb_rub = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[18]/td[4]/text()')).strip()
        fetched_cb_usd = str(tree.xpath('/html/body/div[1]/div/div[7]/div/div/div[1]/table/tbody/tr[6]/td[4]/text()')).strip()
    except:
        logging.warning("NB VED UZ fetch failed")
        raise IOError("NB VED UZ fetch failed")
    return fetched_ved_usd_buy, fetched_ved_usd_sell, fetched_cb_usd, fetched_ved_eur_buy, fetched_ved_eur_sell,\
        fetched_cb_eur, fetched_cb_rub


while True:
    f = open('values_uzb.txt', 'w')
    ved_usd_buy, ved_usd_sell, cb_usd, ved_eur_buy, ed_eur_sell, cb_eur, cb_rub = fetch_data_from_nb_ved_ru()
    f.write(str(ved_usd_buy)+'\n'+str(ved_usd_sell)+'\n'+str(cb_usd)+'\n'+str(ved_eur_buy)+'\n'+str(ed_eur_sell)+'\n'
            + str(cb_eur)+'\n'+str(cb_rub))
    f.close()
    time.sleep(120)
But it always returns an empty string. However, if I do print page.text, I can see that the values are in their places.
I got that XPath from Firebug. Chrome gives the same XPath.
I also tried to construct my own XPath:
//div[@class="exchangeRates"]/table/tbody/tr[5]/td[5]
but that does not work either.
Any suggestions? Thanks.
Answer 1:
I am not certain what you are looking for exactly, but this works:
tree.xpath("/html/body/div[1]/div[7]/div/div/div[1]//text()")
As for starting with the class exchangeRates, I found by using tree.xpath("//div[@class='exchangeRates']/table")[0].getchildren() that there is no tbody child of table, even though browsers say there is. Browsers insert tbody when they build the DOM, whereas lxml parses the raw HTML as served. See this SO question for an explanation. Removing tbody from your original XPath does work. However, the cell you chose (td[5]) is empty, so the query returns []. Try
tree.xpath("//div[@class='exchangeRates']/table/tr[5]/td[4]//text()")
# ['706.65']
or
tree.xpath("//div[@class='exchangeRates']/table/tr[6]/td[5]//text()")
# ['2638.00']
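The tbody point can be checked in isolation. Below is a minimal sketch using a made-up table rather than the real bank page (the SAMPLE markup, its row layout, and the EUR value are assumptions for illustration only). It shows that the same query matches nothing with tbody in the path and matches the cell once tbody is dropped.

```python
from lxml import html

# Hypothetical markup standing in for the bank page; the real page differs.
SAMPLE = """
<html><body>
<div class="exchangeRates">
<table>
<tr><td>USD</td><td>2638.00</td></tr>
<tr><td>EUR</td><td>2950.10</td></tr>
</table>
</div>
</body></html>
"""

tree = html.fromstring(SAMPLE)
table = tree.xpath("//div[@class='exchangeRates']/table")[0]

# The table's direct children are <tr> elements: no <tbody> was inserted.
child_tags = [child.tag for child in table]

# Same cell, queried with and without the tbody step.
with_tbody = tree.xpath("//div[@class='exchangeRates']/table/tbody/tr[1]/td[2]/text()")
without_tbody = tree.xpath("//div[@class='exchangeRates']/table/tr[1]/td[2]/text()")

print(child_tags)
print(with_tbody)
print(without_tbody)
```

If the served HTML does contain an explicit tbody, a path such as table//tr matches in both cases.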
Answer 2:
Try this XPath, replacing NUMBER OF TR with the index of the row you need:
tree.xpath('//div[@class="exchangeRates"]//tr[NUMBER OF TR]/td[5]/text()')
Another thing: I think your code would be more robust if you looked rows up by currency code instead of by position:
trs = tree.xpath('//div[@class="exchangeRates"]//tr')
for tr in trs:
    # xpath() returns a list of text nodes, so join them before stripping
    currency_code = ''.join(tr.xpath('./td[7]/text()')).strip()
    if currency_code == 'USD':
        usd_buy = ''.join(tr.xpath('./td[5]/text()')).strip()
        usd_sell = ''.join(tr.xpath('./td[6]/text()')).strip()
        usd_cb = ''.join(tr.xpath('./td[4]/text()')).strip()
Continue in the same way for the other currencies you need.
This is only a quick sketch; if you need more details, please reply.
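The row-by-row idea can be generalized to several currencies at once. The following is a rough sketch against a made-up table, not the real page: the SAMPLE markup, the column order (code, central bank rate, buy, sell), all the rate values, and the helper names cell_text and rates_by_currency are assumptions for illustration. On the real page the column indices would have to be adjusted.

```python
from lxml import html

# Made-up table: columns are code, central bank rate, buy, sell.
SAMPLE = """
<html><body>
<div class="exchangeRates">
<table>
<tr><th>Code</th><th>CB</th><th>Buy</th><th>Sell</th></tr>
<tr><td> USD </td><td>2600.00</td><td>2638.00</td><td>2650.00</td></tr>
<tr><td>EUR</td><td>2900.00</td><td>2950.00</td><td>2990.00</td></tr>
<tr><td>RUB</td><td>40.00</td><td></td><td></td></tr>
</table>
</div>
</body></html>
"""


def cell_text(row, index):
    """Return the stripped text of the index-th <td> (1-based), or '' if absent."""
    # xpath() returns a list of text nodes; join them before stripping.
    return "".join(row.xpath("./td[%d]//text()" % index)).strip()


def rates_by_currency(tree, wanted=("USD", "EUR")):
    """Collect rates for the wanted currency codes, keyed by code."""
    rates = {}
    for row in tree.xpath('//div[@class="exchangeRates"]//tr'):
        code = cell_text(row, 1)
        if code in wanted:
            rates[code] = {
                "cb": cell_text(row, 2),
                "buy": cell_text(row, 3),
                "sell": cell_text(row, 4),
            }
    return rates


rates = rates_by_currency(html.fromstring(SAMPLE))
print(rates)
```

Looking rows up by code means the result does not change if the bank adds or reorders rows, which positional paths such as tr[6] cannot guarantee.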
Answer 3:
I use the following statement, and it works fine for me. Note that it uses Selenium rather than lxml:
ActualValue = driver.find_element_by_xpath("//div/div[2]/div").text
Source: https://stackoverflow.com/questions/32250557/cant-get-text-values-using-xpath-in-python