How to get a specific frame in a web page and retrieve its content

别等时光非礼了梦想. Submitted on 2019-12-21 06:25:58

Question


I want to access the translation results at the following URL:

http://translate.google.com/translate?hl=en&sl=en&tl=ar&u=http%3A%2F%2Fwww.saltycrane.com%2Fblog%2F2008%2F10%2Fhow-escape-percent-encode-url-python%2F

The page has two frames, and the translation is displayed in the bottom content frame. I am only interested in retrieving that bottom frame to get the translation.

Selenium for Python lets us fetch page contents via web automation:

browser.get('http://translate.google.com/#en/ar/'+hurl)
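The `hurl` variable in the snippet above is not defined in the question. A minimal sketch of building the full translate URL shown earlier with the standard library, assuming the page to translate is passed percent-encoded in the `u` query parameter (the helper name `build_translate_url` and the parameter comments are illustrative assumptions):

```python
from urllib.parse import urlencode

TRANSLATE_ENDPOINT = "http://translate.google.com/translate"


def build_translate_url(page_url, source_lang="en", target_lang="ar"):
    """Return the translate URL for page_url, percent-encoding it into `u`."""
    params = {
        "hl": "en",         # presumably the interface language, as in the URL above
        "sl": source_lang,  # presumably the source language
        "tl": target_lang,  # presumably the target language
        "u": page_url,      # urlencode() percent-encodes ':' and '/' here
    }
    return TRANSLATE_ENDPOINT + "?" + urlencode(params)


page = "http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/"
url = build_translate_url(page)
print(url)
```

The resulting string can then be passed to `browser.get(url)`.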

The required frame is an iframe:

<div id="contentframe" style="top:160px"><iframe src="/translate_p?hl=en&am... name=c frameborder="0" style="height:100%;width:100%;position:absolute;top:0px;bottom:0px;"></iframe></div>

But how do I get the bottom content frame element and retrieve the translations using web automation?

I also learned that PyQuery lets us browse page contents using jQuery-style syntax.

Update:

An answer mentioned that Selenium provides a method for this:

from selenium.webdriver.common.by import By

frame = browser.find_element(By.TAG_NAME, 'iframe')
browser.switch_to.frame(frame)
# get page source
browser.page_source

But it does not work in the above example; it returns an empty page.


Answer 1:


You can use driver.switchTo().frame(1); here, the 1 passed to frame() is the index of the frame within the web page. Since you need to switch to the second frame and indexing starts at 0, you should use driver.switchTo().frame(1);

The code above is Java. In Python, you can use the line below.

driver.switch_to.frame(1)

UPDATE

 driver.get("http://translate.google.com/translate?hl=en&sl=en&tl=ar&u=http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/");
 driver.switchTo().frame(0);
 System.out.println(driver.findElement(By.xpath("/html/body/div/div/div[3]/h1/span/a")).getText());

Output: SaltyCrane ???????

I just tried to print the title, SaltyCrane, which is inside the iframe. It worked for me except for the ? symbols after SaltyCrane: the rest of the title is Arabic, and those characters were not decoded correctly in the output.
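One plausible explanation for the `?` symbols, sketched here in Python rather than Java: when text is written through an encoding that cannot represent Arabic letters, each character that cannot be encoded is replaced with `?`. The Arabic word below is a made-up stand-in, not the actual page title, and ASCII is only an assumed example of such an encoding.

```python
# Hypothetical title: "SaltyCrane " followed by a five-letter Arabic word.
title = "SaltyCrane \u0645\u062f\u0648\u0646\u0629"

# Encode with a charset that has no Arabic letters, replacing what cannot be encoded.
as_ascii = title.encode("ascii", errors="replace").decode("ascii")
print(as_ascii)

# UTF-8 can represent every character, so the text survives a round trip.
round_trip = title.encode("utf-8").decode("utf-8")
print(round_trip == title)
```

If this is the cause, the text inside the frame was retrieved correctly and only the way it was printed lost the Arabic characters.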

The code above is Java; the same logic should also work in Python.
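A rough Python counterpart of the Java snippet, as a sketch only. The helper name `text_in_frame` is hypothetical, and `driver` is assumed to be an already started Selenium WebDriver that has loaded the page. The locator strategy is passed as the plain string `"xpath"`, which is the value of Selenium's `By.XPATH` constant.

```python
def text_in_frame(driver, frame_ref, xpath):
    """Switch into a frame, return the text of the element at xpath, switch back.

    frame_ref may be a 0-based frame index, a frame name/id, or a frame element.
    """
    driver.switch_to.frame(frame_ref)
    try:
        # "xpath" is the string value of selenium's By.XPATH
        element = driver.find_element("xpath", xpath)
        return element.text
    finally:
        # Always return to the top-level document, even if the lookup fails.
        driver.switch_to.default_content()
```

With a live driver, `text_in_frame(driver, 0, "/html/body/div/div/div[3]/h1/span/a")` would mirror the Java call above.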




Answer 2:


Selenium provides a method for this.

from selenium.webdriver.common.by import By

frame = browser.find_element(By.TAG_NAME, 'iframe')
browser.switch_to.frame(frame)
# get page source
browser.page_source
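When a page contains more than one iframe, it can help to look at the source of each one to see which frame actually holds the content. A minimal sketch under that assumption: `iframe_sources` is a hypothetical helper, `driver` is assumed to be a running Selenium WebDriver, and `"tag name"` is the string value of Selenium's `By.TAG_NAME` constant.

```python
def iframe_sources(driver):
    """Return the page source of every top-level iframe, in document order."""
    # "tag name" is the string value of selenium's By.TAG_NAME
    frames = driver.find_elements("tag name", "iframe")
    sources = []
    for frame in frames:
        driver.switch_to.frame(frame)
        try:
            sources.append(driver.page_source)
        finally:
            # Go back to the top-level document before visiting the next frame.
            driver.switch_to.default_content()
    return sources
```

Comparing the lengths of the returned strings is one quick way to spot an empty frame.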


Source: https://stackoverflow.com/questions/15785920/how-to-get-a-specific-frame-in-a-web-page-and-retrieve-its-content
