Can I read an iframe through WebClient (I want the outer HTML)?

Submitted by 蓝咒 on 2019-12-13 03:09:56

Question


My program reads a target web page, and somewhere in its body there is an iframe that I want to read.

My HTML source:

<html>
...
<iframe src="http://www.mysite.com" ></iframe>
...
</html>

In my program I have a method that returns the page source as a string:

public static string get_url_source(string url)
{
   using (WebClient client = new WebClient())
   {
       return client.DownloadString(url);
   }
}

My problem is that I also want the source of the iframe when I read the page source, as a browser would load it during normal browsing.

Can I do this only with the WebBrowser class, or is there a way to do it with WebClient or another class?

The real question: how can I get the outer HTML given a URL? Any approach is welcome.


Answer 1:


After getting the source of the page, you can use HtmlAgilityPack to get the URL of the iframe:

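// html is the string returned by get_url_source(url)
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

// SelectSingleNode returns null if the page contains no iframe
var src = doc.DocumentNode.SelectSingleNode("//iframe")
            .Attributes["src"].Value;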

Then make a second call to get_url_source, passing src as the URL.
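One detail the second call has to handle is that an iframe's src attribute may be relative (for example "frame.html" or "/frame.html"), in which case it cannot be passed to WebClient as-is. The following is a minimal sketch of resolving it against the address of the containing page, using only System.Uri. The class name IframeUrl and the method Resolve are hypothetical names chosen for illustration; they are not part of the original answer or of HtmlAgilityPack.

```csharp
using System;

public static class IframeUrl
{
    // Hypothetical helper: turns an iframe's src attribute into an absolute URL,
    // using the address of the page that contained the iframe as the base.
    public static string Resolve(string pageUrl, string src)
    {
        var baseUri = new Uri(pageUrl, UriKind.Absolute);

        // The Uri(Uri, string) constructor keeps an already-absolute src as it is
        // and resolves a relative src against baseUri.
        return new Uri(baseUri, src).AbsoluteUri;
    }
}
```

Under these assumptions, the second download would become get_url_source(IframeUrl.Resolve(url, src)), where url is the address of the outer page.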




Answer 2:


Parse the page source with HTML Agility Pack, then download the source of each iframe:

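List<String> iframeSource = new List<String>();

HtmlDocument doc = new HtmlDocument();
// HtmlDocument.Load expects a file path or stream, so download the page first
doc.LoadHtml(get_url_source(url));

// SelectNodes returns null when nothing matches
var iframes = doc.DocumentNode.SelectNodes("//iframe");
if (iframes != null)
    foreach (HtmlNode node in iframes)
        iframeSource.Add(get_url_source(node.Attributes["src"].Value));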

If you are targeting a single iframe, identify it by its id attribute (or some other distinguishing feature) so that you retrieve only one source:

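String iframeSource = null;

HtmlDocument doc = new HtmlDocument();
// HtmlDocument.Load expects a file path or stream, so download the page first
doc.LoadHtml(get_url_source(url));

// SelectNodes returns null when nothing matches
var iframes = doc.DocumentNode.SelectNodes("//iframe");
if (iframes != null)
{
    foreach (HtmlNode node in iframes)
    {
        // Just an example check; you could use a different approach...
        if (node.GetAttributeValue("id", "") == "targetframe")
            iframeSource = get_url_source(node.Attributes["src"].Value);
    }
}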



Answer 3:


I found the answer after some searching, and this is what I wanted:

webBrowser1.Url = new Uri("http://www.mysite.com/");
// Keep the UI message loop running until the page has finished loading
while (webBrowser1.ReadyState != WebBrowserReadyState.Complete) Application.DoEvents();
string InnerSource = webBrowser1.Document.Body.InnerHtml;
// You can use OuterHtml here too.


Source: https://stackoverflow.com/questions/14429023/can-i-read-iframe-through-webclient-i-want-the-outer-html
