Before writing this question, I was trying to solve the following:
// 1. navigate to page
// 2. wait until page is downloaded
// 3. read and write some d
Unlike Thorsten I didn't have to use ShDocVw, but what did make the difference for me was adding the loop checking ReadyState and using Application.DoEvents() while not ready. Here is my code:
this.webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(WebBrowser_DocumentCompleted);
foreach (var item in this.urlList) // This is a Dictionary<string, string>
{
this.webBrowser.Navigate(item.Value);
while (this.webBrowser.ReadyState != WebBrowserReadyState.Complete)
{
Application.DoEvents();
}
}
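One caveat with a loop like the one above: if a page never reaches Complete, it spins forever. Here is a minimal sketch of the same wait with a time limit. The names BoundedWait and WaitUntil are hypothetical (they are not part of WinForms), and the readiness check and message pump are passed in as delegates so the helper can be tried without a browser.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of WinForms or the code above.
static class BoundedWait
{
    // Polls isReady until it returns true or the timeout elapses.
    // pumpMessages is called between polls (e.g. Application.DoEvents).
    // Returns true if ready, false if the wait timed out.
    public static bool WaitUntil(Func<bool> isReady, Action pumpMessages, TimeSpan timeout)
    {
        var clock = Stopwatch.StartNew();
        while (!isReady())
        {
            if (clock.Elapsed >= timeout)
            {
                return false; // gave up; the caller decides what to do next
            }
            pumpMessages();
        }
        return true;
    }
}
```

Inside the foreach this could be called as BoundedWait.WaitUntil(() => this.webBrowser.ReadyState == WebBrowserReadyState.Complete, Application.DoEvents, TimeSpan.FromSeconds(30)); the 30-second limit is an arbitrary choice on my part.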
And I used Yuki's solution for checking the results of WebBrowser_DocumentCompleted, though with the last if/else swapped per user's comment:
private void WebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
string url = e.Url.ToString();
var browser = (WebBrowser)sender;
if (!(url.StartsWith("http://") || url.StartsWith("https://")))
{
// in AJAX
}
if (e.Url.AbsolutePath != this.webBrowser.Url.AbsolutePath)
{
// IFRAME
}
else
{
// REAL DOCUMENT COMPLETE
// Put my code here
}
}
Worked like a charm :)
I had to do something similar. I use ShDocVw directly (adding references to all the necessary interop assemblies to my project), and instead of the WebBrowser control I add the AXShDocVw.AxWebBrowser control to my form.
To navigate and wait, I use the following method:
private void GotoUrlAndWait(AxWebBrowser wb, string url)
{
object dummy = null;
wb.Navigate(url, ref dummy, ref dummy, ref dummy, ref dummy);
// Wait for the control to be initialized and ready.
while (wb.ReadyState != SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE)
Application.DoEvents();
}
You might want to detect the AJAX calls as well.
Consider using this:
private void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
string url = e.Url.ToString();
if (!(url.StartsWith("http://") || url.StartsWith("https://")))
{
// in AJAX
}
if (e.Url.AbsolutePath != this.webBrowser.Url.AbsolutePath)
{
// IFRAME
}
else
{
// REAL DOCUMENT COMPLETE
}
}
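The checks in this handler can also be pulled out into a small function so they can be exercised without a browser. This is only a sketch: CompletionKind and CompletionClassifier are hypothetical names, the scheme test is my equivalent of the StartsWith checks, and unlike the handler above it returns as soon as the non-HTTP case matches.

```csharp
using System;

// Hypothetical names for illustration; not part of the WebBrowser API.
enum CompletionKind { NonHttp, Frame, MainDocument }

static class CompletionClassifier
{
    // eventUrl:   e.Url from DocumentCompleted
    // browserUrl: webBrowser.Url at the time the event fires
    public static CompletionKind Classify(Uri eventUrl, Uri browserUrl)
    {
        // Anything that is not http/https (about:, javascript:, ...) is
        // treated as the "in AJAX" case from the handler above.
        if (eventUrl.Scheme != Uri.UriSchemeHttp && eventUrl.Scheme != Uri.UriSchemeHttps)
        {
            return CompletionKind.NonHttp;
        }

        // A different path than the top-level page means a frame completed.
        if (eventUrl.AbsolutePath != browserUrl.AbsolutePath)
        {
            return CompletionKind.Frame;
        }

        return CompletionKind.MainDocument;
    }
}
```

In the handler this would be called as CompletionClassifier.Classify(e.Url, this.webBrowser.Url). Note that, like the original, it compares only AbsolutePath, so a frame on another host with the same path as the main page would be reported as the main document.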
Just thought to drop a line or two here about a small improvement which works in conjunction with the code of FeiBao. The idea is to inject a landmark (javascript) variable in the webpage and use that to detect which of the subsequent DocumentComplete events is the real deal. I doubt it's bulletproof but it has worked more reliably in general than the approach that lacks it. Any comments welcome. Here is the boilerplate code:
void WebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
string url = e.Url.ToString();
var browser = (WebBrowser)sender;
if (!(url.StartsWith("http://") || url.StartsWith("https://")))
{
// in AJAX
}
if (e.Url.AbsolutePath != this.webBrowser.Url.AbsolutePath)
{
// IFRAME
}
else if (browser.Document != null && (bool)browser.Document.InvokeScript("eval", new object[] { @"typeof window.YourLandMarkJavascriptVariableHere === 'undefined'" }))
{
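// Set the landmark so later DocumentCompleted events for this page are ignored.
// ("var window.X = ..." is a JavaScript syntax error, so assign the property directly.)
((IHTMLWindow2)browser.Document.Window.DomWindow).execScript("window.YourLandMarkJavascriptVariableHere = true;");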
// REAL DOCUMENT COMPLETE
// Put my code here
}
}
I have yet to find a working solution to this problem online. Hopefully this will make it to the top and save everyone the months of tweaking I spent trying to solve it and its edge cases. I have fought with this issue over the years as Microsoft has changed the implementation and reliability of isBusy and document.readystate. With IE8, I had to resort to the following solution. It's similar to the question/answer from Margus, with a few exceptions. My code handles nested frames, JavaScript/AJAX requests, and meta-redirects. I have simplified the code for clarity's sake, but I also use a timeout function (not included) to reset the webpage if domAccess still equals false after 5 minutes.
private void m_WebBrowser_BeforeNavigate(object pDisp, ref object URL, ref object Flags, ref object TargetFrameName, ref object PostData, ref object Headers, ref bool Cancel)
{
//Javascript Events Trigger a Before Navigate Twice, but the first event
//will contain javascript: in the URL so we can ignore it.
if (!URL.ToString().ToUpper().StartsWith("JAVASCRIPT:"))
{
//indicate the dom is not available
this.domAccess = false;
this.activeRequests.Add(URL);
}
}
private void m_WebBrowser_DocumentComplete(object pDisp, ref object URL)
{
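//Guard against an empty list: RemoveAt(0) throws if the counts are out of sync
if (this.activeRequests.Count > 0) this.activeRequests.RemoveAt(0);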
//if pDisp Matches the main activex instance then we are done.
if (pDisp.Equals((SHDocVw.WebBrowser)m_WebBrowser.ActiveXInstance))
{
//Top Window has finished rendering
//Since it will always render last, clear the active requests.
//This solves Meta Redirects causing out of sync request counts
this.activeRequests.Clear();
}
else if (m_WebBrowser.Document != null)
{
//Some iframe completed dom render
}
//Record the final complete URL for reference
if (this.activeRequests.Count == 0)
{
//Finished downloading page - dom access ready
this.domAccess = true;
}
}
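For anyone who wants to try the bookkeeping separately from the COM events, here is a rough sketch of the same request counting as a plain class. NavigationTracker and its method names are hypothetical, and the isTopLevel flag stands in for the pDisp comparison in the handler above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class that mirrors the activeRequests/domAccess logic above.
sealed class NavigationTracker
{
    private readonly List<string> activeRequests = new List<string>();

    // True once no requests are outstanding and the DOM can be used.
    public bool DomAccess { get; private set; }

    // Call from BeforeNavigate with the target URL.
    public void OnBeforeNavigate(string url)
    {
        // javascript: navigations fire BeforeNavigate too; ignore them.
        if (url.StartsWith("javascript:", StringComparison.OrdinalIgnoreCase))
        {
            return;
        }
        DomAccess = false;
        activeRequests.Add(url);
    }

    // Call from DocumentComplete; isTopLevel is the result of the pDisp check.
    public void OnDocumentComplete(bool isTopLevel)
    {
        if (isTopLevel)
        {
            // The top window renders last, so drop anything left over
            // (this is what absorbs meta-redirect miscounts).
            activeRequests.Clear();
        }
        else if (activeRequests.Count > 0)
        {
            activeRequests.RemoveAt(0);
        }

        if (activeRequests.Count == 0)
        {
            DomAccess = true;
        }
    }
}
```

The two event handlers would then just forward to OnBeforeNavigate(URL.ToString()) and OnDocumentComplete(...) and read DomAccess before touching the document.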