I'm making a request to a remote web server that is currently offline (on purpose).
I'd like to figure out the best way to time out the request. Basically if the
You could use a standard HttpWebRequest to fetch the remote resource and set its Timeout property. If the request succeeds, feed the resulting HTML to HTML Agility Pack for parsing.
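A minimal sketch of that idea, assuming a hypothetical helper named TryFetchHtml (the class and method names are illustrative, not part of any library). It only reports success or failure, so the caller can skip parsing when the server does not answer in time:

```csharp
using System;
using System.IO;
using System.Net;

public static class TimedFetch
{
    // Hypothetical helper: returns true and the page body on success,
    // false if the request times out or fails with any other WebException.
    public static bool TryFetchHtml(string url, int timeoutMs, out string html)
    {
        html = null;
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.Timeout = timeoutMs;          // limit for GetResponse(), in milliseconds
            request.ReadWriteTimeout = timeoutMs; // limit for reads on the response stream

            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream stream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(stream))
            {
                html = reader.ReadToEnd();
                return true;
            }
        }
        catch (WebException)
        {
            // Covers timeouts, refused connections, DNS failures and HTTP error statuses.
            return false;
        }
    }
}
```

When TryFetchHtml returns true, the html string can be handed to HtmlDocument.LoadHtml; when it returns false, there is nothing to parse.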
Html Agility Pack is open source, so you can modify the source yourself. First, add this code to the HtmlWeb class:
private int _timeout = 20000;
public int Timeout
{
get { return _timeout; }
set
{
if (value < 1) // validate the incoming value, not the current field
throw new ArgumentException("Timeout must be greater than zero.");
_timeout = value;
}
}
Then find this method
private HttpStatusCode Get(Uri uri, string method, string path, HtmlDocument doc, IWebProxy proxy, ICredentials creds)
and modify it:
req = WebRequest.Create(uri) as HttpWebRequest;
req.Method = method;
req.UserAgent = UserAgent;
req.Timeout = Timeout; //add this
Or, without modifying the library, set the timeout through the PreRequest hook:
htmlWeb.PreRequest = request =>
{
request.Timeout = 15000;
return true;
};
Retrieve the web page at your URL with this method:
private static string retrieveData(string url)
{
// used to build entire input
StringBuilder sb = new StringBuilder();
// used on each read operation
byte[] buf = new byte[8192];
// prepare the web page we will be asking for
HttpWebRequest request = (HttpWebRequest)
WebRequest.Create(url);
request.Timeout = 10; // in milliseconds; 10 ms is very short, use e.g. 10000 for 10 seconds
// execute the request
HttpWebResponse response = (HttpWebResponse)
request.GetResponse();
// we will read data via the response stream
Stream resStream = response.GetResponseStream();
string tempString = null;
int count = 0;
do
{
// fill the buffer with data
count = resStream.Read(buf, 0, buf.Length);
// make sure we read some data
if (count != 0)
{
// translate from bytes to ASCII text
tempString = Encoding.ASCII.GetString(buf, 0, count);
// continue building the string
sb.Append(tempString);
}
}
while (count > 0); // any more data to read?
return sb.ToString();
}
Then use HTML Agility Pack to retrieve the HTML element like this:
public static string htmlRetrieveInfo()
{
string htmlSource = retrieveData("http://example.com/test.html");
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(htmlSource);
// look the node up once; SelectSingleNode returns null when nothing matches
HtmlNode node = doc.DocumentNode.SelectSingleNode("//body");
return node != null ? node.InnerHtml : null;
}
I had to make a small adjustment to my originally posted code:
public JsonpResult About(string HomePageUrl)
{
Models.Pocos.About about = null;
// ************* CHANGE HERE - added "timeout in milliseconds" to RemoteFileExists extension method.
if (HomePageUrl.RemoteFileExists(1000))
{
// Using the Html Agility Pack, we want to extract only the
// appropriate data from the remote page.
HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load(HomePageUrl);
HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='wrapper1-border']");
if (node != null)
{
about = new Models.Pocos.About { html = node.InnerHtml };
}
//todo: look into whether this else statement is necessary
else
{
about = null;
}
}
return this.Jsonp(about);
}
Then I modified my RemoteFileExists extension method to take a timeout:
public static bool RemoteFileExists(this string url, int timeout)
{
try
{
//Creating the HttpWebRequest
HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
// ************ ADDED HERE
// timeout the request after x milliseconds
request.Timeout = timeout;
// ************
//Set the request method to HEAD; GET also works.
request.Method = "HEAD";
//Get the web response and dispose it so the connection is released.
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
//Returns TRUE if the Status code == 200
return (response.StatusCode == HttpStatusCode.OK);
}
}
catch
{
//Any exception (including a timeout) returns false.
return false;
}
}
With this approach, if the timeout fires before RemoteFileExists receives the response headers, the method returns false.
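Because the catch block swallows every exception, a timeout looks the same as any other failure. If that distinction matters, here is a rough sketch of a variant that reports the reason. The ProbeResult enum and the Probe method are hypothetical names introduced for illustration; the check itself relies on WebException.Status being WebExceptionStatus.Timeout when the request times out:

```csharp
using System;
using System.Net;

// Hypothetical result type, not part of the original code.
public enum ProbeResult { Ok, NotOk, TimedOut, Failed }

public static class RemoteProbe
{
    // Hypothetical variant of RemoteFileExists that says why the check failed.
    public static ProbeResult Probe(string url, int timeoutMs)
    {
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = "HEAD";
            request.Timeout = timeoutMs; // milliseconds

            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode == HttpStatusCode.OK
                    ? ProbeResult.Ok
                    : ProbeResult.NotOk;
            }
        }
        catch (WebException ex)
        {
            // A timeout is reported through the exception's Status property.
            return ex.Status == WebExceptionStatus.Timeout
                ? ProbeResult.TimedOut
                : ProbeResult.Failed;
        }
    }
}
```

The controller could then treat TimedOut differently from Failed, for example by logging it or retrying with a longer timeout.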