I wrote this method to check if a page exists or not:
protected bool PageExists(string url)
{
try
{
Uri u = new Uri(url);
WebRequest w = WebR
One obvious speedup is to run several requests in parallel - most of the time will be spent on IO, so spawning 10 threads to each check a page will complete the whole iteration around 10 times faster.
I simply used Fredrik Mörk answer above but placed it within a method:
private bool checkURL(string url)
{
bool pageExists = false;
try
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Method = WebRequestMethods.Http.Head;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
pageExists = response.StatusCode == HttpStatusCode.OK;
}
catch (Exception e)
{
//Do what ever you want when its no working...
//Response.Write( e.ToString());
}
return pageExists;
}
.Proxy
property: w.Proxy = null;
Without this, at least 1st request is much slower, at least on my machine.
3. You can not download whole page, but download only header, by setting .Method to "HEAD".
static bool GetCheck(string address)
{
try
{
HttpWebRequest request = WebRequest.Create(address) as HttpWebRequest;
request.Method = "GET";
request.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);
var response = request.GetResponse();
return (response.Headers.Count > 0);
}
catch
{
return false;
}
}
static bool HeadCheck(string address)
{
try
{
HttpWebRequest request = WebRequest.Create(address) as HttpWebRequest;
request.Method = "HEAD";
request.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);
var response = request.GetResponse();
return (response.Headers.Count > 0);
}
catch
{
return false;
}
}
Beware, certain pages (eg. WCF .svc files) may not return anything from a head request. I know because I'm working around this right now.
EDIT - I know there are better ways to check the return data than counting headers, but this is a copy/paste from stuff where this is important to us.
I think your approach is rather good, but would change it into only downloading the headers by adding w.Method = WebRequestMethods.Http.Head;
before calling GetResponse
.
This could do it:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.example.com");
request.Method = WebRequestMethods.Http.Head;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
bool pageExists = response.StatusCode == HttpStatusCode.OK;
You may probably want to check for other status codes as well.