RCurl::url.exists(): how to get a non-error result for redirects (HTTP status codes in the 300 range)
Question

I have a bunch of URLs extracted by text-mining some PDF documents. Now I want to test the URLs for validity. Some URLs have junk characters inside or appended, and others are truncated. One approach is to filter them by requesting each one. To do that, I use the url.exists() function from the RCurl package. The function makes HTTP HEAD requests to the URLs using curl and checks the status code. From the documentation of ?url.exists:

If ‘.header’ is ‘FALSE’, this returns ‘TRUE’ or ‘FALSE’
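
To make the status-code check concrete, here is a minimal sketch of one way to count redirects as valid. It assumes that url.exists(url, .header = TRUE) returns the parsed header as a named character vector with a "status" element, and a plain logical value when no header is available. The helper names status_ok() and url_ok() are hypothetical and not part of RCurl.

```r
# Hypothetical helper: treat any 2xx or 3xx status code as "URL is fine".
status_ok <- function(status) {
  # Junk or missing values become NA instead of raising a warning.
  status <- suppressWarnings(as.integer(status))
  !is.na(status) & status >= 200L & status < 400L
}

# Hypothetical wrapper around RCurl::url.exists() for a single URL.
url_ok <- function(url) {
  header <- RCurl::url.exists(url, .header = TRUE)
  # Assumption: a logical result means no header was parsed
  # (for example, the request failed), so pass it through as TRUE/FALSE.
  if (is.logical(header)) {
    return(isTRUE(header))
  }
  # A missing "status" element yields NA, which status_ok() maps to FALSE.
  status_ok(unname(header["status"]))
}

# Usage (needs the RCurl package and network access):
# url_ok("http://www.example.com")
```

Keeping the status test in its own function makes it easy to change the accepted range, for instance to reject 3xx again, without touching the request code.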