We\'re using a curl HEAD request in a PHP application to verify the validity of generic links. We check the status code just to make sure that the link the user has entered
Proxy would work, but I think there's another way around it. I see that from AWS and other clouds that it's blocked by IP. I can issue the request from my machine and it works just fine.
I did notice that in the response from the cloud service that it returns some JS that the browser has to execute to take you to a login page. Once there, you can login and access the page. The login page is only for those accessing via a blocked IP.
If you use a headless client that executes JS, or maybe go straight to the subsequent link and provide the credentials of a linkedin user, you may be able to bypass it.
It looks like they filter requests based on the user-agent:
$ curl -I --url https://www.linkedin.com/company/linkedin | grep HTTP
HTTP/1.1 999 Request denied
$ curl -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3" -I --url https://www.linkedin.com/company/linkedin | grep HTTP
HTTP/1.1 200 OK
I found the workaround, important to set accept-encoding header:
curl --url "https://www.linkedin.com/in/izman" \
--header "user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36" \
--header "accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" \
--header "accept-encoding:gzip, deflate, sdch, br" \
| gunzip
Seems like LinkedIn filter both user agent AND ip address. I tried this both at home and from an Digital Ocean node:
curl -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3" -I --url https://www.linkedin.com/company/linkedin
From home I got a 200 OK, from DO I got 999 Denied...
So you need a proxy service like HideMyAss or other (haven't tested it so I couldn't say if it's valid or not). Here is a good comparison of proxy services.
Or you could setup a proxy on your home network, for example use a Raspberry PI to proxy your requests. Here is a guide on that.