CURLOPT_FOLLOWLOCATION not working

I'm trying to scrape the data at this link: http://www.treasurydirect.gov/NP/BPDLogin?application=np

which contains a meta refresh.

I'm using curl_exec with CU

2 Answers

故里飘歌

2021-01-21 05:42

A meta refresh is an instruction to the browser embedded in the HTML. Curl does not parse HTML, so it never acts on it. CURLOPT_FOLLOWLOCATION only follows HTTP redirects, that is, 3xx responses that carry a Location header.
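If the meta refresh does need to be followed, the script has to do it by hand: fetch the page, pull the target out of the tag, and request that URL in a second call. A minimal sketch of the extraction step is below. `extractMetaRefreshUrl` is a hypothetical helper name, not part of cURL or PHP, and the regular expressions only assume the common `content="0; url=..."` shape; relative targets would still need to be resolved against the page URL.

```php
<?php
// Hypothetical helper (not a cURL or PHP built-in): returns the target URL of a
// <meta http-equiv="refresh"> tag found in $html, or null if there is none.
function extractMetaRefreshUrl(string $html): ?string
{
    // Collect every <meta ...> tag; attribute order varies between pages.
    if (!preg_match_all('/<meta\b[^>]*>/i', $html, $tags)) {
        return null;
    }
    foreach ($tags[0] as $tag) {
        // Only tags declaring http-equiv="refresh" are of interest.
        if (!preg_match('/http-equiv\s*=\s*["\']?refresh\b/i', $tag)) {
            continue;
        }
        // The content attribute is assumed to look like "0; url=/next/page".
        if (preg_match('/url\s*=\s*["\']?([^"\'\s>]+)/i', $tag, $m)) {
            // Decode entities such as &amp; that may appear inside the attribute.
            return html_entity_decode($m[1]);
        }
    }
    return null;
}
```

The returned value can then be passed to a second curl_exec call on a fresh or reused handle.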

自闭症患者

2021-01-21 05:44
The problem is not the meta refresh tag (which, by the way, the CURLOPT_FOLLOWLOCATION option will never follow) but the HTTP User-Agent header. The website checks the User-Agent header against a list of accepted user agents. You can work around this by adding the following line when setting options on $ch:
```
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
```
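For context, here is a rough sketch of how that option might sit alongside the others in a complete request. The question's own code is truncated, so this is not a reconstruction of it: `fetchPage` is a hypothetical helper name, and the redirect limit of 5 is an arbitrary assumption.

```php
<?php
// Hypothetical helper, not from the original post: fetch a page while sending
// a caller-supplied User-Agent and following HTTP (3xx) redirects.
function fetchPage(string $url, string $userAgent): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Could not initialise cURL');
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow Location headers only
        CURLOPT_MAXREDIRS      => 5,          // assumed limit to avoid redirect loops
        CURLOPT_USERAGENT      => $userAgent, // the header the site is said to check
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}
```

Calling `fetchPage($url, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)")` with the URL from the question would send the same user agent string as the line above.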