cURL a domain without http://www

耶瑟儿~ 2021-01-22 03:44

Hi, I have a domain I'd like to parse with cURL, and here is the case:

When I go to http://register.metsad.ee/avalik/info_teatis.php?too_id=2942704201

it r

1 answer
  • 2021-01-22 04:05

    There is some kind of protection on the register.metsad.ee side: the server returns an empty response (200 OK with Content-Length: 0) unless a User-Agent header is set.

    Failed call (empty response):

    feedbee@server:~$ telnet register.metsad.ee 80
    Trying 213.184.43.115...
    Connected to register.metsad.ee.
    Escape character is '^]'.
    GET /avalik/info_teatis.php?too_id=2942704201 HTTP/1.1
    Host: register.metsad.ee
    
    HTTP/1.1 200 OK
    Date: Thu, 13 Dec 2012 20:07:11 GMT
    Server: Apache
    Content-Length: 0
    Content-Type: text/html; charset=UTF-8
    

    Successful call (HTML page returned):

    feedbee@server:~$ telnet register.metsad.ee 80
    GET http://register.metsad.ee/avalik/info_teatis.php?too_id=2942704201 HTTP/1.1
    Host: register.metsad.ee
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0
    
    HTTP/1.1 200 OK
    Date: Thu, 13 Dec 2012 20:13:07 GMT
    Server: Apache
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Set-Cookie: SNS=a0e425c2aec17c38be3716b366f75749; path=/
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=UTF-8
    
    762
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    ...
    

    So you need to add the following line to your cURL setup (the user agent string here is just an example; any other will do):

    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0");
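
    For context, here is a minimal sketch of how a complete request with the User-Agent set might look. The helper name fetchWithUserAgent() is hypothetical, and the redirect and timeout settings are assumptions added for illustration. Only CURLOPT_USERAGENT is the actual fix discussed above.

    ```php
    <?php
    // Minimal sketch: fetch a page with PHP cURL while sending a User-Agent header.
    // fetchWithUserAgent() is a hypothetical helper, not part of any library.

    function fetchWithUserAgent(string $url, string $userAgent): string
    {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
            CURLOPT_USERAGENT      => $userAgent, // the header the server appears to require
            CURLOPT_FOLLOWLOCATION => true,       // assumption: follow redirects, if any
            CURLOPT_TIMEOUT        => 15,         // assumption: arbitrary timeout in seconds
        ]);

        $body = curl_exec($ch);
        if ($body === false) {
            // curl_exec() returns false on transport errors (DNS, timeout, ...)
            throw new RuntimeException("cURL request failed: " . curl_error($ch));
        }

        // The handle is released automatically when $ch goes out of scope.
        return $body;
    }

    // Example usage (commented out so the sketch does not hit the network when loaded):
    // $html = fetchWithUserAgent(
    //     "http://register.metsad.ee/avalik/info_teatis.php?too_id=2942704201",
    //     "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0"
    // );
    // echo strlen($html), " bytes received\n";
    ```

    If the returned string is still empty, comparing the request headers against a browser's (as in the telnet sessions above) is a reasonable next step, since the server may check other headers as well.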
    