Question
I am trying to extract Chinese words from a website.
I am using simple cURL code:
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($curl);
echo $response;
The expected result for one of the words is:
网络频率
However, I get this:
ÍøÂçÆµÂÊ
Also, if I URL-encode the word, the result is different.
I have been having problems with encoding lately. Are Chinese characters UTF-8, or something else? Could anyone help me get the characters to display normally with echo, so that URL-encoding them gives the same result as when I copy them from the website?
Cheers
Answer 1:
Chinese is usually UTF-8, yes. The problem is probably not that the data is received incorrectly (cURL knows what it's doing), but that you're not sending it correctly to the browser.
Try this at the top of your page:
header('Content-Type: text/html; charset=utf-8');
This will tell the browser that you are sending UTF-8 information.
Update: if this doesn't work, it could be that PHP itself isn't handling the characters properly. Try playing with utf8_encode and utf8_decode a bit in your echo. If that doesn't work, then cURL isn't decoding the stream properly, which means you'll have to look for the Content-Type header in the response and decode the stream accordingly.
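Decoding according to the Content-Type header could look roughly like the sketch below. The garbled output in the question is what the GB2312/GBK bytes for 网络频率 look like when displayed as Latin-1, so the sketch assumes the remote page declares a legacy charset such as gb2312. The helper names charsetFromContentType and toUtf8 are made up for illustration, and the mbstring extension is assumed to be loaded. Note that utf8_encode only converts from ISO-8859-1, so it would not help with a GB2312 page.

```php
<?php
// Hypothetical helper: pull the charset parameter out of a Content-Type value,
// e.g. "text/html; charset=gb2312" gives "GB2312". Returns null if absent.
function charsetFromContentType(?string $contentType): ?string
{
    if ($contentType !== null
        && preg_match('/charset\s*=\s*"?([\w.:-]+)"?/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: convert a response body to UTF-8 based on the declared
// charset. $fallback is used when the header carries no charset parameter.
function toUtf8(string $body, ?string $contentType, string $fallback = 'UTF-8'): string
{
    $charset = charsetFromContentType($contentType) ?? $fallback;
    if ($charset === 'GB2312') {
        $charset = 'GBK'; // GBK is a superset of GB2312, so this is the safer choice
    }
    if ($charset === 'UTF-8') {
        return $body; // nothing to convert
    }
    return mb_convert_encoding($body, 'UTF-8', $charset);
}

// Demo without a network call: these are the GBK bytes behind the garbled output.
$gbkBytes = "\xCD\xF8\xC2\xE7\xC6\xB5\xC2\xCA";
header('Content-Type: text/html; charset=utf-8');
echo toUtf8($gbkBytes, 'text/html; charset=gb2312'); // 网络频率
```

With a real request, the header value can be read with curl_getinfo($curl, CURLINFO_CONTENT_TYPE) after curl_exec. If the server sends no charset parameter, the charset may only be declared in the page's meta tag, and the fallback would have to be chosen by hand.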
Answer 2:
Try this:
1) Create a new document and make sure the document is UTF-8 compatible.
2) Use a meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
3) I wouldn't recommend forcing the header to UTF-8; simply use ini_set instead:
ini_set('default_charset', 'UTF-8');
If you are calling the cURL function from a different page, make sure that page can carry UTF-8 characters and passes them on to a UTF-8 compatible page.
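One way to check whether a string passed between pages really is UTF-8, and to see why URL-encoding gives a different result for the same word, is sketched below. It assumes the mbstring extension is loaded and uses GBK only as an example of a non-UTF-8 encoding; the variable names are illustrative.

```php
<?php
// The same word in two encodings. "\u{...}" escapes always produce UTF-8 bytes.
$utf8 = "\u{7F51}\u{7EDC}\u{9891}\u{7387}";           // 网络频率
$gbk  = mb_convert_encoding($utf8, 'GBK', 'UTF-8');   // same word as GBK bytes

// mb_check_encoding reports whether the bytes are valid in the given encoding.
echo mb_check_encoding($utf8, 'UTF-8') ? 'valid' : 'invalid', PHP_EOL; // valid
echo mb_check_encoding($gbk, 'UTF-8') ? 'valid' : 'invalid', PHP_EOL;  // invalid

// urlencode works on raw bytes, so the result depends on the encoding.
echo urlencode($utf8), PHP_EOL; // %E7%BD%91%E7%BB%9C%E9%A2%91%E7%8E%87
echo urlencode($gbk), PHP_EOL;  // %CD%F8%C2%E7%C6%B5%C2%CA
```

If the URL-encoded value has to match what the website itself uses, the string needs to be in the site's encoding at the moment urlencode is called.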
Source: https://stackoverflow.com/questions/8548932/chinese-chars-php-encoding