I'm trying to get Thai characters from a website. I've tried:
$rawChapter = file_get_contents("URL");
$rawChapter = mb_convert_encoding($rawChapter, 'UT
Change your Accept-Charset header to UTF-8, because ISO-8859-1 does not support Thai characters. If you are running your PHP script on a Windows machine, you can also use the windows-874 charset, and you can try adding this header:
Content-Language: th
In most cases, though, UTF-8 handles nearly every character set without any other declaration.
** UPDATE **
Very strange, but this works for me.
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => implode("\r\n", array(
            'Content-type: text/plain; charset=TIS-620'
            //'Content-type: text/plain; charset=windows-874' // same thing
        ))
    )
);
$context = stream_context_create($opts);
//$fp = fopen('http://thaipope.org/webbible/01_002.htm', 'rb', false, $context);
//$contents = stream_get_contents($fp);
//fclose($fp);
$contents = file_get_contents("http://thaipope.org/webbible/01_002.htm", false, $context);
// Tell the browser that the bytes being echoed are TIS-620, not UTF-8.
header('Content-type: text/html; charset=TIS-620');
//header('Content-type: text/html; charset=windows-874'); // same thing
echo $contents;
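To see why the output has to be labelled TIS-620, it can help to look at the raw bytes. The following is only a minimal sketch: it assumes the iconv extension is available and knows the name TIS-620 (as the code in this answer does), and the sample bytes are an illustrative Thai greeting, not content taken from the page above.

```php
<?php
// "สวัสดี" encoded as TIS-620: one byte per character (illustrative sample).
$tis620 = "\xCA\xC7\xD1\xCA\xB4\xD5";

// preg_match() with the /u flag only succeeds on valid UTF-8 input, so this
// shows that the raw bytes cannot simply be served with a UTF-8 label.
$isValidUtf8 = preg_match('//u', $tis620) === 1;
var_dump($isValidUtf8);

// Converting from the actual source charset yields proper UTF-8,
// where each of these Thai characters takes three bytes.
$utf8 = iconv('TIS-620', 'UTF-8', $tis620);
var_dump(strlen($tis620));
var_dump(strlen($utf8));
echo $utf8, "\n";
```

The same bytes are either echoed unchanged with a TIS-620 label, as above, or converted once and labelled UTF-8.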
Apparently I was wrong about UTF-8 in this case. See here for more details. You can still produce UTF-8 output, though:
$in_charset = 'TIS-620'; // 'windows-874' (a Microsoft superset of TIS-620) also works
$out_charset = 'utf-8';
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => implode("\r\n", array(
            'Content-type: text/plain; charset=' . $in_charset
        ))
    )
);
$context = stream_context_create($opts);
$contents = file_get_contents("http://thaipope.org/webbible/01_002.htm", false, $context);
// Convert only when the charsets differ; the comparison ignores case.
if (strcasecmp($in_charset, $out_charset) !== 0) {
    $contents = iconv($in_charset, $out_charset, $contents);
}
header('Content-type: text/html; charset=' . $out_charset);
echo $contents; // output in UTF-8
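The input charset above is hard-coded. One possible refinement, sketched here under assumptions, is to read it from the response's Content-Type header or from a meta tag in the HTML, and fall back to TIS-620 only when neither declares one. The helper names detect_charset and to_utf8 are hypothetical, and whether a given server actually declares its charset is not something this answer verified.

```php
<?php
// Hypothetical helper: take the charset from the response headers if present,
// otherwise from a <meta> tag in the body, otherwise use the default.
function detect_charset(array $headers, string $html, string $default = 'TIS-620'): string
{
    foreach ($headers as $header) {
        // Matches e.g. "Content-Type: text/html; charset=TIS-620"
        if (preg_match('/^Content-Type:.*charset\s*=\s*"?([\w.:-]+)/i', $header, $m)) {
            return $m[1];
        }
    }
    // Matches both <meta charset="..."> and the older http-equiv form.
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)/i', $html, $m)) {
        return $m[1];
    }
    return $default;
}

// Hypothetical helper: convert to UTF-8 unless the input is already UTF-8.
function to_utf8(string $bytes, string $charset): string
{
    if (strcasecmp($charset, 'UTF-8') === 0) {
        return $bytes;
    }
    // iconv() returns false on an unknown charset or an illegal byte sequence.
    $converted = @iconv($charset, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $charset to UTF-8");
    }
    return $converted;
}

// Offline demonstration with canned data instead of a live request.
$headers = array('HTTP/1.1 200 OK', 'Content-Type: text/html; charset=TIS-620');
$body = "\xCA\xC7\xD1\xCA\xB4\xD5"; // illustrative TIS-620 bytes
echo to_utf8($body, detect_charset($headers, $body)), "\n";
```

With a live request, the $http_response_header array that PHP fills in after an HTTP file_get_contents call could be passed as the first argument, with the downloaded contents as the second.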