file_get_contents not working with utf8

梦如初夏 2020-12-22 03:39

I'm trying to get Thai characters from a website. I've tried:

$rawChapter = file_get_contents("URL");
$rawChapter = mb_convert_encoding($rawChapter, 'UT


        
1 answer
  •  隐瞒了意图╮
    2020-12-22 04:16

    Change your Accept-Charset to UTF-8, because ISO-8859-1 does not support Thai characters. If you are running your PHP script on a Windows machine, you can also use the windows-874 charset. You can also try adding this header:

    Content-Language: th
    

    In most cases, though, UTF-8 handles nearly all characters and character sets without any further declaration.
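
    As a rough sketch of the Accept-Charset suggestion, the header can be passed to file_get_contents through a stream context. The helper name makeCharsetContext and the example.com URL are made up for illustration, and the server is free to ignore the header:

```php
<?php
// Hypothetical helper (not from the original answer): build an HTTP stream
// context that asks the server for a particular charset.
function makeCharsetContext(string $charset)
{
    return stream_context_create([
        'http' => [
            'method' => 'GET',
            // Accept-Charset is only a hint; many servers ignore it.
            'header' => "Accept-Charset: $charset\r\n",
        ],
    ]);
}

$context = makeCharsetContext('UTF-8');

// Placeholder URL, left commented out so the sketch runs without network access:
// $html = file_get_contents('http://example.com/', false, $context);
```

    If the server ignores the hint, the bytes come back in whatever charset the page was saved in, so they may still need converting afterwards.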

    ** UPDATE **

    Very strange, but this works for me.

    $opts = array(
      'http'=>array(
        'method'=>"GET",
        'header'=> implode("\r\n", array(
                       'Content-type: text/plain; charset=TIS-620'
                       //'Content-type: text/plain; charset=windows-874'  // same thing
                    ))
      )
    );
    
    $context = stream_context_create($opts);
    
    //$fp = fopen('http://thaipope.org/webbible/01_002.htm', 'rb', false, $context);
    //$contents = stream_get_contents($fp);
    //fclose($fp);
    $contents = file_get_contents("http://thaipope.org/webbible/01_002.htm",false, $context);
    
    header('Content-type: text/html; charset=TIS-620');
    //header('Content-type: text/html; charset=windows-874');  // same thing
    
    echo $contents;
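
    What matters for decoding is the charset the server actually used for the bytes it returned, so it can help to read it from the response headers instead of hard-coding it. A minimal sketch, assuming the header lines are available as an array of strings (for the http wrapper, PHP normally exposes them in $http_response_header after file_get_contents); the function name charsetFromHeaders is hypothetical:

```php
<?php
// Hypothetical helper (not from the original answer): extract the charset
// from raw HTTP response header lines. Returns null if none is declared.
function charsetFromHeaders(array $headers): ?string
{
    $charset = null;
    foreach ($headers as $line) {
        // Matches e.g. "Content-Type: text/html; charset=TIS-620"
        if (preg_match('/^Content-Type:.*?charset\s*=\s*"?([^\s;"]+)/i', $line, $m)) {
            // Keep the last match, since redirects can add earlier header sets.
            $charset = $m[1];
        }
    }
    return $charset;
}

// Example with header lines written out by hand:
$declared = charsetFromHeaders([
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=TIS-620',
]);
```

    Many pages declare their charset only in an HTML meta tag, or not at all, so a null result here does not mean the page is UTF-8.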
    

    Apparently I was wrong about UTF-8 in this case. See here for more details. You can still produce UTF-8 output, though:

    $in_charset = 'TIS-620';   // == 'windows-874'
    $out_charset = 'utf-8';
    
    $opts = array(
      'http'=>array(
        'method'=>"GET",
        'header'=> implode("\r\n", array(
                       'Content-type: text/plain; charset=' . $in_charset
                    ))
      )
    );
    
    $context = stream_context_create($opts);
    
    $contents = file_get_contents("http://thaipope.org/webbible/01_002.htm",false, $context);
    if ($in_charset != $out_charset) {
        $contents = iconv($in_charset, $out_charset, $contents);
    }
    
    header('Content-type: text/html; charset=' . $out_charset);
    
    echo $contents;   // output in UTF-8
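
    The conversion step can be tried without any network access. The sketch below wraps iconv in a hypothetical helper, toUtf8, that throws when the conversion fails, and feeds it the TIS-620 bytes for a short Thai word. It assumes the iconv extension is enabled and that the underlying iconv library knows TIS-620, which is not guaranteed on every platform:

```php
<?php
// Hypothetical helper (not from the original answer): convert bytes from a
// known source charset to UTF-8, failing loudly instead of returning false.
function toUtf8(string $bytes, string $fromCharset): string
{
    // Charset names are case-insensitive, so compare them that way.
    if (strcasecmp($fromCharset, 'UTF-8') === 0) {
        return $bytes;
    }
    // @ suppresses the notice iconv raises on bad input; we throw instead.
    $converted = @iconv($fromCharset, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $fromCharset to UTF-8");
    }
    return $converted;
}

// The six TIS-620 bytes for the Thai greeting "sawasdee".
$tis620 = "\xCA\xC7\xD1\xCA\xB4\xD5";
$utf8 = toUtf8($tis620, 'TIS-620');

// Each Thai character takes three bytes in UTF-8, so 6 bytes become 18.
echo strlen($tis620), ' -> ', strlen($utf8), "\n";
```

    Throwing on failure means an unsupported or misdeclared charset surfaces immediately instead of silently producing an empty page.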
    
