Detect encoding and make everything UTF-8

暗喜 2020-11-22 03:03

I'm reading lots of text from various RSS feeds and inserting it into my database.

Of course, there are several different character encodings used in the fee

24 answers
  •  醉酒成梦
    2020-11-22 03:10

    You need to test the character set on input, since incoming data can arrive in different encodings.

    I force all incoming content into UTF-8 by detecting its encoding and converting it with the following function:

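    function fixRequestCharset()
    {
      // Hold the superglobals by reference so conversions are written back.
      $ref = array(&$_GET, &$_POST, &$_REQUEST);
      foreach ($ref as &$var)
      {
        foreach ($var as $key => $val)
        {
          // Skip nested arrays and other non-string values.
          if (!is_string($val))
            continue;
          // Strict detection: returns false if no candidate encoding fits.
          $encoding = mb_detect_encoding($val, mb_detect_order(), true);
          if (!$encoding)
            continue;
          if (strcasecmp($encoding, 'UTF-8') != 0)
          {
            $converted = iconv($encoding, 'UTF-8', $val);
            if ($converted === false)
              continue;
            $var[$key] = $converted;
          }
        }
      }
      unset($var); // drop the reference that foreach leaves behind
    }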
    

    That routine converts every PHP request variable that comes from the remote host to UTF-8, and leaves a value untouched if its encoding cannot be detected or the conversion fails.

    You can customize it to your needs.

    Just invoke it before using the variables.
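
    The function above only touches request variables. For text read from a feed, the same detect-then-convert idea can be applied to a single string. The following is a minimal sketch, not part of the original answer: the name toUtf8 and its $candidates parameter are hypothetical, and it assumes the mbstring and iconv extensions are loaded. It checks for valid UTF-8 first, because a single-byte encoding such as ISO-8859-1 accepts any byte sequence and detection alone can then misreport UTF-8 text.

```php
<?php
// Hypothetical helper (not from the answer above): convert one string to UTF-8.
// Returns null when the encoding cannot be detected or the conversion fails.
function toUtf8(string $text, array $candidates): ?string
{
    // Already valid UTF-8: return unchanged to avoid double encoding.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Strict detection: only accept an encoding in which $text is fully valid.
    $encoding = mb_detect_encoding($text, $candidates, true);
    if ($encoding === false) {
        return null;
    }

    $converted = iconv($encoding, 'UTF-8', $text);
    return $converted === false ? null : $converted;
}

// Usage: "caf\xE9" is "café" encoded as ISO-8859-1.
$title = toUtf8("caf\xE9", ['ISO-8859-1']);
if ($title !== null) {
    echo $title, PHP_EOL;
}
```

    Returning null instead of the original bytes lets the caller decide whether to skip the item or store it as-is. The candidate list is passed explicitly because the default mb_detect_order() often contains only ASCII and UTF-8, which would reject most legacy feeds.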
