I'm reading lots of text from various RSS feeds and inserting it into my database.
Of course, there are several different character encodings used in the feeds.
You need to detect the character set on input, since responses can arrive in different encodings.
I force all incoming content into UTF-8 by detecting its encoding and converting it with the following function:
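function fixRequestCharset()
{
    // Work on the superglobals by reference so the converted values stick.
    $ref = array(&$_GET, &$_POST, &$_REQUEST);
    foreach ($ref as &$var)
    {
        foreach ($var as $key => $val)
        {
            // Skip nested arrays (e.g. "foo[]=bar"); only strings are converted.
            if (!is_string($val))
                continue;
            // Strict detection returns false when no encoding in the order matches.
            $encoding = mb_detect_encoding($val, mb_detect_order(), true);
            if (!$encoding)
                continue;
            if (strcasecmp($encoding, 'UTF-8') != 0)
            {
                $converted = iconv($encoding, 'UTF-8', $val);
                if ($converted === false)
                    continue;
                $var[$key] = $converted;
            }
        }
    }
    unset($var); // break the reference left over from the outer foreach
}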
This routine converts all request variables ($_GET, $_POST and $_REQUEST) to UTF-8, and leaves a value untouched if its encoding cannot be detected or converted.
You can customize it to your needs, for example by changing the detection order with mb_detect_order().
Just call it once before you use the variables.
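
Since the question is about text read from feeds rather than request variables, the same detect-and-convert step can be applied to a single string. Here is a minimal sketch of that idea. The function name toUtf8 and the default candidate list are my own assumptions, not part of the routine above, and you would probably want to adjust the candidates to the encodings your feeds actually use:

```php
<?php
// Hypothetical helper (name and default candidates are assumptions):
// convert one string to UTF-8, or return null when the encoding
// cannot be detected or converted.
function toUtf8(string $text, array $candidates = ['UTF-8', 'ISO-8859-1']): ?string
{
    // Strict mode makes detection fail instead of guessing on invalid bytes.
    $encoding = mb_detect_encoding($text, $candidates, true);
    if ($encoding === false) {
        return null;
    }
    // Already UTF-8: nothing to do.
    if (strcasecmp($encoding, 'UTF-8') === 0) {
        return $text;
    }
    $converted = iconv($encoding, 'UTF-8', $text);
    return $converted === false ? null : $converted;
}
```

You would call it on each title or description before the insert, and skip or log the item when it returns null. Keep in mind that detection is a heuristic: single-byte encodings such as ISO-8859-1 accept almost any byte sequence, so if the feed declares its encoding in the XML prolog or the HTTP headers, that declaration is usually the more reliable source.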