Some of my script are using different encoding, and when I try to combine them, this has becom an issue.
But I can't change the encoding they use, instead I want to change the encodig of the result from script A, and use it as parameter in script B.
So: is there any simple way to change a string from UTF-8 to ISO-88591 in PHP? I have looked at utf_encode and _decode, but they doesn't do what i want. Why doesn't there exsist any "utf2iso()"-function, or similar?
I don't think I have characters that can't be written in ISO-format, so that shouldn't be an huge issue.
Have a look at iconv()
or mb_convert_encoding()
.
Just by the way: why don't utf8_encode()
and utf8_decode()
work for you?
utf8_decode — Converts a string with ISO-8859-1 characters encoded with UTF-8 to single-byte ISO-8859-1
utf8_encode — Encodes an ISO-8859-1 string to UTF-8
So essentially
$utf8 = 'ÄÖÜ'; // file must be UTF-8 encoded
$iso88591_1 = utf8_decode($utf8);
$iso88591_2 = iconv('UTF-8', 'ISO-8859-1', $utf8);
$iso88591_2 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
$iso88591 = 'ÄÖÜ'; // file must be ISO-8859-1 encoded
$utf8_1 = utf8_encode($iso88591);
$utf8_2 = iconv('ISO-8859-1', 'UTF-8', $iso88591);
$utf8_2 = mb_convert_encoding($iso88591, 'UTF-8', 'ISO-8859-1');
all should do the same - with utf8_en/decode()
requiring no special extension, mb_convert_encoding()
requiring ext/mbstring and iconv()
requiring ext/iconv.
First of all, don't use different encodings. It leads to a mess, and UTF-8 is definitely the one you should be using everywhere.
Chances are your input is not ISO-8859-1, but something else (ISO-8859-15, Windows-1252). To convert from those, use iconv or mb_convert_encoding
.
Nevertheless, utf8_encode
and utf8_decode
should work for ISO-8859-1. It would be nice if you could post a link to a file or a uuencoded or base64 example string for which the conversion fails or yields unexpected results.
set meta tag in head as
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
use the link http://www.i18nqa.com/debug/utf8-debug.html to replace the symbols character you want.
then use str_replace like
$find = array('“', '’', '…', '—', '–', '‘', 'é', 'Â', '•', 'Ëœ', 'â€'); // en dash
$replace = array('“', '’', '…', '—', '–', '‘', 'é', '', '•', '˜', '”');
$content = str_replace($find, $replace, $content);
Its the method i use and help alot. Thanks!
It is much better to use
$value = mb_convert_encode($value,'HTML-ENTITIES','UTF-8');
Specially when you are using AJAX call for submitting 'ISO-8859-1' characters. It works for Chinese, Japanese, Czech, German and many more languages.
You need to use the iconv package, specifically its iconv function.
I use this function:
function formatcell($data, $num, $fill=" ") {
$data = trim($data);
$data=str_replace(chr(13),' ',$data);
$data=str_replace(chr(10),' ',$data);
// translate UTF8 to English characters
$data = iconv('UTF-8', 'ASCII//TRANSLIT', $data);
$data = preg_replace("/[\'\"\^\~\`]/i", '', $data);
// fill it up with spaces
for ($i = strlen($data); $i < $num; $i++) {
$data .= $fill;
}
// limit string to num characters
$data = substr($data, 0, $num);
return $data;
}
echo formatcell("YES UTF8 String Zürich", 25, 'x'); //YES UTF8 String Zürichxxx
echo formatcell("NON UTF8 String Zurich", 25, 'x'); //NON UTF8 String Zurichxxx
Check out my function in my blog http://www.unexpectedit.com/php/php-handling-non-english-characters-utf8
I used:
function utf8_to_html ($data) {
return preg_replace(
array (
'/ä/',
'/ö/',
'/ü/',
'/é/',
'/à/',
'/è/'
),
array (
'ä',
'ö',
'ü',
'é',
'à',
'è'
),
$data
);
}
In my case after files with names containing those characters were uploaded, they were not even visible with Filezilla! In Cpanel filemanager they were shown with ? (under black background). And this combination made it shown correctly on the browser (HTML document is Western-encoded):
$dspFileName = utf8_decode(htmlspecialchars(iconv(mb_internal_encoding(), 'utf-8', basename($thisFile['path']))) );
Use html_entity_decode()
and htmlentities()
.
$html = html_entity_decode(htmlentities($html, ENT_QUOTES, 'UTF-8'), ENT_QUOTES , 'ISO-8859-1');
htmlentities()
formats your input into UTF8
and html_entity_decode()
formats it back to ISO-8859-1
.
function parseUtf8ToIso88591(&$string){
if(!is_null($string)){
$iso88591_1 = utf8_decode($string);
$iso88591_2 = iconv('UTF-8', 'ISO-8859-1', $string);
$string = mb_convert_encoding($string, 'ISO-8859-1', 'UTF-8');
}
}
来源:https://stackoverflow.com/questions/374425/convert-utf8-characters-to-iso-88591-and-back-in-php