Question
We have a database of Canadian addresses, all in upper case. The client asked us to convert them to lower case, except for the first letter and the letter after a '-'.
So I wrote this function, but I'm having a problem with French accented letters.
When the file and charset are ISO-8859-1 it works fine, but when I switch to UTF-8 it no longer works.
Example input: 'damien-claude élanger', output: Damien-Claude élanger
In UTF-8, the é becomes �.
function cap_letter($string) {
    $lower = str_split("àáâçèéêë");
    $caps = str_split("ÀÁÂÇÈÉÊË");
    $letters = str_split(strtolower($string));
    foreach($letters as $code => $letter) {
        if($letter === '-' || $letter === ' ') {
            $position = array_search($letters[$code+1],$lower);
            if($position !== false) {
                // test
                echo $letters[$code+1] . ' == ' . $caps[$position] ;
                $letters[$code+1] = $caps[$position];
            }
            else {
                $letters[$code+1] = mb_strtoupper($letters[$code+1]);
            }
        }
    }
    //return ucwords(implode($letters)) ;
    return implode($letters) ;
}
The other solution I have in mind is ucwords(strtolower($str)): since all the addresses are already in caps, the É will stay É even after applying strtolower.
But then an É inside a word would also stay upper case, e.g. XXXÉXXÉ.
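For context, here is a minimal sketch of why the byte-oriented functions in the question misbehave once the data is UTF-8. It assumes PHP 7.4 or later with the mbstring extension loaded (mb_str_split is not available before 7.4), and the variable names are only illustrative.

```php
<?php
// "\u{E9}" is é; in UTF-8 it is encoded as two bytes, 0xC3 0xA9.
$e = "\u{E9}";

var_dump(strlen($e));               // counts bytes
var_dump(mb_strlen($e, 'UTF-8'));   // counts characters

// str_split() cuts on bytes, so é ends up as two separate fragments,
// neither of which is a valid UTF-8 character on its own.
$bytes = str_split($e);
var_dump(count($bytes));
var_dump(bin2hex($bytes[0]), bin2hex($bytes[1]));

// mb_str_split() cuts on characters instead.
$chars = mb_str_split($e, 1, 'UTF-8');
var_dump(count($chars));
```

With ISO-8859-1, é is a single byte, which is presumably why the same function appeared to work with that charset.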
Answer 1:
Try the mb_* string functions, which handle multibyte characters:
echo mb_convert_case(mb_strtolower($str), MB_CASE_TITLE, "UTF-8");
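As a hedged illustration of that call on the kind of data described in the question, the sketch below uses an upper-case version of the sample name (an assumption, since the database is said to be all caps) and requires the mbstring extension. As far as I know, MB_CASE_TITLE already lowercases the rest of each word and treats the hyphen as a word boundary, so this should print Damien-Claude Élanger.

```php
<?php
// Sample name from the question, written in upper case as it would be stored.
// "\u{C9}" is É.
$str = "DAMIEN-CLAUDE \u{C9}LANGER";

// Title-case the whole string in one call, with the encoding passed explicitly
// so the result does not depend on mb_internal_encoding().
$title = mb_convert_case($str, MB_CASE_TITLE, 'UTF-8');

echo $title, PHP_EOL;
```

Passing the encoding explicitly is a small design choice: it keeps the behaviour the same regardless of how the server's internal encoding is configured.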
Answer 2:
I had the same problem in Spanish, and I wrote this function:
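function capitalize($string)
{
    // If the input is not valid UTF-8, assume it is ISO-8859-1 and convert it first.
    // (Converting a string that is already UTF-8 would double-encode it.)
    if (!mb_check_encoding($string, 'UTF-8')) {
        $string = mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
    }
    // Title-case the UTF-8 string.
    return mb_convert_case($string, MB_CASE_TITLE, 'UTF-8');
}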
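The reason the encoding check matters can be sketched as follows. This is only an illustration: ensure_utf8 is a hypothetical helper name, and the fallback to ISO-8859-1 is an assumption about the legacy data. It requires the mbstring extension.

```php
<?php
// É as UTF-8 (two bytes) and as ISO-8859-1 (one byte).
$utf8   = "\u{C9}";
$latin1 = "\xC9";

// Converting an already-UTF-8 string "from ISO-8859-1" double-encodes it:
// each of its two bytes is treated as a separate Latin-1 character.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
var_dump(bin2hex($utf8));    // c389
var_dump(bin2hex($double));  // c383c289

// Hypothetical helper: convert only when the input is not valid UTF-8.
function ensure_utf8(string $s): string
{
    return mb_check_encoding($s, 'UTF-8')
        ? $s
        : mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

// Both representations end up as the same UTF-8 string.
var_dump(ensure_utf8($latin1) === ensure_utf8($utf8));
```

Note that a validity check like this is a heuristic: some ISO-8859-1 byte sequences happen to be valid UTF-8, so knowing the source encoding up front is more reliable when that is possible.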
Source: https://stackoverflow.com/questions/10012545/ucwords-and-french-accented-lettres-encoding