strlen() php function giving the wrong length of unicode characters [duplicate]

冷暖自知 submitted on 2019-12-09 15:54:14

Question


I am trying to get the length of this Unicode string:

$text = 'نام سلطان م';
$length = strlen($text);
echo $length;

output

20

How does strlen() determine the length of a Unicode string?


Answer 1:


strlen() does not handle multibyte characters: it counts bytes, and a byte count equals the character count only in single-byte encodings, not for Unicode text stored as UTF-8. This behavior is clearly documented:

strlen() returns the number of bytes rather than the number of characters in a string.

The solution is to use the mb_strlen() function instead (mb stands for multibyte); see the mb_strlen() docs.
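As a minimal sketch of the difference, the snippet below runs both functions on the string from the question. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8; the variable names `$bytes` and `$chars` are chosen here for illustration only.

```php
<?php
// Assumption: mbstring is loaded and this file is saved as UTF-8.
$text = 'نام سلطان م';

// strlen() counts bytes: each Arabic letter takes 2 bytes in UTF-8,
// each space takes 1 byte.
$bytes = strlen($text);

// mb_strlen() counts characters (code points) in the given encoding.
$chars = mb_strlen($text, 'UTF-8');

echo $bytes, "\n"; // 20
echo $chars, "\n"; // 11
```

Passing the encoding explicitly avoids depending on whatever internal encoding the server happens to be configured with.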

EDIT

If changing the code is not feasible for any reason, you can have the string functions automatically overloaded by their multibyte counterparts:

To use function overloading, set mbstring.func_overload in php.ini to a positive value that represents a combination of bitmasks specifying the categories of functions to be overloaded. It should be set to 1 to overload the mail() function. 2 for string functions, 4 for regular expression functions. For example, if it is set to 7, mail, strings and regular expression functions will be overloaded.

This is supported by PHP and documented in the PHP manual (note that it has been deprecated since PHP 7.2 and was removed in PHP 8.0).

Please note that you may also need to edit your php.ini to ensure the mbstring extension is enabled. The available settings are documented in the PHP manual.
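If it is unclear whether mbstring is available on a given server, the check can be done at runtime. The helper below is only a sketch: `utf8_length` is a hypothetical name, and the fallback branch assumes the input is valid UTF-8 (it counts every byte that is not a UTF-8 continuation byte).

```php
<?php
// Hypothetical helper, not part of PHP: counts characters in a UTF-8 string.
function utf8_length(string $s): int
{
    if (extension_loaded('mbstring')) {
        return mb_strlen($s, 'UTF-8');
    }

    // Fallback without mbstring. Assumption: $s is valid UTF-8.
    // Continuation bytes have the form 10xxxxxx (0x80-0xBF); every other
    // byte starts a new character, so counting those gives the length.
    return (int) preg_match_all('/[^\x80-\xBF]/', $s);
}

echo utf8_length('نام سلطان م'), "\n"; // 11
```

For malformed input the two branches may disagree, so the fallback is best treated as a stopgap rather than a replacement for enabling the extension.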




Answer 2:


You are looking for mb_strlen.




Answer 3:


The strlen() function does not count the number of characters, but the number of bytes. For strings containing multibyte characters it therefore returns a number higher than the character count.
Use mb_strlen() instead to get the actual number of characters.




Answer 4:


Just as an addendum to the other answers that reference mb_strlen():

If the php.ini setting mbstring.func_overload has the string-functions flag (value 2) set, then strlen() counts characters based on mbstring's internal encoding (by default, the default charset); otherwise it counts the number of bytes in the string.
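One way to find out which behavior applies on a given installation is to inspect the setting at runtime. This is a sketch only; `strlen_is_overloaded` is a hypothetical helper name, and the literal 2 is the string-functions flag described in the manual excerpt quoted in Answer 1.

```php
<?php
// Hypothetical helper: reports whether mbstring currently overloads strlen().
function strlen_is_overloaded(): bool
{
    // ini_get() returns false when the setting does not exist
    // (for example when mbstring is not loaded, or on PHP 8.0+ where
    // the setting was removed); casting false to int gives 0.
    $flags = (int) ini_get('mbstring.func_overload');

    // 2 is the flag for string functions.
    return ($flags & 2) !== 0;
}

var_dump(strlen_is_overloaded());
```

When this returns false, strlen() counts bytes, and mb_strlen() is the function to call for a character count.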



Source: https://stackoverflow.com/questions/15829554/strlen-php-function-giving-the-wrong-length-of-unicode-characters
