PHP sprintf() with foreign characters?

情到浓时终转凉″ submitted on 2019-12-03 04:48:33

Strings in PHP are basically arrays of bytes (not characters). They cannot work natively with multibyte encodings (such as UTF-8).

For details see:
https://www.php.net/manual/en/language.types.string.php#language.types.string.details

Most string functions in PHP have a multibyte equivalent (with the mb_ prefix), but sprintf() does not.
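To make the byte-vs-character point concrete, here is a tiny sketch. It assumes the string is UTF-8 (the "ü" is written as explicit bytes so the file encoding does not matter) and that the mbstring extension is loaded.

```php
<?php
// "für": 3 characters, but "ü" occupies 2 bytes in UTF-8.
$word = "f\xC3\xBCr";

var_dump(strlen($word));              // int(4)  -> counts bytes
var_dump(mb_strlen($word, 'UTF-8'));  // int(3)  -> counts characters

// sprintf() pads by bytes, so only 1 space is added instead of 2.
$padded = sprintf("[%5s]", $word);
echo $padded, "\n";                   // [ für]
```

So any width given to sprintf() comes out one column short for every extra byte in the argument.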

There's a user comment (by "viktor at textalk dot com") with a multibyte implementation of sprintf() on the function's documentation page at php.net. It may work for you:
https://www.php.net/manual/en/function.sprintf.php#89020

I was actually trying to find out whether PHP 7+ finally has a native mb_sprintf(), but apparently not xD.

For the sake of completeness, here is a simple solution I've been using in some old projects. It just adds the difference between strlen() and mb_strlen() to the desired $targetLength. The non-multibyte example is only there for easy comparison =).

$text = "Gultigkeitsprufung ist fehlgeschlagen: %{errors}";
$mbText = "Gültigkeitsprüfung ist fehlgeschlagen: %{errors}";
$mbTextRussian = "Проверка не удалась: %{errors}";

$targetLength = 60;
$mbTargetLength = strlen($mbText) - mb_strlen($mbText) + $targetLength;
$mbRussianTargetLength = strlen($mbTextRussian) - mb_strlen($mbTextRussian) + $targetLength;

printf("%{$targetLength}s\n", $text);
printf("%{$mbTargetLength}s\n", $mbText);
printf("%{$mbRussianTargetLength}s\n", $mbTextRussian);

result

            Gultigkeitsprufung ist fehlgeschlagen: %{errors}
            Gültigkeitsprüfung ist fehlgeschlagen: %{errors}
                              Проверка не удалась: %{errors}
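If you only need padding, a different route from the width-adjustment trick above is to pad by character count directly and skip printf() for that part. A minimal sketch, assuming UTF-8 input, PHP 7+ (for the type declarations) and the mbstring extension; pad_left_chars() is a made-up name, not a PHP built-in. PHP 8.3 also ships a native mb_str_pad(), which may be the better choice where available.

```php
<?php
// Hypothetical helper: left-pad $text with spaces to $width characters (not bytes).
function pad_left_chars(string $text, int $width, string $encoding = 'UTF-8'): string
{
    $missing = $width - mb_strlen($text, $encoding);

    // Like printf(), never truncate: strings longer than $width are returned unchanged.
    return $missing > 0 ? str_repeat(' ', $missing) . $text : $text;
}

$padded = pad_left_chars("G\xC3\xBCltigkeitspr\xC3\xBCfung", 20); // "Gültigkeitsprüfung"
printf("[%s]\n", $padded); // [  Gültigkeitsprüfung]
```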

update 2019-06-12


@flowtron made me give it another thought. A simple mb_sprintf() could look like this.

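function mb_sprintf($format, ...$args) {
    // Work on a copy so the original arguments can still be passed to sprintf().
    $params = $args;

    // For each width, add the byte/character difference of the matching argument.
    $callback = function ($length) use (&$params) {
        $value = array_shift($params);
        return strlen($value) - mb_strlen($value) + $length[0];
    };

    // Match only the width of %Ns and %-Ns placeholders.
    $format = preg_replace_callback('/(?<=%|%-)\d+(?=s)/', $callback, $format);

    return sprintf($format, ...$args);
}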

echo mb_sprintf("%-10s %-10s %10s\n", 'thüs', 'wörks', 'ök');
echo mb_sprintf("%-10s %-10s %10s\n", 'this', 'works', 'ok');

result

thüs       wörks              ök
this       works              ok

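I only did some happy path testing here, but it works for PHP >=5.6 and should be good enough to give people an idea of how to encapsulate the behavior. It does not work with argument-swapping (position) specifiers though, e.g. the width in %1$20s is left unchanged. It also assumes every placeholder with a width is a %s, so mixing in other conversions such as %d shifts the arguments it looks at.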
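For anyone who needs the position specifiers, here is a rough sketch of how the same idea could be extended. It walks over every conversion in the format, keeps track of which argument each one refers to, and only adjusts the width of %s conversions. Assumptions: UTF-8 input, PHP 7+, mbstring loaded; mb_sprintf_positional() is a hypothetical name; precision on strings (e.g. %.5s) still truncates by bytes and is not handled.

```php
<?php
// Sketch only: not a PHP built-in, and only lightly reasoned through.
function mb_sprintf_positional(string $format, ...$args): string
{
    $next = 0; // index of the next argument for specs without "N$"

    // %[argnum$][flags][width][.precision]conversion
    $pattern = '/%(?:(\d+)\$)?((?:[-+ 0]|\'.)*)(\d+)?(\.\d+)?([a-zA-Z%])/';

    $callback = function (array $m) use (&$next, $args): string {
        if ($m[5] === '%') {
            return $m[0]; // literal "%%" consumes no argument
        }

        // "N$" is 1-based; otherwise take arguments in order.
        $index = $m[1] !== '' ? (int) $m[1] - 1 : $next++;
        $width = $m[3];

        if ($m[5] === 's' && $width !== '' && isset($args[$index])) {
            $value = (string) $args[$index];
            // Widen by the number of "extra" bytes in this argument.
            $width = (int) $width + strlen($value) - mb_strlen($value, 'UTF-8');
        }

        $argnum = $m[1] !== '' ? $m[1] . '$' : '';
        return '%' . $argnum . $m[2] . $width . $m[4] . $m[5];
    };

    return sprintf(preg_replace_callback($pattern, $callback, $format), ...$args);
}

echo mb_sprintf_positional('%2$5s|%1$-5s|', 'ök', 'äb'), "\n"; //    äb|ök   |
echo mb_sprintf_positional('%d %-4s|', 7, 'ü'), "\n";          // 7 ü   |
```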

If you're using characters that fit in the ISO-8859-1 character set, you can convert the strings before formatting and convert the result back to UTF-8 when you are done:

utf8_encode(sprintf("%-12s %-8s", utf8_decode($paramOne), utf8_decode($paramTwo)));
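Note that utf8_encode() and utf8_decode() are deprecated as of PHP 8.2. The same round trip can be sketched with mb_convert_encoding() instead. The variable values below are sample data made up for illustration, and the mbstring extension is assumed to be loaded.

```php
<?php
// Sample inputs (illustrative only), written as explicit UTF-8 bytes.
$paramOne = "G\xC3\xBCltig"; // "Gültig"
$paramTwo = "\xC3\xB6k";     // "ök"

// Convert to a single-byte encoding so that sprintf() widths match characters...
$latin1 = sprintf(
    "%-12s %-8s",
    mb_convert_encoding($paramOne, 'ISO-8859-1', 'UTF-8'),
    mb_convert_encoding($paramTwo, 'ISO-8859-1', 'UTF-8')
);

// ...then convert the formatted line back to UTF-8.
$line = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo "[", $line, "]\n"; // [Gültig       ök      ]
```

As with the utf8_* version, this only works for characters that exist in ISO-8859-1.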