Matching UTF Characters with preg_match in PHP: (*UTF8) Works on Windows but not Linux

别那么骄傲 2020-12-10 07:16

I have a simple regular expression to check a username:

preg_match('/(*UTF8)^[[:alnum:]]([[:alnum:]]|[ _.-])+$/i', $username);

In local t

3 Answers
  • 2020-12-10 08:10

    Try describing the characters by their Unicode character properties:

    preg_match('/^\p{L}[\p{L} _.-]+$/u', $username)
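Note that \p{L} matches letters only, whereas the question's [[:alnum:]] also accepts digits. A minimal sketch that keeps digits by adding \p{N} follows; the function name isValidUsername is hypothetical, and the example assumes PHP is linked against a PCRE build with UTF-8 and Unicode properties support.

```php
<?php
// Hypothetical helper. Rule taken from the question: the first character is
// alphanumeric, followed by one or more alphanumerics, spaces, underscores,
// dots or hyphens.
function isValidUsername($username)
{
    // \p{L} = any Unicode letter, \p{N} = any Unicode number.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_match() returns 1 on a match, 0 on no match, false on error
    // (for example, when the subject is not valid UTF-8).
    return preg_match('/^[\p{L}\p{N}][\p{L}\p{N} _.-]+$/u', $username) === 1;
}

var_dump(isValidUsername('Jürgen_99')); // bool(true)
var_dump(isValidUsername('_Jürgen'));   // bool(false): starts with an underscore
```

Comparing the result against 1 rather than testing it for truthiness keeps a compile or encoding error (false) from being confused with "no match" (0) further up the call chain.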
    
  • 2020-12-10 08:11

    This seems to be an old post, but since the subject is of lasting interest, I will post what I discovered here. It is a small difference, but it makes the code simpler: the curly brackets are optional for single-letter property names such as L.

    The code from Gumbo and Scott above can be written more simply like this, if you want to allow only letters (Unicode and non-Unicode) and spaces:

    preg_match("/^\pL[\pL ]+$/u",$string)
    

    I also noticed that preg_match accepts an even simpler pattern:

    preg_match("/^[\pL ]+$/u",$string)
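The two short patterns are not equivalent, which a quick comparison makes visible. The sketch below is illustrative only; the three function names are hypothetical, and it assumes a PCRE build with Unicode properties support.

```php
<?php
// Hypothetical helpers wrapping the three patterns discussed in this answer.
function matchesBraced($s)
{
    return preg_match('/^\p{L}[\p{L} ]+$/u', $s) === 1;
}

function matchesStrict($s)
{
    // Same as matchesBraced(): \pL is shorthand for \p{L}.
    return preg_match('/^\pL[\pL ]+$/u', $s) === 1;
}

function matchesLoose($s)
{
    // No separate rule for the first character.
    return preg_match('/^[\pL ]+$/u', $s) === 1;
}

foreach (array('José María', ' José', 'J', 'José3') as $s) {
    echo "'", $s, "': ",
        'braced=', var_export(matchesBraced($s), true), ' ',
        'strict=', var_export(matchesStrict($s), true), ' ',
        'loose=', var_export(matchesLoose($s), true), PHP_EOL;
}
```

The braced and unbraced forms agree on every input. The looser pattern additionally accepts a leading space (' José') and a single-character string ('J'), so it is only a drop-in replacement when those cases are acceptable.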
    
  • 2020-12-10 08:13

    I had already been trying the /u modifier mentioned above. On Windows (PHP 5.2.16), adding /u worked fine for capturing a string containing Unicode characters. On CentOS 5 with the same PHP 5.2.16, however, I still could not capture a string containing Unicode characters using .* (preg_match basically failed to capture).

    After a long time getting nowhere, and messing around with the locale settings, which changed nothing, I finally found this site.

    I did an rpm -Uvh of the appropriate RPM provided there, restarted Apache, and suddenly my regexes worked great!

    Even though I had UTF-8 support initially, my regexes were not capturing Unicode strings until I installed the updated RPM, which also adds "Unicode properties support". I thought having UTF-8 support would have been enough, but apparently not.
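One way to check from PHP which of the two features the linked PCRE library provides is to try compiling a pattern that needs each one. This is a rough sketch, not a definitive diagnostic; the function name pcreUnicodeSupport is hypothetical. If the pcretest tool that ships with PCRE is installed, running pcretest -C should report the same build options.

```php
<?php
// Hypothetical helper: probes the PCRE library PHP is linked against.
function pcreUnicodeSupport()
{
    // preg_match() returns false and raises a warning (silenced with @)
    // when a pattern cannot be compiled, for example because the library
    // was built without the required feature.
    $utf8  = @preg_match('/./u', 'é') === 1;      // needs UTF-8 support
    $props = @preg_match('/^\pL$/u', 'é') === 1;  // also needs Unicode properties
    return array('utf8' => $utf8, 'unicode_properties' => $props);
}

echo 'PCRE version: ', PCRE_VERSION, PHP_EOL;
var_dump(pcreUnicodeSupport());
```

If 'utf8' is true but 'unicode_properties' is false, the build matches the situation described above: /u is accepted, but patterns that rely on \p{...} fail to compile.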
