How to check if a letter is upper or lower case in PHP?

忘了有多久 2020-12-04 19:12

I have UTF-8 texts that also contain diacritic characters, and I would like to check whether the first letter of a text is upper case or lower case. How can I do this?

13 Answers
  • 2020-12-04 19:40

    In my opinion, a preg_ call is the most direct, concise, and reliable option compared with the other solutions posted here.

    echo preg_match('~^\p{Lu}~u', $string) ? 'upper' : 'lower';
    

    My pattern breakdown:

    ~      # starting pattern delimiter
    ^      # match from the start of the input string
    \p{Lu} # match exactly one uppercase letter (Unicode-safe)
    ~      # ending pattern delimiter
    u      # enable Unicode matching
    

    Note where ctype_ and < 'a' fail in the following battery of tests.

    Code:

    $tests = ['âa', 'Bbbbb', 'Éé', 'iou', 'Δδ'];
    
    foreach ($tests as $test) {
        echo "\n{$test}:";
        echo "\n\tPREG:  " , preg_match('~^\p{Lu}~u', $test)      ? 'upper' : 'lower';
        echo "\n\tCTYPE: " , ctype_upper(mb_substr($test, 0, 1))  ? 'upper' : 'lower';
        echo "\n\t< a:   " , mb_substr($test, 0, 1) < 'a'         ? 'upper' : 'lower';
    
        $chr = mb_substr ($test, 0, 1, "UTF-8");
        echo "\n\tMB:    " , mb_strtoupper($chr, "UTF-8") == $chr ? 'upper' : 'lower';
    }
    

    Output:

    âa:
        PREG:  lower
        CTYPE: lower
        < a:   lower
        MB:    lower
    Bbbbb:
        PREG:  upper
        CTYPE: upper
        < a:   upper
        MB:    upper
    Éé:               <-- trouble
        PREG:  upper
        CTYPE: lower  <-- uh oh
        < a:   lower  <-- uh oh
        MB:    upper
    iou:
        PREG:  lower
        CTYPE: lower
        < a:   lower
        MB:    lower
    Δδ:               <-- extended beyond question scope
        PREG:  upper  <-- still holding up
        CTYPE: lower
        < a:   lower
        MB:    upper  <-- still holding up
    

    If anyone needs to differentiate between uppercase letters, lowercase letters, and non-letters see this post.
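
    A minimal sketch of that three-way distinction, using the same preg_ approach. The function name firstCharCase() is a hypothetical helper introduced here for illustration, not something taken from the linked post.

```php
<?php
// Hypothetical helper: classify the first character of a UTF-8 string.
function firstCharCase(string $text): string
{
    if (preg_match('~^\p{Lu}~u', $text)) {
        return 'upper'; // first character is an uppercase letter
    }
    if (preg_match('~^\p{Ll}~u', $text)) {
        return 'lower'; // first character is a lowercase letter
    }
    // Digits, punctuation, caseless letters, an empty string, or invalid UTF-8
    return 'other';
}

foreach (['Éé', 'âa', '1abc', ''] as $test) {
    echo "'{$test}': ", firstCharCase($test), PHP_EOL;
}
```

    The third branch is what the plain 'upper' : 'lower' ternary cannot express: with that ternary, anything that is not an uppercase letter is reported as lower.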


    It may be extending the scope of this question too far, but if your input characters are especially squirrelly (they might not exist in a category that Lu can handle), you may want to check if the first character has case variants:

    \p{L&} or \p{Cased_Letter}: a letter that exists in lowercase and uppercase variants (combination of Ll, Lu and Lt).

    • Source: https://www.regular-expressions.info/unicode.html
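
    As a rough sketch, the \p{L&} property can be used to ask whether the first character is a letter with case variants at all, before deciding between upper and lower. The helper name startsWithCasedLetter() is an assumption made for this example.

```php
<?php
// Hypothetical helper: true if the first character is a letter
// that exists in both lowercase and uppercase variants (Ll, Lu or Lt).
function startsWithCasedLetter(string $text): bool
{
    return preg_match('~^\p{L&}~u', $text) === 1;
}

var_dump(startsWithCasedLetter('Δδ'));   // cased Greek letter
var_dump(startsWithCasedLetter('日本'));  // a letter, but one without case
var_dump(startsWithCasedLetter('1abc')); // not a letter at all
```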

    To include Roman Numerals ("Number Letters") with SMALL variants, you can add that extra range to the pattern if necessary.

    https://www.fileformat.info/info/unicode/category/Nl/list.htm

    Code:

    echo preg_match('~^[\p{Lu}\x{2160}-\x{216F}]~u', $test) ? 'upper' : 'not upper';
    
  • 2020-12-04 19:42

    As used in Kohana 2 autoloader function:

    echo $char < 'a' ? 'uppercase' : 'lowercase';
    

    When two single-character strings are compared with <, PHP compares their byte values, which for ASCII characters are their ASCII codes. In the ASCII table the control characters, digits, and punctuation come first, then the uppercase Latin letters, and then the lowercase Latin letters. Thus you can check whether the code of a letter is smaller or larger than that of the lowercase Latin letter a. This only works for plain ASCII letters: digits and most punctuation also sort below 'a', and multi-byte UTF-8 characters sort above it.

    BTW, this is around twice as fast as a solution with regular expressions.
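
    To make the byte comparison concrete, here is a small sketch using ord(), which returns the value of the first byte of a string. The helper isAsciiUpper() is hypothetical and not part of Kohana; it only illustrates bounding the check to the A-Z range so that digits and punctuation are not reported as uppercase.

```php
<?php
// ASCII codes that the comparison relies on.
echo ord('A'), ' ', ord('Z'), ' ', ord('a'), PHP_EOL; // 65 90 97

// Pitfalls of the bare "< 'a'" test:
var_dump('1' < 'a'); // true, although '1' is not a letter
var_dump('É' < 'a'); // false, because the first UTF-8 byte of 'É' is above 'a'

// Hypothetical helper: ASCII-only check with explicit range bounds.
function isAsciiUpper(string $char): bool
{
    if ($char === '') {
        return false;
    }
    $code = ord($char[0]); // value of the first byte
    return $code >= 65 && $code <= 90; // 'A'..'Z'
}

var_dump(isAsciiUpper('B')); // true
var_dump(isAsciiUpper('1')); // false
```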

  • 2020-12-04 19:42

    Another possible solution in PHP 7 is to use IntlChar:

    IntlChar provides access to a number of utility methods that can be used to access information about Unicode characters.

    $tests = ['âa', 'Bbbbb', 'Éé', 'iou', 'Δδ'];
    
    foreach ($tests as $test) {
        echo "{$test}:\t";
        echo IntlChar::isUUppercase(mb_substr($test, 0, 1)) ? 'upper' : 'lower';
        echo PHP_EOL; 
    }
    

    Output:

    âa:     lower
    Bbbbb:  upper
    Éé:     upper
    iou:    lower
    Δδ:     upper
    

    While @mickmackusa's solution is good, it gives the wrong result for uppercase characters whose general category is something other than "Lu" (uppercase letter).

    For example

    • Ⅷ => ⅷ
    • Ⅼ => ⅼ
    • Ⅿ => ⅿ
    • Ⅾ => ⅾ
    • Ⅽ => ⅽ

    var_dump(preg_match('~^\p{Lu}~u', 'Ⅷ') ? 'upper' : 'lower'); // Result: lower
    var_dump(preg_match('~^\p{Lu}~u', 'ⅷ') ? 'upper' : 'lower'); // Result: lower
    

    But

    var_dump(IntlChar::isUUppercase(mb_substr('Ⅷ', 0, 1)) ? 'upper' : 'lower'); // Result: upper    
    var_dump(IntlChar::isUUppercase(mb_substr('ⅷ', 0, 1)) ? 'upper' : 'lower'); // Result: lower   
    

    Make sure to use IntlChar::isUUppercase rather than IntlChar::isupper if you also want to detect characters that are uppercase but have a different general category value.

    Note: This library depends on intl (Internationalization extension)
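
    Because intl is an optional extension, one way to use it defensively is sketched below. The helper startsWithUppercase() is hypothetical. Its fallback reuses the \p{Lu} pattern plus the Roman numeral range shown earlier in this thread, which is only an approximation of what IntlChar::isUUppercase covers.

```php
<?php
// Hypothetical helper: prefer IntlChar when the intl extension is loaded,
// otherwise fall back to a PCRE approximation.
function startsWithUppercase(string $text): bool
{
    $first = mb_substr($text, 0, 1, 'UTF-8');
    if ($first === '') {
        return false;
    }
    if (class_exists('IntlChar')) {
        // isUUppercase() returns null on invalid input, hence the strict check
        return IntlChar::isUUppercase($first) === true;
    }
    // Approximation: Lu letters plus the capital Roman numerals U+2160..U+216F
    return preg_match('~^[\p{Lu}\x{2160}-\x{216F}]~u', $first) === 1;
}

foreach (['Éé', 'iou', 'Ⅷ', 'ⅷ'] as $test) {
    echo "{$test}:\t", startsWithUppercase($test) ? 'upper' : 'lower', PHP_EOL;
}
```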

  • 2020-12-04 19:46
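    // $value holds the string to test; ctype_upper() is true only if
    // every character in it is an uppercase letter
    if (ctype_upper($value)) {
        echo 'uppercase';
    } else {
        echo 'not upper case';
    }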
    
  • 2020-12-04 19:50

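    Note that PHP provides the ctype family of functions, such as ctype_upper.

    These functions depend on the current locale, so you have to set it correctly via setlocale() first. Be aware that they inspect single bytes, so this helps with single-byte encodings but is unlikely to recognize multi-byte UTF-8 characters.
    See the comment on ctype_alpha for instance.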

    Usage:

    if ( ctype_upper( $str[0] )) {
        // the first byte of $str is an uppercase letter
    }
    
  • 2020-12-04 19:52

    What about just:

    if (ucfirst($string) == $string) {dosomething();}
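
    One caveat: ucfirst() works on single bytes, so it typically leaves a leading multi-byte character such as 'é' unchanged, and the comparison then succeeds regardless of case. The same happens for strings that start with a digit. A multibyte-aware sketch of the same idea follows. The helper name firstLetterIsUpper() is an assumption for illustration, and it requires the mbstring extension.

```php
<?php
// Hypothetical helper: multibyte-aware variant of the ucfirst() comparison.
function firstLetterIsUpper(string $text): bool
{
    $first = mb_substr($text, 0, 1, 'UTF-8');

    return $first !== ''
        && mb_strtoupper($first, 'UTF-8') === $first  // unchanged by uppercasing
        && mb_strtolower($first, 'UTF-8') !== $first; // but changed by lowercasing
}

foreach (['Éé', 'éé', '1abc', 'Δδ'] as $test) {
    echo "{$test}:\t", firstLetterIsUpper($test) ? 'upper' : 'not upper', PHP_EOL;
}
```

    The second comparison is what rules out digits and punctuation, which are left unchanged by both case conversions.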
    