Oracle REGEXP_INSTR() and “a-z” character range doesn't match as expected

说谎 2021-01-22 08:04

I want to use REGEXP_INSTR() within an Oracle database to check for lowercase/uppercase characters. I'm aware of [:upper:] and [:lower:] POSI

2 answers
  • 2021-01-22 08:47

    Okay, the answer that NLS_SORT causes this behavior is correct, but I don't think it explains it in an understandable way. None of the documentation I found actually does that...

    Imagine that a character range such as [a-z] is derived from a single string containing all possible characters, sorted according to NLS_SORT.

    Let's assume the whole alphabet consists only of alphanumeric characters. Sorted by BINARY, this results in a base string like 0123456789ABCDE...XYZabcdefgh...xyz, because uppercase letters have lower code points than lowercase letters in ASCII. Derived from this, [0-6] expands to [0123456], [a-f] to [abcdef], [5-b] to [56789ABC...XYZab], etc.

    Sorting by a linguistic definition, however, results in a different base string, like 0123456789aAbBcCdDeEfF...xXyYzZ. Derived from this, [0-6] still expands to [0123456], but [a-f] now expands to [aAbBcCdDeEf] and [5-b] to [56789aAb], etc.
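
    To see the two "base strings" for yourself, a minimal sketch along these lines may help. It sorts a handful of sample characters once by BINARY and once by a named linguistic sort. GERMAN is only an assumed example of a linguistic sort, and sample_chars is a hypothetical inline data set; the exact linguistic order depends on the sort you pick.

    ```sql
    -- Illustrative sketch: compare two collation orders for a few sample characters.
    -- GERMAN is an arbitrary example of a named linguistic sort (assumption).
    WITH sample_chars AS (
      SELECT 'a' AS ch FROM dual UNION ALL
      SELECT 'A' FROM dual UNION ALL
      SELECT 'b' FROM dual UNION ALL
      SELECT 'B' FROM dual UNION ALL
      SELECT 'z' FROM dual UNION ALL
      SELECT 'Z' FROM dual
    )
    SELECT LISTAGG(ch) WITHIN GROUP (ORDER BY NLSSORT(ch, 'NLS_SORT = BINARY')) AS binary_order,
           LISTAGG(ch) WITHIN GROUP (ORDER BY NLSSORT(ch, 'NLS_SORT = GERMAN')) AS linguistic_order
    FROM sample_chars;
    ```

    Reading a range such as [A-Z] as "everything between A and Z in that concatenated string" then shows which characters it covers under each sort.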

    This is why a did not match [A-Z], but b did. [A-Z] actually expands to [AbBcC...yYzZ] which includes z but not a.

    In reality, the range [A-Z] might contain even more characters, such as accented letters like à, á, â, À, Á, Â, depending on the linguistic sort.
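
    A quick way to reproduce the effect described here might look like the sketch below. The choice of GERMAN as the linguistic sort is an assumption for illustration only; the original question does not say which sort was active.

    ```sql
    -- Assumed example sort; substitute whatever linguistic sort your session uses.
    ALTER SESSION SET nls_sort = 'GERMAN';

    -- REGEXP_INSTR returns 0 when there is no match, otherwise the match position.
    -- Per the explanation above, 'a' is expected to fall outside [A-Z] under a
    -- linguistic sort while 'b' falls inside it.
    SELECT REGEXP_INSTR('a', '[A-Z]') AS pos_a,
           REGEXP_INSTR('b', '[A-Z]') AS pos_b
    FROM dual;
    ```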

  • 2021-01-22 08:48

    The reason for the behavior is the collation rules. See the NLS_SORT documentation:

    • If the value is BINARY, then the collating sequence for ORDER BY queries is based on the numeric value of characters (a binary sort that requires less system overhead).
    • If the value is a named linguistic sort, sorting is based on the order of the defined linguistic sort. Most (but not all) languages supported by the NLS_LANGUAGE parameter also support a linguistic sort with the same name.

    Set NLS_SORT to BINARY so that [A-Z] is interpreted in the same order as the ASCII table:

    alter session set nls_sort = 'BINARY';

    Then, you will get consistent results.
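
    As a small sketch of the whole check, you could inspect the current setting first, switch to BINARY, and then rerun the test. The test strings 'a', 'b' and 'B' are arbitrary examples.

    ```sql
    -- Show which sort the session currently uses.
    SELECT value FROM nls_session_parameters WHERE parameter = 'NLS_SORT';

    ALTER SESSION SET nls_sort = 'BINARY';

    -- With BINARY, [A-Z] covers only the uppercase ASCII letters.
    SELECT REGEXP_INSTR('a', '[A-Z]') AS pos_a,        -- 0: no match
           REGEXP_INSTR('b', '[A-Z]') AS pos_b,        -- 0: no match
           REGEXP_INSTR('B', '[A-Z]') AS pos_upper_b   -- 1: match at position 1
    FROM dual;
    ```

    The ALTER SESSION change only lasts for the current session, so other sessions keep their own NLS_SORT value.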

    See the online demo.
