I want to use `REGEXP_INSTR()` within an Oracle database to check for lower/uppercase characters. I'm aware of `[:upper:]` and `[:lower:]`
POSI
Okay, the answer that NLS_SORT causes this behavior is correct, but I don't think it explains it in an understandable way. None of the documentation I found actually does that...
You have to imagine that a character range such as `[a-z]` is actually cut out of a single base string that contains all possible characters, sorted according to `NLS_SORT`.
Let's assume the whole alphabet consists only of alphanumeric characters. Sorted by `BINARY`, that is by character code, this results (in an ASCII-based character set) in a base string like `0123456789ABCDE...XYZabcdefgh...xyz`.
Derived from this, `[0-6]` expands to `[0123456]`, `[a-f]` to `[abcdef]`, `[5-b]` to `[56789ABC...XYZab]`, etc.
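The expansion can be mimicked outside the database. This is only a minimal Python sketch of the mental model, not of Oracle's implementation: `expand_range` is a hypothetical helper, and sorting by code point stands in for a binary sort, which places digits first, then uppercase, then lowercase letters.

```python
import string


def expand_range(base: str, lo: str, hi: str) -> str:
    """Return the slice of `base` from `lo` through `hi`, inclusive."""
    start, end = base.index(lo), base.index(hi)
    if start > end:
        raise ValueError(f"range [{lo}-{hi}] is out of order for this sort")
    return base[start:end + 1]


# Assumption: the "alphabet" is only ASCII digits and letters,
# and code-point order plays the role of NLS_SORT = BINARY.
BINARY_BASE = "".join(sorted(string.digits + string.ascii_letters))

print(BINARY_BASE[:13])                      # 0123456789ABC
print(expand_range(BINARY_BASE, "0", "6"))   # 0123456
print(expand_range(BINARY_BASE, "a", "f"))   # abcdef
```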
Sorting by a `linguistic_definition`, however, results in a different base string, like `0123456789aAbBcCdDeEfF...xXyYzZ`.
Derived from this, `[0-6]` still expands to `[0123456]`, but `[a-f]` now expands to `[aAbBcCdDeEf]` and `[5-b]` to `[56789aAb]`, etc.
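To see how the same range changes once the sort order changes, here is a small Python sketch. The sort key is an assumption chosen to reproduce the `aAbB...` ordering described above; it is not how any real Oracle linguistic definition is specified, and `expand_range` is a hypothetical helper.

```python
import string


def expand_range(base: str, lo: str, hi: str) -> str:
    """Return the slice of `base` from `lo` through `hi`, inclusive."""
    return base[base.index(lo):base.index(hi) + 1]


# Hypothetical "linguistic" order: compare letters case-insensitively first,
# then put the lowercase form before the uppercase form.
LINGUISTIC_BASE = "".join(
    sorted(string.digits + string.ascii_letters,
           key=lambda c: (c.lower(), c.isupper()))
)

print(LINGUISTIC_BASE[:16])                      # 0123456789aAbBcC
print(expand_range(LINGUISTIC_BASE, "0", "6"))   # 0123456
print(expand_range(LINGUISTIC_BASE, "a", "f"))   # aAbBcCdDeEf
print(expand_range(LINGUISTIC_BASE, "5", "b"))   # 56789aAb
```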
This is why `a` did not match `[A-Z]`, but `b` did. `[A-Z]` actually expands to `[AbBcC...yYzZ]`, which includes `z` but not `a`.
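The membership test can be sketched the same way. This Python snippet assumes the simplified `aAbB...zZ` base string from the explanation; `in_range` is a hypothetical helper that only compares positions in that string and does not call Oracle.

```python
import string

# Assumed base string: digits, then each letter as a lowercase/uppercase pair.
LINGUISTIC_BASE = string.digits + "".join(
    lower + upper
    for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase)
)


def in_range(ch: str, lo: str, hi: str, base: str = LINGUISTIC_BASE) -> bool:
    """True if `ch` sits between `lo` and `hi` (inclusive) in `base`."""
    return base.index(lo) <= base.index(ch) <= base.index(hi)


for ch in "abzAZ":
    print(ch, in_range(ch, "A", "Z"))
# a False
# b True
# z True
# A True
# Z True
```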
In reality, the base string contains more characters, such as accented letters (something like `aAàáâÀÁÂ...`), so `[A-Z]` might match even more characters than shown here.