I've run into an unusual situation that I couldn't understand, and the documentation of the functions involved sheds no light on it.
The function "nlssort()" returns a RAW value with an extra 00 byte appended to the binary representation of the original string.
Testing:
select NLSSORT('abc') from dual
Output:
61626300
This problem can be resolved by removing the last two hex digits (the trailing 00 byte) from NLSSORT's return value.
Solution:
-- Strip the trailing 00 byte (last two hex digits) before converting back to text
select a, length(a), b, length(b)
from ( select 'FGHJTÓRYO DE YHJKS DA DGHQÇÃA DE ASGA XCVBGL EASDEÔNASD' a,
              replace(
                utl_raw.cast_to_varchar2(
                  substr(nlssort('FGHJTÓRYO DE YHJKS DA DGHQÇÃA DE ASGA XCVBGL EASDEÔNASD', 'nls_sort=binary_ai'), 1,
                         length(nlssort('FGHJTÓRYO DE YHJKS DA DGHQÇÃA DE ASGA XCVBGL EASDEÔNASD', 'nls_sort=binary_ai')) - 2
                  )
                ),
                ' ', '_') b
       from dual
     )
1) Oracle distinguishes lengths in bytes from lengths in characters: with the default byte length semantics, varchar2(55) means 55 bytes, so 55 UTF-8 characters fit only if you are lucky. You should declare your field as varchar2(55 char).
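To make the byte/character distinction concrete, here is a minimal sketch. The table name accent_demo and its column names are hypothetical, and the query assumes a multi-byte database character set such as AL32UTF8; in a single-byte character set the two lengths would be equal.

```sql
-- Hypothetical table, for illustration only
create table accent_demo (
  s_bytes varchar2(55 byte),  -- holds at most 55 bytes
  s_chars varchar2(55 char)   -- holds at most 55 characters
);

-- LENGTH counts characters, LENGTHB counts bytes;
-- accented letters may take more than one byte each
select length('DGHQÇÃA')  as len_chars,
       lengthb('DGHQÇÃA') as len_bytes
from dual;
```

If len_bytes comes out larger than len_chars, a string of 55 such characters will not necessarily fit in the s_bytes column.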
2) Contortions like
replace(utl_raw.cast_to_varchar2(nlssort(
'FGHJTÓRYO DE YHJKS DA DGHQÇÃA DE ASGA XCVBGL EASDEÔNASD',
'nls_sort=binary_ai')),' ','_') b
are nonsense: you are merely replacing strings with somewhat similar ones. Your database has an encoding, and all strings are represented in that encoding, which determines their length in bytes. The arbitrary variations mcalmeida explains introduce random, data-dependent noise, which is never a good thing if you are making comparisons.
3) Regarding the stated task of removing accents, you should do it yourself with REPLACE, TRANSLATE etc. because only you know your requirements; it isn't Unicode normalization or anything "standard", there are no shortcuts. You can define a function and call it from any query and any PL/SQL program, without ugly copying and pasting.
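As a rough sketch of point 3, a function along these lines could be defined once and reused. The name strip_accents and the particular set of accented letters are assumptions for illustration; the mapping has to be adjusted to your own requirements.

```sql
-- Hypothetical helper: maps a fixed list of accented letters to their
-- unaccented counterparts; characters not listed pass through unchanged.
create or replace function strip_accents(p_text in varchar2)
  return varchar2
  deterministic
is
begin
  -- TRANSLATE replaces the n-th character of the second argument
  -- with the n-th character of the third argument
  return translate(p_text,
    'ÁÀÂÃÄÉÈÊËÍÌÎÏÓÒÔÕÖÚÙÛÜÇáàâãäéèêëíìîïóòôõöúùûüç',
    'AAAAAEEEEIIIIOOOOOUUUUCaaaaaeeeeiiiiooooouuuuc');
end strip_accents;
/

-- Usage: remove accents first, then replace spaces with underscores
select replace(
         strip_accents('FGHJTÓRYO DE YHJKS DA DGHQÇÃA DE ASGA XCVBGL EASDEÔNASD'),
         ' ', '_') as b
from dual;
```

Because TRANSLATE maps characters one to one, the result has the same number of characters as the input, which is not something NLSSORT promises.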
The 'nlssort' function's documentation does not state that the output string will be a normalization of the input string, or that they will have the same length. The purpose of the function is to return data that can be used to sort the input string.
See http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions113.htm#SQLRF51561
It is tempting to use it to normalize your string since apparently it works, but you are gambling here...
Heck, it could even yield a LENGTH(b)=200 and still be doing what it is supposed to do :)
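One way to see this for yourself is to compare the sort keys produced under two different sort settings. This is only a sketch: GENERIC_M is used here as an example of a linguistic sort, and the exact keys and lengths depend on your Oracle version and settings.

```sql
-- Compare the sort key and its length in bytes under a binary sort
-- and under a linguistic sort for the same input string
select nlssort('abc', 'nls_sort=binary')                     as key_binary,
       utl_raw.length(nlssort('abc', 'nls_sort=binary'))     as len_binary,
       nlssort('abc', 'nls_sort=generic_m')                  as key_generic_m,
       utl_raw.length(nlssort('abc', 'nls_sort=generic_m'))  as len_generic_m
from dual;
```

If the two lengths differ, that illustrates that the key is an internal sorting aid rather than a transformed copy of the input.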