问题
I have a database which stores usernames only in English at the moment.
I would like to incorporate BASE64
& UTF-8
in order to store in other languages as well; I want to store it in a column of type NVARCHAR2
.
The database procedure receives the name in BASE64
, I'm decoding it via UTL_ENCODE.BASE64_DECODE
& converting the string to VARCHAR2
using UTL_RAW.CAST_TO_VARCHAR2
. But I get gibberish back and not the actual word.
For example I get 'алекс' as the name in BASE64
. I'm able to decode it but the cast to VARCHAR2/NVARCHAR2
does not return the value: I get only gibberish.
I'm running on Oracle 12c
using NLS_CHARACTERSET WE8ISO8859P1
Here is the code I use to decode:
DECLARE
lv_OrgUserName VARCHAR2(2000);
lv_encodedUserName VARCHAR2(2000);
lv_UserName VARCHAR2(2000);
BEGIN
lv_OrgUserName := 'алекс';
lv_encodedUserName := UTL_RAW.CAST_TO_VARCHAR2(UTL_ENCODE.BASE64_ENCODE(UTL_RAW.CAST_TO_RAW(lv_OrgUserName)));
DBMS_OUTPUT.PUT_LINE (lv_encodedUserName);
lv_UserName := UTL_RAW.CAST_TO_VARCHAR2(UTL_ENCODE.BASE64_DECODE(UTL_RAW.CAST_TO_RAW (lv_encodedUserName)));
DBMS_OUTPUT.PUT_LINE (lv_UserName);
END;
How can I overcome this?
回答1:
First and foremost WE8ISO8859P1 (Western European 8-bit ISO 8859 Part 1, or - ISO8859 Part 1) does not support cyryllic characters:
see this link: https://en.wikipedia.org/wiki/ISO/IEC_8859-1
Therefore if you try to store a string like алекс
into VARCHAR2 variable/column, you will always get a????
as an outcome.
Probably during the database installation someone has not considered cyryllic characters and has choosen a bad codepage.
A better option is ISO/IEC 8859-5 (part 5), see this link: https://en.wikipedia.org/wiki/ISO/IEC_8859-5
One option is to change this encoding - but this is not easy and it is beyound of this question.
What you can do is to strictly use NVARCHAR2 datatype instead of VARCHAR2 datatype in all places of your application that must support cyrillic characters.
There are still some pitfalls though you need to be aware of:
- You cannot use
DBMS_OUTPUT
package to debug your code, because this package support only VARCHAR2 datatype, it doesn't support NVARCHAR - you must use
N'some string'
literals (with N prefix) in all literals -->'алекс'
is of VARCHAR2 datatype and it is always automatically converted to'a????'
in your encoding, whilen'алекс'
is of NVARCHAR2 datatype and such conversion doesn't occur.
The below code is tested on version 12c, I am using EE8MSWIN1250
code page (it also desn't support cyrillic characters):
select * from nls_database_parameters
where parameter like '%CHARACTERSET%';
PARAMETER VALUE
----------------------- ------------
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_CHARACTERSET EE8MSWIN1250
please give it a try:
CREATE OR REPLACE PACKAGE my_base64 AS
FUNCTION BASE64_ENCODE( str nvarchar2 ) RETURN varchar2;
FUNCTION BASE64_DECODE( str varchar2 ) RETURN nvarchar2;
END;
/
CREATE OR REPLACE PACKAGE BODY my_base64 AS
FUNCTION BASE64_ENCODE( str nvarchar2 ) RETURN varchar2
IS
lv_encodedUserName VARCHAR2(2000);
BEGIN
lv_encodedUserName := UTL_RAW.CAST_TO_VARCHAR2(UTL_ENCODE.BASE64_ENCODE(UTL_RAW.CAST_TO_RAW(str)));
RETURN lv_encodedUserName;
END;
FUNCTION BASE64_DECODE( str varchar2 ) RETURN nvarchar2
IS
lv_UserName nVARCHAR2(2000);
BEGIN
lv_UserName := UTL_RAW.CAST_TO_nVARCHAR2(UTL_ENCODE.BASE64_DECODE(UTL_RAW.CAST_TO_RAW (str)));
RETURN lv_UserName;
END;
END;
/
A few examples:
select 'aлекс' As A, n'aлекс' As B from dual;
A B
----- -----
a???? aлекс
select my_base64.BASE64_ENCODE( n'аaaлекс' ) As aleks from dual;
ALEKS
--------------------------------------------------------------------------------
BDAAYQBhBDsENQQ6BEE=
select my_base64.BASE64_DECODE( 'BDAAYQBhBDsENQQ6BEE=' ) as aleks from dual;
ALEKS
--------------------------------------------------------------------------------
аaaлекс
select my_base64.BASE64_DECODE( my_base64.BASE64_ENCODE( n'аaaлекс' ) ) as Aleks from dual;
ALEKS
--------------------------------------------------------------------------------
аaaлекс
来源:https://stackoverflow.com/questions/37907398/plsql-decode-nvarchar2-from-base64-to-utf-8