Question
I am trying to write a query that finds all non-UTF-8 encoded characters in a table without naming specific columns. My approach is to compare each column's character length with its byte length. %1 is a parameter holding the name of the table I want to check. I join to user_tab_columns to get each COLUMN_NAME, and I then want to use those column names to filter down to only the rows that have bad UTF-8 data (where the length of a column is not equal to its byte length). Below is what I have come up with, but it does not work. Can somebody help me tweak this query to get the desired results?
SELECT
user_tab_columns.TABLE_NAME,
user_tab_columns.COLUMN_NAME AS ColumnName,
a.*
FROM %1 a
JOIN user_tab_columns
ON UPPER(user_tab_columns.TABLE_NAME) = UPPER('%1')
WHERE (SELECT * FROM %1 WHERE LENGTH(a.ColumnName) != LENGTHB(a.ColumnName))
Answer 1:
In your query, LENGTH(a.ColumnName) would represent the length of the column name, not the contents of that column. You can't use a value from one table as the column name in another table in static SQL.
Here's a simple demonstration of using dynamic SQL in an anonymous block to report which columns contain any multibyte characters, which is what comparing length with lengthb will tell you (that was discussed in the comments, so I'm not rehashing it here):
set serveroutput on size unlimited
declare
  sql_str varchar2(256);
  flag pls_integer;
begin
  for rec in (
    select utc.table_name, utc.column_name
    from user_tab_columns utc
    where utc.table_name = <your table name or argument>
    and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
    order by utc.column_id
  ) loop
    sql_str := 'select nvl(max(1), 0) from "' || rec.table_name || '" '
      || 'where length("' || rec.column_name || '") '
      || '!= lengthb("' || rec.column_name || '") and rownum = 1';
    -- just for debugging, to see the generated query
    dbms_output.put_line(sql_str);
    execute immediate sql_str into flag;
    -- also for debugging
    dbms_output.put_line (rec.table_name || '.' || rec.column_name
      || ' flag: ' || flag);
    if flag = 1 then
      dbms_output.put_line(rec.table_name || '.' || rec.column_name
        || ' contains multibyte characters');
    end if;
  end loop;
end;
/
This uses a cursor loop to get the column names - I've included the table name too in case you want to wild-card or remove the filter - and inside that loop constructs a dynamic SQL statement, executes it into a variable, and then checks that variable. I've left some debugging output in to see what's happening. With a dummy table created as:
create table t42 (x varchar2(20), y varchar2(20));
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'multibyte ' || unistr('\00FF'));
running that block gets the output:
anonymous block completed
select nvl(max(1), 0) from "T42" where length("X") != lengthb("X") and rownum = 1
T42.X flag: 0
select nvl(max(1), 0) from "T42" where length("Y") != lengthb("Y") and rownum = 1
T42.Y flag: 1
T42.Y contains multibyte characters
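The length versus lengthb comparison that the block relies on can also be checked in isolation. This is a minimal sketch, assuming a database character set that is multibyte (for example AL32UTF8); the column aliases are only illustrative. to_char is used so that the unistr result is converted to the database character set before it is measured, which mirrors what happens when the value is stored in a VARCHAR2 column.

```sql
-- Compare character count and byte count for a single-byte
-- and a (potentially) multibyte value.
select 'a' as val,
       length('a') as char_count,
       lengthb('a') as byte_count
from dual
union all
select to_char(unistr('\00FF')),
       length(to_char(unistr('\00FF'))),
       lengthb(to_char(unistr('\00FF')))
from dual;
```

In a multibyte character set the second row should show a byte count larger than its character count, which is exactly the condition the dynamic query tests for. In a single-byte character set the two counts match for every value, so the check finds nothing.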
To display the actual multibyte-containing values you could use a dynamic loop over the selected values:
set serveroutput on size unlimited
declare
  sql_str varchar2(256);
  curs sys_refcursor;
  val_str varchar2(4000);
begin
  for rec in (
    select utc.table_name, utc.column_name
    from user_tab_columns utc
    where utc.table_name = 'T42'
    and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
    order by utc.column_id
  ) loop
    sql_str := 'select "' || rec.column_name || '" '
      || 'from "' || rec.table_name || '" '
      || 'where length("' || rec.column_name || '") '
      || '!= lengthb("' || rec.column_name || '")';
    -- just for debugging, to see the generated query
    dbms_output.put_line(sql_str);
    open curs for sql_str;
    loop
      fetch curs into val_str;
      exit when curs%notfound;
      dbms_output.put_line (rec.table_name || '.' || rec.column_name
        || ': ' || val_str);
    end loop;
    -- release the cursor before the next column's query is opened
    close curs;
  end loop;
end;
/
Which with the same table gets:
anonymous block completed
select "X" from "T42" where length("X") != lengthb("X")
select "Y" from "T42" where length("Y") != lengthb("Y")
T42.Y: multibyte ÿ
That is only a starting point; it would need some tweaking if you have CLOB, NVARCHAR2 or NCLOB values. For example, you could have one local variable of each type, include the data type in the outer cursor query, and fetch into the appropriate local variable.
Source: https://stackoverflow.com/questions/28866626/oracle-sql-dynamic-query-non-utf-8-characters
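The adjustment described above (one local variable per type, with the data type driving the fetch) might look roughly like the following. This is a sketch under stated assumptions, not a tested implementation: the variable names val_vc and val_nvc are made up for illustration, the 4000-character sizes assume the standard (non-extended) maximum string size, and the table is the T42 demo table from above. CLOB and NCLOB are deliberately left out, because to my knowledge LENGTHB is only supported for LOBs in a single-byte character set, so those columns would need a different test.

```sql
set serveroutput on size unlimited
declare
  sql_str varchar2(4000);
  curs    sys_refcursor;
  val_vc  varchar2(4000);   -- hypothetical name: target for VARCHAR2 columns
  val_nvc nvarchar2(4000);  -- hypothetical name: target for NVARCHAR2 columns
begin
  for rec in (
    -- the data type is selected too, so the loop can pick a fetch target
    select utc.table_name, utc.column_name, utc.data_type
    from user_tab_columns utc
    where utc.table_name = 'T42'
    and utc.data_type in ('VARCHAR2', 'NVARCHAR2')
    order by utc.column_id
  ) loop
    sql_str := 'select "' || rec.column_name || '" '
      || 'from "' || rec.table_name || '" '
      || 'where length("' || rec.column_name || '") '
      || '!= lengthb("' || rec.column_name || '")';
    open curs for sql_str;
    loop
      if rec.data_type = 'NVARCHAR2' then
        fetch curs into val_nvc;
        exit when curs%notfound;
        -- convert to the database character set for display
        dbms_output.put_line(rec.table_name || '.' || rec.column_name
          || ': ' || to_char(val_nvc));
      else
        fetch curs into val_vc;
        exit when curs%notfound;
        dbms_output.put_line(rec.table_name || '.' || rec.column_name
          || ': ' || val_vc);
      end if;
    end loop;
    close curs;
  end loop;
end;
/
```

One caveat on the NVARCHAR2 branch: if the national character set is AL16UTF16, every character occupies at least two bytes, so the length/lengthb comparison would flag every non-empty value and is not a useful filter there.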