Oracle SQL Dynamic Query non UTF-8 Characters

做~自己de王妃 submitted on 2021-01-29 18:16:09

Question


I am trying to write a query that finds all non-UTF-8 encoded characters in a table without being tied to a specific column name. My approach is to look for values whose character length differs from their byte length. %1 is a parameter holding the name of the table I want to check. I join to user_tab_columns to get each COLUMN_NAME, and I then want to use those COLUMN_NAME results to show only the rows that have bad UTF-8 data (where the length of a column is not equal to its byte length). Below is what I have come up with, but it does not work. Can somebody help me tweak this query to get the desired results?

 SELECT
 user_tab_columns.TABLE_NAME,
 user_tab_columns.COLUMN_NAME AS ColumnName,
 a.*

 FROM %1 a

 JOIN user_tab_columns
 ON UPPER(user_tab_columns.TABLE_NAME) = UPPER('%1')

 WHERE (SELECT * FROM %1 WHERE LENGTH(a.ColumnName) != LENGTHB(a.ColumnName))

Answer 1:


In your query LENGTH(a.ColumnName) would represent the length of the column name, not the contents of that column. You can't use a value from one table as the column name in another table in static SQL.
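
To make that concrete, here is a minimal sketch (not part of the original answer) that uses the column names from user_tab_columns to build one check statement per column as text. T42 is a placeholder table name, and the filter on VARCHAR2 is an assumption made to keep the example small. The query only produces statement text; running each generated statement still needs dynamic SQL, or a copy and paste.

```sql
-- Sketch: generate the per-column check as text, one row per column.
-- 'T42' is a placeholder; substitute your own table name.
select 'select count(*) from "' || utc.table_name || '" '
    || 'where length("' || utc.column_name || '") '
    || '!= lengthb("' || utc.column_name || '")' as check_stmt
from user_tab_columns utc
where utc.table_name = 'T42'
and utc.data_type = 'VARCHAR2'
order by utc.column_id;
```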

Here's a simple demonstration of using dynamic SQL in an anonymous block to report which columns contain any multibyte characters, which is what comparing length with lengthb will tell you (that was discussed in the comments, so I won't rehash it here):

set serveroutput on size unlimited
declare
  sql_str varchar2(1000); -- room for long quoted identifiers
  flag pls_integer;
begin
  for rec in (
    select utc.table_name, utc.column_name
    from user_tab_columns utc
    where utc.table_name = <your table name or argument>
    and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
    order by utc.column_id
  ) loop

    sql_str := 'select nvl(max(1), 0) from "' || rec.table_name || '" '
      || 'where length("' || rec.column_name || '") '
      || '!= lengthb("' || rec.column_name || '") and rownum = 1';

    -- just for debugging, to see the generated query
    dbms_output.put_line(sql_str);

    execute immediate sql_str into flag;

    -- also for debugging
    dbms_output.put_line (rec.table_name || '.' || rec.column_name
      || ' flag: ' || flag);

    if flag = 1 then
      dbms_output.put_line(rec.table_name || '.' || rec.column_name
        || ' contains multibyte characters');
    end if;
  end loop;
end;
/

This uses a cursor loop to get the column names - I've included the table name too in case you want to wild-card or remove the filter - and inside that loop constructs a dynamic SQL statement, executes it into a variable, and then checks that variable. I've left some debugging output in to see what's happening. With a dummy table created as:

create table t42 (x varchar2(20), y varchar2(20));
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'multibyte ' || unistr('\00FF'));

running that block gets the output:

anonymous block completed
select nvl(max(1), 0) from "T42" where length("X") != lengthb("X") and rownum = 1
T42.X flag: 0
select nvl(max(1), 0) from "T42" where length("Y") != lengthb("Y") and rownum = 1
T42.Y flag: 1
T42.Y contains multibyte characters

To display the actual multibyte-containing values you could use a dynamic loop over the selected values:

set serveroutput on size unlimited
declare
  sql_str varchar2(1000); -- room for long quoted identifiers
  curs sys_refcursor;
  val_str varchar2(4000);
begin
  for rec in (
    select utc.table_name, utc.column_name
    from user_tab_columns utc
    where utc.table_name = 'T42'
    and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
    order by utc.column_id
  ) loop

    sql_str := 'select "' || rec.column_name || '" '
      || 'from "' || rec.table_name || '" '
      || 'where length("' || rec.column_name || '") '
      || '!= lengthb("' || rec.column_name || '")';

    -- just for debugging, to see the generated query
    dbms_output.put_line(sql_str);

    open curs for sql_str;
    loop
      fetch curs into val_str;
      exit when curs%notfound;

      dbms_output.put_line (rec.table_name || '.' || rec.column_name
        || ': ' || val_str);
    end loop;
    close curs;
  end loop;
end;
/

With the same table, that gets:

anonymous block completed
select "X" from "T42" where length("X") != lengthb("X")
select "Y" from "T42" where length("Y") != lengthb("Y")
T42.Y: multibyte ÿ
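
To see why a particular value was reported, a query along these lines (a sketch against the T42 test table above, not part of the original answer) shows the character length, the byte length and the stored bytes side by side. DUMP with format 1016 returns the bytes in hexadecimal together with the character set name; the column aliases are made up for illustration.

```sql
-- Sketch: inspect the values in T42.Y that the check flags.
select y,
  length(y) as char_len,
  lengthb(y) as byte_len,
  dump(y, 1016) as stored_bytes
from t42
where length(y) != lengthb(y);
```

In an AL32UTF8 database the ÿ in the test row should be stored as two bytes, so byte_len would be one more than char_len. The exact output depends on your database character set.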

This is only a starting point; it would need some tweaking if you have CLOB, NVARCHAR2 or NCLOB values. For example, you could declare one local variable of each type, include the data type in the outer cursor query, and fetch into the appropriate local variable.
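
A rough sketch of that tweak follows, under some assumptions that are mine rather than the original answer's. It handles only VARCHAR2 and NVARCHAR2; CLOB and NCLOB are left out because, as far as I know, LENGTHB is documented as supported only for single-byte LOBs. The variable name val_nstr and its size are illustrative. Also bear in mind that with the AL16UTF16 national character set each character takes at least two bytes, so the length/lengthb comparison would likely flag every non-empty NVARCHAR2 value.

```sql
set serveroutput on size unlimited
declare
  sql_str varchar2(1000);
  curs sys_refcursor;
  val_str varchar2(4000);
  val_nstr nvarchar2(4000); -- illustrative size
begin
  for rec in (
    select utc.table_name, utc.column_name, utc.data_type
    from user_tab_columns utc
    where utc.table_name = 'T42'
    and utc.data_type in ('VARCHAR2', 'NVARCHAR2')
    order by utc.column_id
  ) loop

    sql_str := 'select "' || rec.column_name || '" '
      || 'from "' || rec.table_name || '" '
      || 'where length("' || rec.column_name || '") '
      || '!= lengthb("' || rec.column_name || '")';

    open curs for sql_str;
    loop
      -- fetch into the variable that matches the column's data type
      if rec.data_type = 'NVARCHAR2' then
        fetch curs into val_nstr;
        exit when curs%notfound;
        dbms_output.put_line(rec.table_name || '.' || rec.column_name
          || ': ' || to_char(val_nstr));
      else
        fetch curs into val_str;
        exit when curs%notfound;
        dbms_output.put_line(rec.table_name || '.' || rec.column_name
          || ': ' || val_str);
      end if;
    end loop;
    close curs;
  end loop;
end;
/
```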



Source: https://stackoverflow.com/questions/28866626/oracle-sql-dynamic-query-non-utf-8-characters
