问题
My question is: Is there a UTL_MATCH-like function which works with a CLOB
rather than a VARCHAR2
?
My specific problem is: I'm on an Oracle database. I have a bunch of pre-written queries which interface with Domo CenterView. The queries have variables in them defined by ${variableName}
. I need to rewrite these queries. I didn't write the original so instead of figuring out what a good value for the variables should be I want to run the queries with the application and get what the query was from V$SQL.
So my solution is: Do a UTL_MATCH
on the queries with the variable stuff in it and V$SQL.SQL_FULLTEXT
. However, UTL_MATCH
is limited to VARCHAR2
and the datatype of V$SQL.SQL_FULLTEXT
is CLOB
. So, this is why I'm looking for a UTL_MATCH
-like function which works with a CLOB
datatype.
Any other tips of how to accomplish this are welcome. Thanks!
Edit, about the tips. If you have a better idea of how to do this, let me just tell you some information I've got at my disposal. I have about 100 queries, they're all in an excel spreadsheet (the ones with the ${variableName}
in them). So I could pretty easily use excel to write a query for me. I'm hoping to just union all those queries together and copy the output to another sheet. Anyway, maybe that's helpful if you're thinking there's a better way to do this.
An example: Let's say I have the following query from Domo:
select department.dept_name
from department
where department.id = '${selectedDepartmentId}'
;
I want to call something like this:
select v.sql_fulltext
from v$sql v
where utl_match.jaro_winkler_similarity(v.sql_fulltext,
'select department.dept_name
from department
where department.id = ''${selectedDepartmentId}''') > 90
;
And get something like this in return:
SQL_FULLTEXT
------------------------------------------
select department.dept_name
from department
where department.id = '154'
What I've tried:
I tried substringing the clob and casting it to a varchar. I was really hopeful this would work, but it gives me an error. Here's the code:
select v.sql_fulltext
from v$sql v
where utl_match.jaro_winkler_similarity( cast( substr (v.sql_fulltext, 0, 4000) as varchar2 (4000)),
'select department.dept_name
from department
where department.id = ''${selectedDepartmentId}''') > 90
;
And here's the error:
ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 8000, maximum: 4000)
However, if I run this it works fine:
select cast(substr(v.sql_fulltext, 0, 4000) as varchar2 (4000))
from v$sql v
;
So I'm not sure what the problem is with casting the substring...
回答1:
UTL_MATCH is a packaging for comparing strings with regards for checking how similar two strings are. Its functions evaluate strings and return scores. So all you're going to get is a number indicating (say) how many edits you need to turn ${variableName}
into "Farmville" or "StackOveflow".
What you won't get is the actual differences: these two strings of text are identical except at offset 123 where it replaces ${variableName}
with "Farmville".
Putting it like that suggests an alternative approach. Using INSTR() and SUBSTR() to locate instances of ${variableName}
in your Domo CenterView queries and use those offsets to identify the different text in the v$sql.fulltext
equivalents. You can do this with CLOB in PL/SQL with the DBMS_LOB package.
回答2:
If the text you want to search has length <= 32767, then you can just convert the CLOB
to VARCHAR2
using DBMS_LOB.SUBSTR
:
select v.sql_fulltext
from v$sql v
where utl_match.jaro_winkler_similarity(dbms_lob.substr(v.sql_fulltext), 'select department.dept_name from department where department.id = ''${selectedDepartmentId}''') > 90 ;
回答3:
I ended up creating a custom function for it. Here's the code:
CREATE OR REPLACE function match_clob(clob_1 clob, clob_2 clob) return number as
similar number := 0;
sec_similar number := 0;
sections number := 0;
max_length number := 3949;
length_1 number;
length_2 number;
vchar_1 varchar2 (3950);
vchar_2 varchar2 (3950);
begin
length_1 := length(clob_1);
length_2 := length(clob_2);
--dbms_output.put_line('length_1: '||length_1);
--dbms_output.put_line('length_2: '||length_2);
IF length_1 > max_length or length_2 > max_length THEN
FOR x IN 1 .. ceil(length_1 / max_length) LOOP
--dbms_output.put_line('((x-1)*max_length) + 1'||(x-1)||' * '||max_length||' = '||(((x-1)*max_length) + 1));
vchar_1 := substr(clob_1, ((x-1)*max_length) + 1, max_length);
vchar_2 := substr(clob_2, ((x-1)*max_length) + 1, max_length);
-- dbms_output.put_line('Section '||sections||' vchar_1: '||vchar_1||' ==> vchar_2: '||vchar_2);
sec_similar := UTL_MATCH.JARO_WINKLER_SIMILARITY(vchar_1, vchar_2);
--dbms_output.put_line('sec_similar: '||sec_similar);
similar := similar + sec_similar;
sections := sections + 1;
END LOOP;
--dbms_output.put_line('Similar: '||similar||' ==> Sections: '||sections);
similar := similar / sections;
ELSE
similar := UTL_MATCH.JARO_WINKLER_SIMILARITY(clob_1,clob_2);
END IF;
--dbms_output.put_line('Overall Similar: '||similar);
return(similar);
end;
/
来源:https://stackoverflow.com/questions/10703609/utl-match-like-function-to-work-with-clob