I have a millions of string record like this one with 310 types of them that have different format to get sequence,year,month and day from..
the script will get the seq
You don't want to be looking at dual
at all here; certainly not attempting to insert. You need to track the highest and lowest values you've seen as you iterate through the loop. based on some of the elements of ename
representing dates I'm pretty sure you want all your matches to be 0-9
, not 1-9
. You're also referring to the cursor name as you access its fields, instead of the record variable name:
FOR List_ENAME_rec IN List_ENAME_cur loop
if REGEXP_LIKE(List_ENAME_rec.ENAME,'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]') then
V_seq := substr(List_ENAME_rec.ename,5,4);
V_Year := substr(List_ENAME_rec.ename,10,2);
V_Month := substr(List_ENAME_rec.ename,13,2);
V_day := substr(List_ENAME_rec.ename,16,2);
if min_seq is null or V_seq < min_seq then
min_seq := v_seq;
end if;
if max_seq is null or V_seq > max_seq then
max_seq := v_seq;
end if;
end if;
end loop;
With values in the table of emp-1111_14_01_01_1111_G1
and emp-1115_14_02_02_1111_G1
, that reports max_seq 1115 min_seq 1111
.
If you really wanted to involve dual you could do this inside the loop, instead of the if/then/assign pattern, but it's not necessary:
select least(min_seq, v_seq), greatest(max_seq, v_seq)
into min_seq, max_seq
from dual;
I have no idea what the procedure is going to do; there seems to be no relationship between whatever you've got in test1
and the values you're finding.
You don't need any PL/SQL for this though. You can get the min/max values from a simple query:
select min(to_number(substr(ename, 5, 4))) as min_seq,
max(to_number(substr(ename, 5, 4))) as max_seq
from table1
where status = 2
and regexp_like(ename,
'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]')
MIN_SEQ MAX_SEQ
---------- ----------
1111 1115
And you can use those to generate a list of all values in that range:
with t as (
select min(to_number(substr(ename, 5, 4))) as min_seq,
max(to_number(substr(ename, 5, 4))) as max_seq
from table1
where status = 2
and regexp_like(ename,
'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]')
)
select min_seq + level - 1 as seq
from t
connect by level <= (max_seq - min_seq) + 1;
SEQ
----------
1111
1112
1113
1114
1115
And a slightly different common table expression to see which of those don't exist in your table, which I think is what you're after:
with t as (
select to_number(substr(ename, 5, 4)) as seq
from table1
where status = 2
and regexp_like(ename,
'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]')
),
u as (
select min(seq) as min_seq,
max(seq) as max_seq
from t
),
v as (
select min_seq + level - 1 as seq
from u
connect by level <= (max_seq - min_seq) + 1
)
select v.seq as missing_seq
from v
left join t on t.seq = v.seq
where t.seq is null
order by v.seq;
MISSING_SEQ
-----------
1112
1113
1114
or if you prefer:
...
select v.seq as missing_seq
from v
where not exists (select 1 from t where t.seq = v.seq)
order by v.seq;
SQL Fiddle.
Based on comments I think you want the missing values for the sequence for each combination of the other elements of the ID (YY_MM_DD). This will give you that breakdown:
with t as (
select to_number(substr(ename, 5, 4)) as seq,
substr(ename, 10, 2) as yy,
substr(ename, 13, 2) as mm,
substr(ename, 16, 2) as dd
from table1
where status = 2
and regexp_like(ename,
'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]')
),
r (yy, mm, dd, seq, max_seq) as (
select yy, mm, dd, min(seq), max(seq)
from t
group by yy, mm, dd
union all
select yy, mm, dd, seq + 1, max_seq
from r
where seq + 1 <= max_seq
)
select yy, mm, dd, seq as missing_seq
from r
where not exists (
select 1 from t
where t.yy = r.yy
and t.mm = r.mm
and t.dd = r.dd
and t.seq = r.seq
)
order by yy, mm, dd, seq;
With output like:
YY MM DD MISSING_SEQ
---- ---- ---- -------------
14 01 01 1112
14 01 01 1113
14 01 01 1114
14 02 02 1118
14 02 02 1120
14 02 03 1127
14 02 03 1128
SQL Fiddle.
If you want to look for a particular date you cold filter that (either in t
, or the first branch in r
), but you could also change the regex pattern to include the fixed values; so to look for 14 06
the pattern would be 'emp[-][0-9]{4}_14_06_[0-9]{2}[_][0-9]{4}[_][G][1]'
, for example. That's harder to generalise though, so a filter (where t.yy = '14' and t.mm = '06'
might be more flexible.
If you insist in having this in a procedure, you can make the date elements optional and modify the regex pattern:
create or replace procedure show_missing_seqs(yy in varchar2 default '[0-9]{2}',
mm in varchar2 default '[0-9]{2}', dd in varchar2 default '[0-9]{2}') as
pattern varchar2(80);
cursor cur (pattern varchar2) is
with t as (
select to_number(substr(ename, 5, 4)) as seq,
substr(ename, 10, 2) as yy,
substr(ename, 13, 2) as mm,
substr(ename, 16, 2) as dd
from table1
where status = 2
and regexp_like(ename, pattern)
),
r (yy, mm, dd, seq, max_seq) as (
select yy, mm, dd, min(seq), max(seq)
from t
group by yy, mm, dd
union all
select yy, mm, dd, seq + 1, max_seq
from r
where seq + 1 <= max_seq
)
select yy, mm, dd, seq as missing_seq
from r
where not exists (
select 1 from t
where t.yy = r.yy
and t.mm = r.mm
and t.dd = r.dd
and t.seq = r.seq
)
order by yy, mm, dd, seq;
begin
pattern := 'emp[-][0-9]{4}[_]'
|| yy || '[_]' || mm || '[_]' || dd
|| '[_][0-9]{4}[_][G][1]';
for rec in cur(pattern) loop
dbms_output.put_line(to_char(rec.missing_seq, 'FM0000'));
end loop;
end show_missing_seqs;
/
I don't know why you insist it has to be done like this or why you want to use dbms_output
as you're relying on the client/caller displaying that; what will your job do with the output? You could make this return a sys_refcursor
which would be more flexible. but anyway, you can call it like this from SQL*Plus/SQL Developer:
set serveroutput on
exec show_missing_seqs(yy => '14', mm => '01');
anonymous block completed
1112
1113
1114