Question
I have a dimension table whose data has an integrity issue. Because it is a dimension table, EFFECTIVE_DT_FROM, EFFECTIVE_DT_TO and VERSION must be maintained correctly.
This is the table and sample data:
create table TEST
(
    LOC_SID            NUMBER(38,0),
    POSTAL_CD          VARCHAR2(15 BYTE),
    COUNTRY_CD_2CHAR   VARCHAR2(2 BYTE),
    CITY               VARCHAR2(180 BYTE),
    DISTRICT_CD        VARCHAR2(120 BYTE),
    POPULATION_APPROX  VARCHAR2(15 BYTE),
    EFFECTIVE_DT_FROM  DATE,
    EFFECTIVE_DT_TO    DATE,
    VERSION            NUMBER(38,0)
);
Sample data:
INSERT INTO TEST VALUES (81910, '1234', 'UK', 'Liverpool', '50', '1000', to_date('01.01.1900 00:00:00', 'DD.MM.YYYY HH24:MI:SS'), to_date('31.12.2199 23:59:59', 'DD.MM.YYYY HH24:MI:SS'), 1);
INSERT INTO TEST VALUES (81911, '1234', 'UK', 'Liverpool', '50', '0', to_date('01.01.1900 00:00:00', 'DD.MM.YYYY HH24:MI:SS'), to_date('31.12.2199 23:59:59', 'DD.MM.YYYY HH24:MI:SS'), 1);
INSERT INTO TEST VALUES (81912, '4567', 'UK', 'Liverpool', '50', '2000', to_date('01.01.1900 00:00:00', 'DD.MM.YYYY HH24:MI:SS'), to_date('31.12.2199 23:59:59', 'DD.MM.YYYY HH24:MI:SS'), 1);
INSERT INTO TEST VALUES (81913, '4567', 'UK', 'Liverpool', '50', '0', to_date('01.01.1900 00:00:00', 'DD.MM.YYYY HH24:MI:SS'), to_date('31.12.2199 23:59:59', 'DD.MM.YYYY HH24:MI:SS'), 1);
Data integrity check query:
SELECT
    COUNT(*) AS RowAffected
FROM
    (SELECT
         LOC_SID,
         VERSION,
         EFFECTIVE_DT_FROM, EFFECTIVE_DT_TO,
         CITY,
         POSTAL_CD
     FROM
         (SELECT
              t.*,
              -- start date and version of the next row for the same business key
              LEAD(EFFECTIVE_DT_FROM, 1) OVER (PARTITION BY POSTAL_CD, COUNTRY_CD_2CHAR, CITY, DISTRICT_CD
                                               ORDER BY EFFECTIVE_DT_FROM) AS next_date,
              LEAD(VERSION, 1) OVER (PARTITION BY POSTAL_CD, COUNTRY_CD_2CHAR, CITY, DISTRICT_CD
                                     ORDER BY EFFECTIVE_DT_FROM) AS next_version
          FROM
              TEST t)
     WHERE
         EFFECTIVE_DT_TO != next_date OR VERSION = next_version);
Results:
CITY POSTAL_CD COUNT(*)
--------------------------------
Liverpool 1234 31
Existing data:
LOC_SID  POSTAL_CD  COUNTRY_CD_2CHAR  CITY       DISTRICT_CD  POPULATION_APPROX  EFFECTIVE_DT_FROM  EFFECTIVE_DT_TO  VERSION
81910    1234       UK                LIVERPOOL  50           1000               01.01.1900         31.12.2199       1
81921    1234       UK                LIVERPOOL  50           0                  01.01.1900         31.12.2199       1
81919    1234       UK                LIVERPOOL  50           2000               01.01.1900         31.12.2199       1
81913    1234       UK                LIVERPOOL  50           0                  01.01.1900         31.12.2199       1
Expected data:
LOC_SID  POSTAL_CD  COUNTRY_CD_2CHAR  CITY       DISTRICT_CD  POPULATION_APPROX  EFFECTIVE_DT_FROM  EFFECTIVE_DT_TO  VERSION
81910    1234       UK                LIVERPOOL  50           1000               01.01.1900         27.10.2016       1
81921    1234       UK                LIVERPOOL  50           0                  27.10.2016         28.10.2016       2
81919    1234       UK                LIVERPOOL  50           2000               01.01.1900         31.12.2199       1
81913    1234       UK                LIVERPOOL  50           0                  27.10.2016         28.10.2016       2
Please note: the table holds 131 records with city Liverpool, and only 31 of them have the data integrity issue. The example above shows just a few records for illustration (the SIDs are not in sequence).
I have tried the query below, but it updates EFFECTIVE_DT_FROM, EFFECTIVE_DT_TO and VERSION for every Liverpool row, whereas I only need to update the 31 affected records (because the SIDs are not in sequence, my query won't work).
Another point to mention: I could hardcode the SIDs (loc_sid, which is the primary key), but for that I would need to collect 62 SID keys and add them to the condition, as in where city = 'Liverpool' and loc_sid in (1234,4567......). Run that way, the query works fine, but I don't want to hardcode the keys; some logic should be used instead. The loc_sid values are not in sequence, and I was not able to figure this out.
merge into test t
using (
    select
        t.*,
        row_number() over (order by loc_sid asc) rn,
        count(*) over () cnt
    from test t
    where city = 'Liverpool'
) t1
on (t1.loc_sid = t.loc_sid)
when matched then update set
    t.version = t1.rn,
    t.effective_dt_from =
        case
            when rn = 1 then t.effective_dt_from
            else date '2016-10-27' + rn - 2
        end,
    t.effective_dt_to =
        case
            when rn = cnt then t.effective_dt_to
            else date '2016-10-27' + rn - 1
        end;
How can I resolve this?
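One way to avoid hardcoding the SIDs might be to derive them from the same business key that the integrity check partitions by. The following is a minimal sketch, not a tested fix. It assumes that POSTAL_CD, COUNTRY_CD_2CHAR, CITY and DISTRICT_CD together identify one dimension member, and that every key with more than one row needs re-versioning. The alias rows_per_key is made up for illustration.

```sql
-- Sketch: list the LOC_SIDs of every Liverpool row whose business key
-- occurs more than once, instead of typing the keys by hand.
-- Assumption: the business key is the PARTITION BY column list of the check query.
SELECT loc_sid
FROM (
    SELECT t.loc_sid,
           COUNT(*) OVER (PARTITION BY postal_cd, country_cd_2char, city, district_cd)
               AS rows_per_key   -- hypothetical alias: number of rows sharing the key
    FROM test t
    WHERE city = 'Liverpool'
)
WHERE rows_per_key > 1;
```

The result could stand in for the hardcoded list, for example as loc_sid IN (...) inside the USING query of the MERGE. Note that the check query never flags the last row of a key, because LEAD returns NULL there, so this set is expected to be larger than the 31 flagged rows. If every affected key holds exactly two rows, it would be the 62 SIDs mentioned above. The row numbering in the MERGE would probably also need a PARTITION BY on the same columns so that versions restart at 1 for each key; which column to order by within a key is not clear from the expected data.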
Source: https://stackoverflow.com/questions/61628916/data-integrity-bug-query-fix-to-rewrite-the-sql