Question
Assume I have two tables. One, called lease_period, has the following columns:
tenant_trading_name, suite_id, lease_id, building_id
The other, called lease_period_audit, has the following:
audit_date, audit_type, tenant_trading_name, suite_id, lease_id, building_id
Each time a record is updated on lease_period, an entry is made on lease_period_audit with an audit_type of 'Updated'. I'm trying to find all updates made only to the tenant_trading_name field but haven't had any success. So far I have the following:
select lpa.*
from property.lease_period_audit lpa
inner join property.lease_period lp on lpa.suite_id = lp.suite_id and lpa.lease_id = lp.lease_id and lpa.building_id = lp.building_id
where audit_type = 'Updated'
and lp.tenant_trading_name <> lpa.tenant_trading_name
order by 1 desc
Where's the flaw in my thought process here? How can this be done / how should I be thinking about this?
Answer 1:
Assuming the audit table also logs the lease_period primary key column, referred to here as lp_id for simplicity, you could try the following approach:
1. Find all the rows where audit_type is 'Updated'.
2. Rank all the rows by audit_date, partitioning them by lp_id.
3. Rank the rows by audit_date, partitioning by lp_id, suite_id, lease_id, building_id.
4. Get the difference between the two rankings.
5. Rank the rows again by audit_date, partitioning them now by lp_id, suite_id, lease_id, building_id, (ranking_difference).
6. Output all the rows where the last ranking value is 2 or greater.
The first four steps result in a row set where each group of consecutive (in ascending order of audit_date) rows with identical values of suite_id, lease_id, building_id for the same lp_id will be uniquely distinguished by a value calculated as the difference between the rankings #2 & #3.
Within the group, every row, starting from the second one, will differ from the previous one only in the value of tenant_trading_name, which is just what we need. So, we rank the rows once again, taking into account the 'group ID' we've just obtained, then return every row with the ranking of 2 or higher.
Here's an approximate implementation:
WITH marked AS (
    SELECT
        *,
        -- grp stays constant across consecutive rows whose suite_id,
        -- lease_id and building_id are unchanged for the same lp_id
        grp = ROW_NUMBER() OVER (PARTITION BY lp_id
                                 ORDER BY audit_date)
            - ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id
                                 ORDER BY audit_date)
    FROM lease_period_audit
    WHERE audit_type = 'Updated'
),
ranked AS (
    SELECT
        *,
        rnk = ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id, grp
                                 ORDER BY audit_date)
    FROM marked
)
SELECT
    audit_date,
    lp_id,
    tenant_trading_name,
    suite_id,
    lease_id,
    building_id
FROM ranked
WHERE rnk >= 2  -- every row after the first in its group
Note. This assumes that the audit table logs only real changes, i.e. there can't be two consecutive rows with the same primary key where all four columns have identical values.
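To see how the group ID behaves, here is a small runnable sketch of the steps above on made-up data. It uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), so the T-SQL `alias = expression` form is written as `expression AS alias`. The lp_id column and all sample rows are assumptions for illustration, as is the idea that each audit row stores the values after the update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lease_period_audit (
    audit_date TEXT, audit_type TEXT,
    lp_id INTEGER,  -- assumed key of the lease_period row (hypothetical)
    tenant_trading_name TEXT, suite_id INTEGER, lease_id INTEGER, building_id INTEGER
);
-- Made-up sample rows; each row holds the values after the update.
INSERT INTO lease_period_audit VALUES
    ('2012-01-01', 'Updated', 1, 'Acme',      10, 100, 1),
    ('2012-01-02', 'Updated', 1, 'Acme Ltd',  10, 100, 1),  -- name-only change
    ('2012-01-03', 'Updated', 1, 'Acme Ltd',  11, 100, 1),  -- suite changed
    ('2012-01-04', 'Updated', 1, 'Acme Corp', 11, 100, 1),  -- name-only change
    ('2012-01-05', 'Updated', 1, 'Acme Inc',  11, 100, 1),  -- name-only change
    ('2012-01-06', 'Updated', 1, 'Acme Inc',  10, 100, 1);  -- suite changed back
""")

QUERY = """
WITH marked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY lp_id ORDER BY audit_date)
         - ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id
                              ORDER BY audit_date) AS grp
    FROM lease_period_audit
    WHERE audit_type = 'Updated'
),
ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id, grp
                              ORDER BY audit_date) AS rnk
    FROM marked
)
SELECT audit_date, tenant_trading_name
FROM ranked
WHERE rnk >= 2
ORDER BY audit_date
"""

name_only_updates = conn.execute(QUERY).fetchall()
for row in name_only_updates:
    print(row)
```

On this data the difference of the two rankings is 0, 0, 2, 2, 2, 3 in date order, so the return to suite 10 on 2012-01-06 lands in its own group rather than joining the first two rows. The 2012-01-05 row has a final ranking of 3, which is why the filter has to accept every ranking of 2 or higher and not just 2.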
Answer 2:
You should think of it something like this (pseudo-code):
Edit: I hadn't realized before that the lpa table actually has all the data, so there's no need to join with lp.
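-- id is a placeholder for whatever key identifies the lease_period row
select lpa.*
from lease_period_audit lpa
join lease_period_audit lpa_before
  on lpa_before.id = lpa.id
 and lpa_before.audit_date =
     (select max(lpa3.audit_date)
      from lease_period_audit lpa3
      where lpa3.id = lpa.id
        and lpa3.audit_date < lpa.audit_date)
where lpa.audit_type = 'Updated'
  and lpa.tenant_trading_name <> lpa_before.tenant_trading_name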
I hope that explains it. (It's not a trivial solution, but that's what comes to mind.)
In English:
Select the LPA records. Join each one with another LPA record, the one just before the original; call it LPA_BEFORE. To join LPA_BEFORE you have to compare all the ids from LPA and LPA_BEFORE, and the LPA_BEFORE date must be the maximum among the LPA records that have a date earlier than the original LPA. Then compare the name in LPA with the name in LPA_BEFORE.
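For what it's worth, the same "compare with the row just before" idea can also be written with the LAG window function where it is available (SQL Server 2012 and later, SQLite 3.25 and later). This is a different technique from the self-join above, shown only as a sketch. It runs through Python's sqlite3 module on made-up rows; the lp_id key column and the sample data are assumptions, and it adds equality checks on the three id columns so that only name-only changes come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lease_period_audit (
    audit_date TEXT, audit_type TEXT,
    lp_id INTEGER,  -- assumed key of the lease_period row (hypothetical)
    tenant_trading_name TEXT, suite_id INTEGER, lease_id INTEGER, building_id INTEGER
);
-- Made-up sample rows; each row holds the values after the change.
INSERT INTO lease_period_audit VALUES
    ('2012-01-01', 'Inserted', 1, 'Acme',      10, 100, 1),
    ('2012-01-02', 'Updated',  1, 'Acme Ltd',  10, 100, 1),  -- name only
    ('2012-01-03', 'Updated',  1, 'Acme Ltd',  11, 100, 1),  -- suite only
    ('2012-01-04', 'Updated',  1, 'Acme Corp', 12, 100, 1),  -- name and suite
    ('2012-01-05', 'Updated',  1, 'Acme Inc',  12, 100, 1),  -- name only
    ('2012-01-01', 'Inserted', 2, 'Beta',      20, 200, 1),
    ('2012-01-03', 'Updated',  2, 'Beta Pty',  20, 200, 1);  -- name only
""")

QUERY = """
WITH with_prev AS (
    -- Attach the previous audit row's values for the same lp_id to each row.
    SELECT *,
           LAG(tenant_trading_name) OVER (PARTITION BY lp_id ORDER BY audit_date) AS prev_name,
           LAG(suite_id)            OVER (PARTITION BY lp_id ORDER BY audit_date) AS prev_suite_id,
           LAG(lease_id)            OVER (PARTITION BY lp_id ORDER BY audit_date) AS prev_lease_id,
           LAG(building_id)         OVER (PARTITION BY lp_id ORDER BY audit_date) AS prev_building_id
    FROM lease_period_audit
)
SELECT lp_id, audit_date, prev_name, tenant_trading_name
FROM with_prev
WHERE audit_type = 'Updated'
  AND tenant_trading_name <> prev_name   -- the name changed
  AND suite_id = prev_suite_id           -- and nothing else did
  AND lease_id = prev_lease_id
  AND building_id = prev_building_id
ORDER BY lp_id, audit_date
"""

name_changes = conn.execute(QUERY).fetchall()
for row in name_changes:
    print(row)
```

The audit_type filter is applied after LAG has run over all rows, so the row just before an update may be the original insert. The first audit row for each lp_id has no previous row; its prev_ columns are NULL and the comparisons drop it.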
Source: https://stackoverflow.com/questions/8736471/query-help-when-using-audit-table