问题
I have a Client data table, selected columns of which are shown below:
Row_ID Client_ID Status_ID From_date To_date
1 123456 4 20/12/2007 18:02 20/12/2007 18:07
2 789087 4 20/12/2007 18:02 20/12/2007 18:07
3 789087 4 20/12/2007 18:07 20/12/2007 18:50
4 789087 4 20/12/2007 18:50 21/12/2007 10:38
5 123456 4 20/12/2007 18:07 20/12/2007 18:50
6 123456 4 20/12/2007 18:50 21/12/2007 10:38
7 123456 4 21/12/2007 10:38 21/12/2007 16:39
8 789087 4 21/12/2007 10:38 21/12/2007 17:54
9 789087 4 21/12/2007 17:54 21/12/2007 18:32
10 789087 4 21/12/2007 18:32 22/12/2007 06:48
11 123456 5 21/12/2007 16:39
12 789087 5 22/12/2007 06:48 22/12/2007 10:53
13 789087 4 22/12/2007 10:53 22/12/2007 11:51
14 789087 5 22/12/2007 11:51
After putting the data into ascending order by Client_ID and then by From_date, my objective is to add a calculated Rank_ID every time there is a change in the status for that client when comparing the status to the previous line. The desired values I want for the Rank_ID are shown below:
Row_ID Client_ID Status_ID From_date To_date Rank_ID
1 123456 4 20/12/2007 18:02 20/12/2007 18:07 1
5 123456 4 20/12/2007 18:07 20/12/2007 18:50 1
6 123456 4 20/12/2007 18:50 21/12/2007 10:38 1
7 123456 4 21/12/2007 10:38 21/12/2007 16:39 1
11 123456 5 21/12/2007 16:39 2
2 789087 4 20/12/2007 18:02 20/12/2007 18:07 3
3 789087 4 20/12/2007 18:07 20/12/2007 18:50 3
4 789087 4 20/12/2007 18:50 21/12/2007 10:38 3
8 789087 4 21/12/2007 10:38 21/12/2007 17:54 3
9 789087 4 21/12/2007 17:54 21/12/2007 18:32 3
10 789087 4 21/12/2007 18:32 22/12/2007 06:48 3
12 789087 5 22/12/2007 06:48 22/12/2007 10:53 4
13 789087 4 22/12/2007 10:53 22/12/2007 11:51 5
14 789087 5 22/12/2007 11:51 6
I am trying to use DENSE_RANK as an analytical function, my "incorrect" SQL code being below
SELECT t1.*, DENSE_RANK () OVER (ORDER BY t1.client_id, t1.status_id) rank_id
FROM (SELECT c.client_ID, c.status_id, c.from_date, c.to_date
FROM client c
ORDER BY c.client_id, c.from_date) t1
ORDER BY t1.client_id, t1.from_date
However, the problem I am encountering is that it given the SQL code as written it calculates the Rank_ID as follows:
Row_ID Client_ID Status_ID From_date To_date Rank_ID
1 123456 4 20/12/2007 18:02 20/12/2007 18:07 1
5 123456 4 20/12/2007 18:07 20/12/2007 18:50 1
6 123456 4 20/12/2007 18:50 21/12/2007 10:38 1
7 123456 4 21/12/2007 10:38 21/12/2007 16:39 1
11 123456 5 21/12/2007 16:39 2
2 789087 4 20/12/2007 18:02 20/12/2007 18:07 3
3 789087 4 20/12/2007 18:07 20/12/2007 18:50 3
4 789087 4 20/12/2007 18:50 21/12/2007 10:38 3
8 789087 4 21/12/2007 10:38 21/12/2007 17:54 3
9 789087 4 21/12/2007 17:54 21/12/2007 18:32 3
10 789087 4 21/12/2007 18:32 22/12/2007 06:48 3
12 789087 5 22/12/2007 06:48 22/12/2007 10:53 4
13 789087 4 22/12/2007 10:53 22/12/2007 11:51 3
14 789087 5 22/12/2007 11:51 4
For record 13, the returned Rank_ID is 3 (whereas I want 5 at it is a change in status for that client when compared to the status on the previous record for that client) and for record 14 the returned Rank_ID is 4, whereas I want 6 as it is again a change in status for that client when compared to the previous line.
I guess the problem is that my SQL orders the data by Client_ID and then by Status_ID, and so I can see why it produces the answers it gives. The problem is no matter what changes I make to the DENSE_RANK line, I can't obtain the answer I want.
Any help would be appreciated.
回答1:
As I understood, this is what you need:
select client_ID, status_id, from_date, to_date,
sum(start_of_group) over (order by client_ID, from_date) + 1 rank
from (SELECT c.client_ID, c.status_id, c.from_date, c.to_date,
case when lag(c.client_ID, 1, c.client_ID) over (order by c.client_ID, c.from_date) = c.client_ID
and lag(c.status_id, 1, c.status_id) over (order by c.client_ID, c.from_date) = c.status_id
then 0 else 1 end start_of_group
FROM client c)
order by client_ID, from_date
SQLFiddle
回答2:
the thing is that you need to partition your ranking over the CHANGE of the status, not the VALUE of the status. I'm leaving some extra columns in the output so that you can see how it is all derived:
WITH dat as (
SELECT 1 row_id, 123456 client_id, 4 status, to_date('20/12/2007 18:02','dd/mm/yyyy hh24:mi') frdate, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 2 row_id, 789087 client_id, 4 status, to_date('20/12/2007 18:02','dd/mm/yyyy hh24:mi') frdate, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 3 row_id, 789087 client_id, 4 status, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') frdate, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 4 row_id, 789087 client_id, 4 status, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') frdate, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 5 row_id, 123456 client_id, 4 status, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') frdate, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 6 row_id, 123456 client_id, 4 status, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') frdate, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 7 row_id, 123456 client_id, 4 status, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') frdate, to_date('21/12/2007 16:39','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 8 row_id, 789087 client_id, 4 status, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') frdate, to_date('21/12/2007 17:54','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 9 row_id, 789087 client_id, 4 status, to_date('21/12/2007 17:54','dd/mm/yyyy hh24:mi') frdate, to_date('21/12/2007 18:32','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 10 row_id, 789087 client_id, 4 status, to_date('21/12/2007 18:32','dd/mm/yyyy hh24:mi') frdate, to_date('22/12/2007 06:48','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 11 row_id, 123456 client_id, 5 status, to_date('21/12/2007 16:39','dd/mm/yyyy hh24:mi') frdate, null from dual union all
SELECT 12 row_id, 789087 client_id, 5 status, to_date('22/12/2007 06:48','dd/mm/yyyy hh24:mi') frdate, to_date('22/12/2007 10:53','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 13 row_id, 789087 client_id, 4 status, to_date('22/12/2007 10:53','dd/mm/yyyy hh24:mi') frdate, to_date('22/12/2007 11:51','dd/mm/yyyy hh24:mi') todate from dual union all
SELECT 14 row_id, 789087 client_id, 5 status, to_date('22/12/2007 11:51','dd/mm/yyyy hh24:mi') frdate, null from dual)
SELECT t1.*, DENSE_RANK () OVER (ORDER BY t1.client_id, t1.chg_status) rank_id
FROM (select client_id, status, prev_status, sum(case when nvl(prev_status,-1) != status then 1 else 0 end) over (partition by client_id order by frdate) chg_status, frdate, todate
from (
SELECT c.client_ID
, c.status
, lag(status) over (partition by client_id order by frdate) as prev_status
, c.frdate
, c.todate
FROM dat c
ORDER BY c.client_id, c.frdate)) t1
ORDER BY t1.client_id, t1.frdate
Returns:
CLIENT_ID, STATUS, PREV_STATUS, CHG_STATUS, FRDATE, TODATE, RANK_ID
123456, 4, , 1, 20/12/2007 6:02:00 PM, 20/12/2007 6:07:00 PM, 1
123456, 4, 4, 1, 20/12/2007 6:07:00 PM, 20/12/2007 6:50:00 PM, 1
123456, 4, 4, 1, 20/12/2007 6:50:00 PM, 21/12/2007 10:38:00 AM, 1
123456, 4, 4, 1, 21/12/2007 10:38:00 AM, 21/12/2007 4:39:00 PM, 1
123456, 5, 4, 2, 21/12/2007 4:39:00 PM,, 2
789087, 4, , 1, 20/12/2007 6:02:00 PM,20/12/2007 6:07:00 PM, 3
789087, 4, 4, 1, 20/12/2007 6:07:00 PM,20/12/2007 6:50:00 PM, 3
789087, 4, 4, 1, 20/12/2007 6:50:00 PM, 21/12/2007 10:38:00 AM, 3
789087, 4, 4, 1, 21/12/2007 10:38:00 AM, 21/12/2007 5:54:00 PM, 3
789087, 4, 4, 1, 21/12/2007 5:54:00 PM, 21/12/2007 6:32:00 PM, 3
789087, 4, 4, 1, 21/12/2007 6:32:00 PM,22/12/2007 6:48:00 AM, 3
789087, 5, 4, 2, 22/12/2007 6:48:00 AM, 22/12/2007 10:53:00 AM, 4
789087, 4, 5, 3, 22/12/2007 10:53:00 AM, 22/12/2007 11:51:00 AM, 5
789087, 5, 4, 4, 22/12/2007 11:51:00 AM,, 6
回答3:
Simple mark all status changes for each client with 1
(column GRP below).
Than add those number using analytic SUM function:
with tab1 as (
select client_id,from_date, status,
nvl(lag(status) over (partition by client_id order by from_date),-1) status_lag,
case when (nvl(lag(status) over (partition by client_id order by from_date),-1) <> status) then
1 end grp
from tst
)
, tab2 as (
select client_id,from_date, status,status_lag, grp,
sum(grp) over (partition by client_id order by from_date) as RANK
from tab1
)
select * from tab2;
gives as expected
CLIENT_ID FROM_DATE STATUS STATUS_LAG GRP RANK
---------- ------------------- ---------- ---------- ---------- ----------
1001 01.10.2015 00:00:00 1 -1 1 1
1001 02.10.2015 00:00:00 1 1 1
1001 03.10.2015 00:00:00 2 1 1 2
1001 04.10.2015 00:00:00 2 2 2
1001 05.10.2015 00:00:00 3 2 1 3
1001 09.10.2015 00:00:00 1 3 1 4
1002 12.10.2015 00:00:00 1 -1 1 1
1002 13.10.2015 00:00:00 3 1 1 2
1002 15.10.2015 00:00:00 3 3 2
my setup
create table tst
(client_id number,
from_date date,
status number);
insert into tst values (1001, to_date('01-10-15','dd-mm-rr'),1);
insert into tst values (1001, to_date('02-10-15','dd-mm-rr'),1);
insert into tst values (1001, to_date('03-10-15','dd-mm-rr'),2);
insert into tst values (1001, to_date('04-10-15','dd-mm-rr'),2);
insert into tst values (1001, to_date('05-10-15','dd-mm-rr'),3);
insert into tst values (1001, to_date('09-10-15','dd-mm-rr'),1);
insert into tst values (1002, to_date('12-10-15','dd-mm-rr'),1);
insert into tst values (1002, to_date('13-10-15','dd-mm-rr'),3);
insert into tst values (1002, to_date('15-10-15','dd-mm-rr'),3);
commit;
回答4:
Here's a solution using Tabibitosan:
with client as (select 1 row_id, 123456 client_id, 4 status_id, to_date('20/12/2007 18:02','dd/mm/yyyy hh24:mi') from_date, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') todate from dual union all
select 2 row_id, 789087 client_id, 4 status_id, to_date('20/12/2007 18:02','dd/mm/yyyy hh24:mi') from_date, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') todate from dual union all
select 3 row_id, 789087 client_id, 4 status_id, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') from_date, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') todate from dual union all
select 4 row_id, 789087 client_id, 4 status_id, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') todate from dual union all
select 5 row_id, 123456 client_id, 4 status_id, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') from_date, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') todate from dual union all
select 6 row_id, 123456 client_id, 4 status_id, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') todate from dual union all
select 7 row_id, 123456 client_id, 4 status_id, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 16:39','dd/mm/yyyy hh24:mi') todate from dual union all
select 8 row_id, 789087 client_id, 4 status_id, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 17:54','dd/mm/yyyy hh24:mi') todate from dual union all
select 9 row_id, 789087 client_id, 4 status_id, to_date('21/12/2007 17:54','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 18:32','dd/mm/yyyy hh24:mi') todate from dual union all
select 10 row_id, 789087 client_id, 4 status_id, to_date('21/12/2007 18:32','dd/mm/yyyy hh24:mi') from_date, to_date('22/12/2007 06:48','dd/mm/yyyy hh24:mi') todate from dual union all
select 11 row_id, 123456 client_id, 5 status_id, to_date('21/12/2007 16:39','dd/mm/yyyy hh24:mi') from_date, null from dual union all
select 12 row_id, 789087 client_id, 5 status_id, to_date('22/12/2007 06:48','dd/mm/yyyy hh24:mi') from_date, to_date('22/12/2007 10:53','dd/mm/yyyy hh24:mi') todate from dual union all
select 13 row_id, 789087 client_id, 4 status_id, to_date('22/12/2007 10:53','dd/mm/yyyy hh24:mi') from_date, to_date('22/12/2007 11:51','dd/mm/yyyy hh24:mi') todate from dual union all
select 14 row_id, 789087 client_id, 5 status_id, to_date('22/12/2007 11:51','dd/mm/yyyy hh24:mi') from_date, null from dual)
select row_id,
client_id,
status_id,
from_date,
todate,
dense_rank() over (order by client_id, status_id, grp) rank_id
from (select row_id,
client_id,
status_id,
from_date,
todate,
row_number() over (order by client_id, from_date) - row_number() over (partition by status_id order by client_id, from_date) grp -- this is the tabibitosan step
from client) t1
order by client_id, from_date, status_id;
ROW_ID CLIENT_ID STATUS_ID FROM_DATE TODATE RANK_ID
---------- ---------- ---------- ---------------- ---------------- ----------
1 123456 4 20/12/2007 06:02 20/12/2007 06:07 1
5 123456 4 20/12/2007 06:07 20/12/2007 06:50 1
6 123456 4 20/12/2007 06:50 21/12/2007 10:38 1
7 123456 4 21/12/2007 10:38 21/12/2007 04:39 1
11 123456 5 21/12/2007 04:39 2
2 789087 4 20/12/2007 06:02 20/12/2007 06:07 3
3 789087 4 20/12/2007 06:07 20/12/2007 06:50 3
4 789087 4 20/12/2007 06:50 21/12/2007 10:38 3
8 789087 4 21/12/2007 10:38 21/12/2007 05:54 3
9 789087 4 21/12/2007 05:54 21/12/2007 06:32 3
10 789087 4 21/12/2007 06:32 22/12/2007 06:48 3
12 789087 5 22/12/2007 06:48 22/12/2007 10:53 5
13 789087 4 22/12/2007 10:53 22/12/2007 11:51 4
14 789087 5 22/12/2007 11:51 6
来源:https://stackoverflow.com/questions/34748397/oracle-sql-dense-rank