问题
I want to aggregate 2 fields proct_dt, dw_job_id in ascendinng order My scenario would be clear by using below queries and result.
First query :-
sel * from scratch.COGIPF_RUNREPORT_test1 order by proct_dt,dw_job_id where dw_job_id =10309
Output :-
dw_job_id proct_dt start_ts end_ts time_diff
1 10,309 2018-03-06 00:00:00 2018-03-06 07:04:18 2018-03-06 07:04:22.457000 0
2 10,309 2018-03-06 00:00:00 2018-03-06 06:58:50 2018-03-06 06:58:51.029000 0
3 10,309 2018-03-07 00:00:00 2018-03-07 06:35:36 2018-03-07 06:36:03.809000 1
4 10,309 2018-03-06 00:00:00 2018-03-06 07:00:35 2018-03-06 07:00:40.702000 0
5 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03-06 06:30:42.759000 0
6 10,309 2018-03-06 00:00:00 2018-03-06 07:10:27 2018-03-06 07:10:28.715000 0
7 10,309 2018-03-06 00:00:00 2018-03-06 06:59:44 2018-03-06 06:59:48.315000 0
8 10,309 2018-03-06 00:00:00 2018-03-06 07:00:15 2018-03-06 07:00:15.086000 0
9 10,309 2018-03-06 00:00:00 2018-03-06 07:04:02 2018-03-06 07:04:02.925000 0
2nd Query :-
sel * from scratch.fact_test order by proct_dt asc ,dw_job_id asc where dw_job_id =10309
Result :-
dw_job_id proct_dt start_ts end_ts status
1 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03-06 06:30:42.759 12
2 10,309 2018-03-07 00:00:00 2018-03-07 06:35:36 2018-03-07 06:36:03.809 12
So here in the 2nd query I got the desire result as from 1st table by having first occurrence of proct_dt,start_ts,end_ts
Please let me know for any clarification It would be very great if anyone can help to achieve this case.
Thanks,
回答1:
It's not clear what you want from your explanation, but it looks like you want to get the first run of a job per day, which is easy using a Row_Number:
select *
from scratch.COGIPF_RUNREPORT_test1
where dw_job_id =10309
qualify
row_number()
over (partition by dw_job_id, proc_dt -- for each job & date
order by start_ts) = 1 -- only the 1st run
来源:https://stackoverflow.com/questions/49174837/implement-aggregation-in-teradata