Rebuild sql query to sum date from two tables

情到浓时终转凉″ 提交于 2020-03-04 21:48:02

问题


I need a solution extension that I received and it is visible at the link How to get correct year, month and day in firebird function datediff. I have problem with connect data from two tables. I have got data with dates in two table KP and KPS. I know that I have to add data from second table in SQL query in that place where is definition of KP2 but I don`t know how to do this. Should I use join?

I have this SQL query:

SELECT
    KP3.id_contact,
    (KP3.D2-KP3.D1) / (12*31) AS Y,
    ((KP3.D2-KP3.D1) - ((KP3.D2-KP3.D1) / (12*31)) * 12 * 31) / 31 AS M,
    CAST(MOD((KP3.D2-KP3.D1) - (((KP3.D2-KP3.D1) / (12*31)) * 12 * 31), 31) AS INTEGER) AS D
FROM
    (SELECT
         KP2.id_contact, SUM(KP2.D1) AS D1, SUM(KP2.D2) AS D2
     FROM
         (SELECT
              KP.id_contact, 
              DATEDIFF(MONTH, KP.DATE_FROM, KP.DATE_TO) / 12 AS Y, 
              CAST(MOD(DATEDIFF(MONTH, KP.DATE_FROM, KP.DATE_TO), 12) AS INTEGER) AS M,
              EXTRACT(YEAR FROM KP.DATE_FROM)*12*31+EXTRACT(MONTH FROM KP.DATE_FROM)*31+EXTRACT(DAY FROM KP.DATE_FROM) D1,
              EXTRACT(YEAR FROM KP.DATE_TO)*12*31+EXTRACT(MONTH FROM KP.DATE_TO)*31+EXTRACT(DAY FROM KP.DATE_TO) D2 
          FROM
              KP) AS KP2
    GROUP BY 
        KP2.id_contact) AS KP3

I show this on example. I have data in table KP like this

ID    DATE_FROM    DATE_TO
------------------------------
1     2018-02-08   2019-12-01

and in table KPS I have data like this:

ID    DATE_FROM    DATE_TO
------------------------------
1     2017-02-20   2018-01-01

Result of query should be like this:

2Y 8M 7D

Please help me with this.


回答1:


Trying to guess what you could have mean.

Maybe you meant to pick the most earlier (MIN) date between all the source tables, then the most late and count days in between?

Or maybe you wanted to count days per-job in all source tables, and the nsum those days per-worker?

Dunno, here is BOTH solutions, built from simplistic blocks fuirther and further into end results. It also uses the data from the questions you link, for self-checking.

Here is the script you can tweak and see how it goes:

https://dbfiddle.uk/?rdbms=firebird_3.0&fiddle=6d99adde7194631bff47be49e5f92dc9

The results are either 2 years 9 months 2 days or 2 years 8 months 1 day depending upon the guesswork about what you meant by joining the source tables.

Cherry-pick the sub-queries that you need and cull away those you do not need.

select rdb$get_context('SYSTEM', 'ENGINE_VERSION') as version from rdb$database;
| VERSION |
| :------ |
| 3.0.5   |
-- https://stackoverflow.com/questions/60030543/rebuild-sql-query-to-sum-date-from-two-tables

create table KPS1 (
  ID integer not null,
  DATE_FROM date not null,
  DATE_TO date not null
)
create table KPS2 (
  ID integer not null,
  DATE_FROM date not null,
  DATE_TO date not null
)
create index KPS1_workers on KPS1(id)
create index KPS2_workers on KPS2(id)
insert into KPS1 values (1, '2018-02-08', '2019-12-01')
1 rows affected
insert into KPS2 values (1, '2017-02-20', '2018-01-01')
1 rows affected
-- this data sample taked from
-- https://stackoverflow.com/questions/51551257/how-to-get-correct-year-month-and-day-in-firebird-function-datediff
insert into KPS1 values (2, '1988-09-15', '2000-03-16')
1 rows affected
insert into KPS1 values (2, '2000-03-16', '2005-02-28')
1 rows affected
select * from KPS1 union all select * from KPS2
ID | DATE_FROM  | DATE_TO   
-: | :--------- | :---------
 1 | 2018-02-08 | 2019-12-01
 2 | 1988-09-15 | 2000-03-16
 2 | 2000-03-16 | 2005-02-28
 1 | 2017-02-20 | 2018-01-01
-- sadly, the topic starter did not say what he wants to do with his many sources of data
-- so multiple interpretations are possible!

-- here we are counting days from the first date to the last date, one row per worker

-- finding the minimum and maximum dates from ALL the sources
--  (multitude of rows in multitude of tables)
-- very simple to write and read, it however would be bad on long tables
-- because ID indexes hidden by UNION and not available for further, outer queries
--   FULL-SCAN in natural order and post-merge external sorting would occur

Select ID as id_contact, Min(DATE_FROM) as DATE_FROM, Max(DATE_TO) as DATE_TO
From (select * from KPS1 union all select * from KPS2)
Group by ID
ID_CONTACT | DATE_FROM  | DATE_TO   
---------: | :--------- | :---------
         1 | 2017-02-20 | 2019-12-01
         2 | 1988-09-15 | 2005-02-28
-- finding the minimum and maximum dates from ALL the sources
-- this one is harder to write and read
-- but should be better for execution: it allows use of indexes by ID be propagated

-- This optimized query works fine with the data presented by topic starter
-- where each source tables has exactly one row for one and the same worker.

-- It will not work so fine when some workers are missed from some tables.
-- Fixing it will make the query even more complex to write and read

Select KPS1.ID as id_contact, 
  IIF(KPS1.DATE_FROM < KPS2.DATE_FROM, KPS1.DATE_FROM, KPS2.DATE_FROM) as DATE_FROM,
  IIF(KPS2.DATE_TO < KPS2.DATE_TO, KPS2.DATE_TO, KPS1.DATE_TO) as DATE_TO
From KPS1, KPS2
Where KPS1.ID = KPS2.ID
ID_CONTACT | DATE_FROM  | DATE_TO   
---------: | :--------- | :---------
         1 | 2017-02-20 | 2019-12-01
Select ID_CONTACT, DateDiff(day, DATE_FROM, DATE_TO) as DAYS_COUNT From
(
  Select ID as id_contact, Min(DATE_FROM) as DATE_FROM, Max(DATE_TO) as DATE_TO
  From (select * from KPS1 union all select * from KPS2)
  Group by ID
)
ID_CONTACT | DAYS_COUNT
---------: | :---------
         1 | 1014      
         2 | 6010      
-- alternatively, here counting days per-job, many rows may happen for the same worker

select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KPS1 a
  union all
select b.*, datediff(day, b.DATE_FROM, b.DATE_TO) as DAYS_COUNT from KPS2 b
ID | DATE_FROM  | DATE_TO    | DAYS_COUNT
-: | :--------- | :--------- | :---------
 1 | 2018-02-08 | 2019-12-01 | 661       
 2 | 1988-09-15 | 2000-03-16 | 4200      
 2 | 2000-03-16 | 2005-02-28 | 1810      
 1 | 2017-02-20 | 2018-01-01 | 315       
SELECT ID as ID_CONTACT, SUM(Days_Count) as DAYS_COUNT FROM
(
  select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KPS1 a 
) 
GROUP BY ID_CONTACT

    union all

SELECT ID as ID_CONTACT, SUM(Days_Count) as DAYS_COUNT FROM
(
  select b.*, datediff(day, b.DATE_FROM, b.DATE_TO) as DAYS_COUNT from KPS2 b
)  
GROUP BY ID_CONTACT
ID_CONTACT | DAYS_COUNT
---------: | :---------
         1 | 661       
         2 | 6010      
         1 | 315       
WITH PER_SOURCE_SUMMER AS 
(
  SELECT ID as ID_CONTACT, SUM(Days_Count) as DAYS_COUNT FROM
  (
    select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KPS1 a 
  ) 
  GROUP BY ID_CONTACT

    union all

  SELECT ID as ID_CONTACT, SUM(Days_Count) as DAYS_COUNT FROM
  (
    select b.*, datediff(day, b.DATE_FROM, b.DATE_TO) as DAYS_COUNT from KPS2 b
  )  
  GROUP BY ID_CONTACT
)

SELECT ID_CONTACT, SUM(Days_Count) as DAYS_COUNT 
FROM PER_SOURCE_SUMMER
GROUP BY 1
ID_CONTACT | DAYS_COUNT
---------: | :---------
         1 | 976       
         2 | 6010      
-- Now, having TWO interpretations of the task and TWO implementations of days counter
-- we finally can come up with conversion from precise but hard to feel DAYS 
-- to imprecise but easy to digest Y-M-D

WITH SOURCE_MIN_MAX AS
(
  Select ID_CONTACT, DateDiff(day, DATE_FROM, DATE_TO) as DAYS_COUNT From
  (
    Select ID as id_contact, Min(DATE_FROM) as DATE_FROM, Max(DATE_TO) as DATE_TO
    From (select * from KPS1 union all select * from KPS2)
    Group by ID
  )
),

PER_SOURCE_SUMMER AS 
(
  SELECT ID as ID_CONTACT, SUM(Days_Count) as DAYS_COUNT FROM
  (
    select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KPS1 a 
  ) 
  GROUP BY ID_CONTACT

    union all

  SELECT ID as ID_CONTACT, SUM(Days_Count) as DAYS_COUNT FROM
  (
    select b.*, datediff(day, b.DATE_FROM, b.DATE_TO) as DAYS_COUNT from KPS2 b
  )  
  GROUP BY ID_CONTACT
),

SOURCE_PER_JOB AS
(
  SELECT ID_CONTACT, SUM(Days_Count) as DAYS_COUNT 
  FROM PER_SOURCE_SUMMER
  GROUP BY 1
),

KP_DAYS AS
(
  SELECT 1 as METHOD, A.* FROM SOURCE_MIN_MAX A
     union all
  SELECT 2 as METHOD, B.* FROM SOURCE_PER_JOB B
)

SELECT * from KP_DAYS
METHOD | ID_CONTACT | DAYS_COUNT
-----: | ---------: | :---------
     1 |          1 | 1014      
     1 |          2 | 6010      
     2 |          1 | 976       
     2 |          2 | 6010      
WITH SOURCE_MIN_MAX AS
(
  Select ID_CONTACT, DateDiff(day, DATE_FROM, DATE_TO) as DAYS_COUNT From
  (
    Select ID as id_contact, Min(DATE_FROM) as DATE_FROM, Max(DATE_TO) as DATE_TO
    From (select * from KPS1 union all select * from KPS2)
    Group by ID
  )
),

PER_SOURCE_SUMMER AS 
(
  SELECT ID as ID_CONTACT, SUM(Days_Count) as DAYS_COUNT FROM
  (
    select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KPS1 a 
  ) 
  GROUP BY ID_CONTACT

    union all

  SELECT ID as ID_CONTACT, SUM(Days_Count) as DAYS_COUNT FROM
  (
    select b.*, datediff(day, b.DATE_FROM, b.DATE_TO) as DAYS_COUNT from KPS2 b
  )  
  GROUP BY ID_CONTACT
),

SOURCE_PER_JOB AS
(
  SELECT ID_CONTACT, SUM(Days_Count) as DAYS_COUNT 
  FROM PER_SOURCE_SUMMER
  GROUP BY 1
),

KP_DAYS AS
(
  SELECT 1 as METHOD, A.* FROM SOURCE_MIN_MAX A
     union all
  SELECT 2 as METHOD, B.* FROM SOURCE_PER_JOB B
)

SELECT
    KP_DAYS.method, KP_DAYS.id_contact, KP_DAYS.days_count,
    FLOOR(KP_DAYS.DAYS_COUNT / 365.25) AS Y
  , FLOOR( (KP_DAYS.DAYS_COUNT - (FLOOR(KP_DAYS.DAYS_COUNT / 365.25) * 365.25) ) / 30.5) AS M 
  , CAST(MOD((KP_DAYS.DAYS_COUNT) - (((KP_DAYS.DAYS_COUNT) / 365.25) * 365.25), 30.5) AS INTEGER) AS D
FROM KP_DAYS
METHOD | ID_CONTACT | DAYS_COUNT | Y  | M  |  D
-----: | ---------: | :--------- | :- | :- | -:
     1 |          1 | 1014       | 2  | 9  |  2
     1 |          2 | 6010       | 16 | 5  |  2
     2 |          1 | 976        | 2  | 8  |  1
     2 |          2 | 6010       | 16 | 5  |  2



回答2:


Maybe something like this :

DROP TABLE IF EXISTS #TestTableA, #TestTableB
CREATE TABLE #TestTableA (DFA DATE, DTA DATE)
CREATE TABLE #TestTableB (DFB DATE, DTB DATE)

INSERT INTO #TestTableA
SELECT '2018-02-08','2019-12-01'

INSERT INTO #TestTableB
SELECT '2017-02-20','2018-01-01'

SELECT  CAST(SUM(DaysQTY) / 365 AS VARCHAR(4)) + 'Y ' +
        CAST((SUM(DaysQTY) % 365) / 30 AS VARCHAR(2)) + 'M ' +
        CAST((SUM(DaysQTY) % 365) % 30 AS VARCHAR(2)) + 'D' AS Result
FROM    (   SELECT  DATEDIFF(DAY,DFA,DTA) AS DaysQTY
            FROM    #TestTableA
            UNION   ALL
            SELECT  DATEDIFF(DAY,DFB,DTB) AS DaysQTY
            FROM    #TestTableB
        )   t


来源:https://stackoverflow.com/questions/60030543/rebuild-sql-query-to-sum-date-from-two-tables

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!