How to get correct year, month and day in firebird function datediff

粉色の甜心 2021-01-25 01:31

I have to ask another question about datediff in Firebird. I don't know how to get the correct result in this case: worker x has two contracts of employment, first in the period 1988

2 answers
  •  梦毁少年i
    2021-01-25 01:43

    The proper approach seems to be to measure the DAYS spent on both assignments, sum those day counts, and then convert the total into the inherently imprecise, give-or-take, street-talk form of years-months-days. More on this below.
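
As a cross-check outside the database, the day-summing step can be sketched in Python. This is only an illustration: the `contracts` list mirrors the sample rows of the KP table used in this answer, and the names `contracts` and `total_days` are made up for the sketch.

```python
from collections import defaultdict
from datetime import date

# Sample contracts (id_contact, date_from, date_to), mirroring the KP rows in this answer.
contracts = [
    (1, date(1988, 9, 15), date(2000, 3, 16)),
    (1, date(2000, 3, 16), date(2005, 2, 28)),
    (2, date(2018, 2, 8), date(2019, 12, 1)),
    (2, date(2017, 2, 20), date(2018, 1, 1)),
]


def total_days(rows):
    """Sum the day span of every contract, grouped by contact id."""
    totals = defaultdict(int)
    for contact, start, end in rows:
        # Date subtraction yields a day count, the same idea as DATEDIFF(DAY, start, end).
        totals[contact] += (end - start).days
    return dict(totals)


print(total_days(contracts))
```

The totals agree with the summed DAYS_COUNT values in the transcript (6010 and 976 days).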

    Borrowing the conversion query from Livius and adjusting its coefficients to more realistic values, it develops as follows:

    https://dbfiddle.uk/?rdbms=firebird_3.0&fiddle=2fba0ace6a70ae16a167ec838642dc28

    Here we move step by step, building from simple blocks up to more and more complex ones, which finally gives us 16 years, 5 months and 2 days:

    select rdb$get_context('SYSTEM', 'ENGINE_VERSION') as version from rdb$database;
    
    | VERSION |
    | :------ |
    | 3.0.5   |
    
    create table KP (
      ID_CONTACT integer not null,
      DATE_FROM date not null,
      DATE_TO date not null
    )
    -- https://stackoverflow.com/questions/51551257/how-to-get-correct-year-month-and-day-in-firebird-function-datediff
    
    create index KP_workers on KP(id_contact)
    
    insert into KP values (1, '1988-09-15', '2000-03-16')
    
    1 rows affected
    
    insert into KP values (1, '2000-03-16', '2005-02-28')
    
    1 rows affected
    
    -- the sample data from https://stackoverflow.com/questions/60030543
    -- might expose the rounding bug in my original formulae:
    -- unexpected ROUNDING UP leading to NEGATIVE value for months
    
    insert into KP values (2, '2018-02-08', '2019-12-01')
    
    1 rows affected
    
    insert into KP values (2, '2017-02-20', '2018-01-01')
    
    1 rows affected
    
    select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KP a
    
    ID_CONTACT | DATE_FROM  | DATE_TO    | DAYS_COUNT
    ---------: | :--------- | :--------- | :---------
             1 | 1988-09-15 | 2000-03-16 | 4200      
             1 | 2000-03-16 | 2005-02-28 | 1810      
             2 | 2018-02-08 | 2019-12-01 | 661       
             2 | 2017-02-20 | 2018-01-01 | 315       
    
    -- Original answer by Livius
    
    SELECT
    KP3.id_contact 
    , KP3.D2-KP3.D1 as days_count
    , (KP3.D2-KP3.D1) / (12*31) AS Y
    , ((KP3.D2-KP3.D1) - ((KP3.D2-KP3.D1) / (12*31)) * 12 * 31) / 31 AS M
    , CAST(MOD((KP3.D2-KP3.D1) - (((KP3.D2-KP3.D1) / (12*31)) * 12 * 31), 31) AS INTEGER) AS D
    FROM
    (SELECT
    KP2.id_contact, SUM(KP2.D1) AS D1, SUM(KP2.D2) AS D2
    FROM
        (
        SELECT
        KP.id_contact, DATEDIFF(MONTH, KP.DATE_FROM, KP.DATE_TO) / 12 AS Y, CAST(MOD(DATEDIFF(MONTH, KP.DATE_FROM, KP.DATE_TO), 12) AS INTEGER) AS M 
        , EXTRACT(YEAR FROM KP.DATE_FROM)*12*31+EXTRACT(MONTH FROM KP.DATE_FROM)*31+EXTRACT(DAY FROM KP.DATE_FROM) D1
        , EXTRACT(YEAR FROM KP.DATE_TO)*12*31+EXTRACT(MONTH FROM KP.DATE_TO)*31+EXTRACT(DAY FROM KP.DATE_TO) D2 
        FROM
        KP  
        ) AS KP2
    GROUP BY KP2.id_contact
    ) AS KP3
    
    ID_CONTACT | DAYS_COUNT | Y  | M  |  D
    ---------: | :--------- | :- | :- | -:
             1 | 6120       | 16 | 5  | 13
             2 | 997        | 2  | 8  |  5
    
    select ID_CONTACT, sum(DAYS_COUNT) as DAYS_COUNT
    from (
      select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KP a
    )
    GROUP BY 1
    
    ID_CONTACT | DAYS_COUNT
    ---------: | :---------
             1 | 6010      
             2 | 976       
    
    -- this step taken from https://dbfiddle.uk/?rdbms=firebird_3.0&fiddle=52c1e130f589ca507c9ff185b5b2346d
    
    -- based on the original Livius formula with non-exact integer coefficients
    -- it does not seem to generate negative counts, but still shows very different results
    
    SELECT
        KP_DAYS.id_contact,
        KP_DAYS.DAYS_COUNT / (12*31) AS Y,
        ((KP_DAYS.DAYS_COUNT) - ((KP_DAYS.DAYS_COUNT) / (12*31)) * 12 * 31) / 31 AS M,
        CAST(MOD((KP_DAYS.DAYS_COUNT) - (((KP_DAYS.DAYS_COUNT) / (12*31)) * 12 * 31), 31) AS INTEGER) AS D
    FROM
    (
      select ID_CONTACT, sum(DAYS_COUNT) as DAYS_COUNT
      from (
        select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KP a
      )
      GROUP BY 1  
    ) as KP_DAYS
    
    ID_CONTACT | Y  | M  |  D
    ---------: | :- | :- | -:
             1 | 16 | 1  | 27
             2 | 2  | 7  | 15
    
    SELECT
        KP_DAYS.id_contact, KP_DAYS.days_count
      , FLOOR(KP_DAYS.DAYS_COUNT / 365.25) AS Y
      , FLOOR( (KP_DAYS.DAYS_COUNT - (FLOOR(KP_DAYS.DAYS_COUNT / 365.25) * 365.25) ) / 30.5) AS M 
      , CAST(MOD((KP_DAYS.DAYS_COUNT) - (((KP_DAYS.DAYS_COUNT) / 365.25) * 365.25), 30.5) AS INTEGER) AS D
    FROM
    (
      select ID_CONTACT, sum(DAYS_COUNT) as DAYS_COUNT
      from (
        select a.*, datediff(day, a.DATE_FROM, a.DATE_TO) as DAYS_COUNT from KP a
      )
      GROUP BY 1  
    ) as KP_DAYS
    
    ID_CONTACT | DAYS_COUNT | Y  | M  |  D
    ---------: | :--------- | :- | :- | -:
             1 | 6010       | 16 | 5  |  2
             2 | 976        | 2  | 8  |  1
    

    Notice that the above is still not mathematically correct, but it should give a "gut feeling" for the time span. The question of getting an EXACT AND PRECISE measure of a timespan in Y-M-D form is moot.
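
The coefficient-based conversion can be reproduced with plain arithmetic. The sketch below is an assumption-laden illustration, not the query itself: `split_days` is a hypothetical helper that applies FLOOR consistently at every step, once with the 12*31 and 31 coefficients and once with 365.25 and 30.5.

```python
import math


def split_days(total_days, days_per_year, days_per_month):
    """Split a day count into (years, months, leftover days) using fixed average lengths."""
    years = math.floor(total_days / days_per_year)
    rest = total_days - years * days_per_year
    months = math.floor(rest / days_per_month)
    days = rest - months * days_per_month
    return years, months, days


for total in (6010, 976):
    coarse = split_days(total, 12 * 31, 31)   # integer coefficients
    finer = split_days(total, 365.25, 30.5)   # average year and month lengths
    print(total, coarse, finer)
```

The years and months agree with the transcript for both coefficient sets. The leftover days need not: the D expression in the last query divides by 365.25 without FLOOR, unlike its Y and M columns, so it is not this plain remainder. For 6010 days the sketch leaves 13.5 days, which is one more illustration of how soft the "days" part of such an answer is.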

    For example, you quoted 3 days while this query gives 2 days. I see no error there. Because months differ in length from one another, as do years, you simply cannot correctly measure a time DISTANCE in months. That would be like measuring geographical distance in cities.
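
A small Python sketch of why months make a poor unit of distance: the same 31-day span is exactly one calendar month from one starting date and one month plus three days from another. The dates are arbitrary examples chosen for illustration.

```python
from datetime import date, timedelta

span = timedelta(days=31)

from_january = date(2021, 1, 1) + span    # lands exactly on the first of the next month
from_february = date(2021, 2, 1) + span   # overshoots the first of the next month

print(from_january.isoformat(), from_february.isoformat())
```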

    How many New Yorks lie between London and Paris? How many Warsaws high is Mount Elbrus? You cannot have any mathematically correct answer.

    Thus you can only answer with NON-PRECISE estimations, suitable for the give-or-take kind of street talk. So any DateDiff-based query would essentially generate a perfectly valid answer of the kind "2Y 10M, give or take a few days": an answer that IS valid in this context of "just give me an overall impression".

    Marrying the simplicity of getting a feel for it with the perfectionism of mathematical accuracy is just not possible. For example, imagine you get a span of about 6Y. How many leap years should you account for? In the "6Y" from 1999 to 2004 there were TWO leap years (2000 and 2004), but in the same "6Y" from 1998 to 2003 there was only ONE (2000). Which of those is the correct measure for "6Y"?
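
The two six-year windows can be checked with a short Python sketch. The helper names `leap_years` and `days_in_span` are made up for this illustration; both treat the year range as inclusive.

```python
import calendar
from datetime import date


def leap_years(first_year, last_year):
    """List the leap years in the inclusive range first_year..last_year."""
    return [y for y in range(first_year, last_year + 1) if calendar.isleap(y)]


def days_in_span(first_year, last_year):
    """Count the days from 1 January of first_year through 31 December of last_year."""
    return (date(last_year + 1, 1, 1) - date(first_year, 1, 1)).days


for first, last in ((1999, 2004), (1998, 2003)):
    print(first, last, leap_years(first, last), days_in_span(first, last))
```

Both windows are "6Y", yet they differ by one day in length.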

    And then we have century years: 2000 was a leap year but 1900 was not. The same "sliding window" problem gives you a volatile, undefined number of leap years in timespans like "110Y". If you want to move toward the layman's perception and count timespans in "years and months", you have to accept that this makes things easy, simple and imprecise by definition. A mismatch of one or a few days over several years is the norm and is OK.
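
The century rule and its effect on a "110Y" window can be sketched the same way, here with the standard library's `calendar.leapdays`, which counts leap years in a half-open year range. The two windows are arbitrary examples.

```python
import calendar

# Century years are leap years only when divisible by 400.
print(calendar.isleap(1900), calendar.isleap(2000))

# Two different 110-year windows: calendar.leapdays(y1, y2) covers y1 up to, but not including, y2.
window_a = calendar.leapdays(1890, 2000)  # years 1890..1999
window_b = calendar.leapdays(1896, 2006)  # years 1896..2005

print(window_a, window_b)
```

The two windows contain different numbers of leap years, so "110Y" does not correspond to a single day count.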
