Efficient algorithm for converting number of days to years (including leap years)

逝去的感伤 2021-02-05 17:04

The problem

I am writing a class for holding dates in c++, and I found the following problem:

I have a number of days N since a reference date (in

10 Answers
  • 2021-02-05 17:19

    Here are a few pointers. Note: for this exercise I will assume that N = 0 falls in a year Y where Y % 400 == 0.

    1: There is a fixed number of days in each 400-year period: (400 * 365) + 100 + 1 - 4 = 146,097.

    The +100 is for the leap years, the +1 is for the leap year every 400 years and the -4 is for not having a leap year every 100 years.
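
    As a quick sanity check of that count, here is a minimal sketch that simply adds up the days year by year. The function names are made up for illustration and are not part of the code that follows.

    ```cpp
    // Gregorian leap-year rule: every 4th year, except century years,
    // except century years divisible by 400.
    bool isLeapYear(int y) {
        return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    }

    // Brute-force count of the days in the 400 years starting at firstYear.
    int daysIn400Years(int firstYear) {
        int days = 0;
        for (int y = firstYear; y < firstYear + 400; ++y) {
            days += isLeapYear(y) ? 366 : 365;
        }
        return days;
    }
    ```

    Because the leap-year pattern repeats every 400 years, any starting year should give the same total as the formula above.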

    So your first line of code will be:

    void GetDate(int N, int &Y, int &M, int &D) {
      const int DAYS_IN_400_YEARS = (400*365)+97;
      int year = (N / DAYS_IN_400_YEARS) * 400;
      N = N % DAYS_IN_400_YEARS;
    

    2: You can make your life a great deal easier if you treat March 1st as the first day of the year, because the leap day then falls at the very end of the year.
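
    To illustrate why tip 2 helps, here is a rough sketch, assuming the day of the year is counted from March 1 (day 0). Since February comes last, one table of month offsets works for leap and non-leap years alike. The names kDaysBeforeMonth, MonthDay and monthDayFromMarchBasedDay are hypothetical.

    ```cpp
    // Days from March 1 to the first day of each month, with March at index 0.
    // February is index 11, so a leap day never shifts any other entry.
    static const int kDaysBeforeMonth[12] = {
        0, 31, 61, 92, 122, 153, 184, 214, 245, 275, 306, 337
    };

    struct MonthDay {
        int month;  // 1 = January ... 12 = December
        int day;    // 1-based day of the month
    };

    // dayOfYear: 0 = March 1 ... 364 = February 28, 365 = February 29.
    MonthDay monthDayFromMarchBasedDay(int dayOfYear) {
        int m = 11;
        // Walk back to the last month that starts on or before dayOfYear.
        while (kDaysBeforeMonth[m] > dayOfYear) {
            --m;
        }
        MonthDay result;
        result.month = (m + 2) % 12 + 1;  // index 0 (March) maps to month 3
        result.day = dayOfYear - kDaysBeforeMonth[m] + 1;
        return result;
    }
    ```

    When the result is January or February, the caller still has to add one to the March-based year to get the calendar year.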

    3: Adding to the code in (1), we can work out the year. Bear in mind that every fourth century begins with a leap year. So you can complete the calculation of the year with the following:

      const int DAYS_IN_100_YEARS = (100*365) + 24;
      // The first century of each 400-year cycle contains the year divisible by 400,
      // so it is one day longer than the other three.
      if (N > DAYS_IN_100_YEARS) {
        N = N - 1; // Skip that extra leap day so the remaining centuries line up.
        year += 100 * (N / DAYS_IN_100_YEARS);
        N = N % DAYS_IN_100_YEARS;
      }
    

    4: Now that you've sorted out the years, the rest is easy as pie (just remember (2)).

    Alternatively, you could use Boost.Date_Time (boost::gregorian::date).

  • 2021-02-05 17:22

    To use the punchline of an old joke, "I wouldn't start from here".

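    You'll want to read up about various changes to calendaring before "modern" times, for example, what happened in 1752, when Britain and its colonies switched from the Julian to the Gregorian calendar and skipped 11 days.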

  • 2021-02-05 17:23

    I made a number of failed attempts at solving Gregorian date problems over the years. I developed this code about 15 years ago, and it continues to perform well. Because I wrote versions of this code so long ago, it's in native C, but is easily compiled into C++ programs. Feel free to wrap these in a Date class, if you like.

    My code is based on combining all the leap year rules into a 400-year cycle. Under Gregorian leap year rules, every 400-year cycle has exactly 146,097 days.

    An optimization I employed is to move January and February to the end of the prior year. This makes the leap day (if present) always fall on the last day of the year. That allows me to build a table (dayOffset) which provides the distance in days from March 1. Because the leap day would fall at the end, this table is accurate for leap- and non-leap-years.

    I'll begin with the header file.

    #if !defined( TIMECODE_H_ )
    #define TIMECODE_H_ 1
    
    #if defined(__cplusplus)
    extern "C" {
    #endif
    
    int dateCode( int month, int dayOfMonth, int year );
    
    void decodeDate( int *monthPtr, int *dayOfMonthPtr, int *yearPtr, int dateCode );
    
    int dayOfWeek( int dateCode );
    
    int cardinalCode( int nth, int weekday, int month, int year );
    
    enum Weekdays { eMonday, eTuesday, eWednesday, eThursday, eFriday, eSaturday, eSunday };
    
    #if defined(__cplusplus)
    }
    #endif
    
    #endif
    

    The API consists of four methods: dateCode() calculates the date code for a Gregorian date. decodeDate() calculates the Gregorian month, day and year from a date code. dayOfWeek() returns the day of the week for a date code. cardinalCode() returns the date code for a "cardinal" day of a specific month (for example, the 2nd Wednesday of August 2014).

    Here's the implementation:

    #include <math.h>
    
    enum
    {
       nbrOfDaysPer400Years = 146097,
       nbrOfDaysPer100Years = 36524,
       nbrOfDaysPer4Years = 1461,
       nbrOfDaysPerYear = 365,
       unixEpochBeginsOnDay = 135080
    };
    
    const int dayOffset[] =
    {
       0, 31, 61, 92, 122, 153, 184, 214, 245, 275, 306, 337, 366
    };
    
    /* ------------------------------------------------------------------------------------ */
    int mod( int dividend, int divisor, int* quotientPtr )
    {
       *quotientPtr = (int)floor( (double)dividend / divisor );
       return dividend - divisor * *quotientPtr;
    }
    
    /* ------------------------------------------------------------------------------------ */
    int dateCode( int month, int dayOfMonth, int year )
    {
       int days;
       int temp;
       int bYday;
       /*
       we take the approach of starting the year on March 1 so that leap days fall
       at the end. To do this we pretend Jan. - Feb. are part of the previous year.
       */
       int bYear = year - 1600;
       bYday = dayOffset[ mod( month - 3, 12, &temp ) ] + dayOfMonth - 1;
       bYear += temp;
    
       bYear = mod( bYear, 400, &days );
       days *= nbrOfDaysPer400Years;
    
       bYear = mod( bYear, 100, &temp );
       days += nbrOfDaysPer100Years * temp;
    
       bYear = mod( bYear, 4, &temp );
       days += nbrOfDaysPer4Years * temp + nbrOfDaysPerYear * bYear + bYday -
          unixEpochBeginsOnDay;
    
       return days;
    }
    
    /* ------------------------------------------------------------------------------------ */
    int dayOfWeek( int dateCode )
    {
       int temp;
       return mod( dateCode + 3, 7, &temp );
    }
    
    /* ------------------------------------------------------------------------------------ */
    void decodeDate( int *monthPtr, int *dayOfMonthPtr, int *yearPtr, int dateCode )
    {
       int diff;
       int diff2;
       int alpha;
       int beta;
       int gamma;
       int year;
       int temp;
    
       /* dateCode has the number of days relative to 1/1/1970, shift this back to 3/1/1600 */
       dateCode += unixEpochBeginsOnDay;
       dateCode = mod( dateCode, nbrOfDaysPer400Years, &temp );
       year = 400 * temp;
       dateCode = mod( dateCode, nbrOfDaysPer100Years, &temp );
       /* put the leap day at the end of 400-year cycle */
       if ( temp == 4 )
       {
          --temp;
          dateCode += nbrOfDaysPer100Years;
       }
       year += 100 * temp;
       dateCode = mod( dateCode, nbrOfDaysPer4Years, &temp );
       year += 4 * temp;
       dateCode = mod( dateCode, nbrOfDaysPerYear, &temp );
       /* put the leap day at the end of 4-year cycle */
       if ( temp == 4 )
       {
          --temp;
          dateCode += nbrOfDaysPerYear;
       }
       year += temp;
    
       /* find the month in the table */
       alpha = 0;
       beta = 11;
       gamma = 0;
       for(;;)
       {
          gamma = ( alpha + beta ) / 2;
          diff = dayOffset[ gamma ] - dateCode;
          if ( diff < 0 )
          {
             diff2 = dayOffset[ gamma + 1 ] - dateCode;
             if ( diff2 < 0 )
             {
                alpha = gamma + 1;
             }
             else if ( diff2 == 0 )
             {
                ++gamma;
                break;
             }
             else
             {
                break;
             }
          }
          else if ( diff == 0 )
          {
             break;
          }
          else
          {
             beta = gamma;
          }
       }
    
       if ( gamma >= 10 )
       {
          ++year;
       }
       *yearPtr = year + 1600;
       *monthPtr = ( ( gamma + 2 ) % 12 ) + 1;
       *dayOfMonthPtr = dateCode - dayOffset[ gamma ] + 1;
    }
    
    /* ------------------------------------------------------------------------------------ */
    int cardinalCode( int nth, int weekday, int month, int year )
    {
       int dow1st;
       int dc = dateCode( month, 1, year );
       dow1st = dayOfWeek( dc );
       if ( weekday < dow1st )
       {
          weekday += 7;
       }
       if ( nth < 0 || nth > 4 )
       {
          nth = 4;
       }
       dc += weekday - dow1st + 7 * nth;
       if ( nth == 4 )
       {
          /* check that the fifth week is actually in the same month */
          int tMonth, tDayOfMonth, tYear;
          decodeDate( &tMonth, &tDayOfMonth, &tYear, dc );
          if ( tMonth != month )
          {
             dc -= 7;
          }
       }
       return dc;
    }
    

    One issue with efficiency that will be immediately apparent is the mod() function. As you might expect, it provides the quotient and remainder of two integer operands. C/C++ provides the modulus operator (%), which would seem to be a better choice; however, before C99 and C++11 the standards left the result for negative operands implementation-defined, and the newer standards truncate toward zero rather than rounding down, which is not what this code needs.

    There is probably a portable solution which uses efficient integer math; however, I've opted here for one that is slightly less efficient, but guaranteed correct on all platforms.
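
    For what it's worth, an integer-only variant might look like the sketch below. It assumes a positive divisor and C99/C++11 semantics, where / truncates toward zero. The name floorMod is made up here; it is not part of the code above, although it takes the same arguments as mod().

    ```cpp
    // Floored division: the quotient rounds toward negative infinity, so the
    // remainder is always in the range [0, divisor).
    // Assumes divisor > 0 and that '/' truncates toward zero (C99, C++11).
    int floorMod(int dividend, int divisor, int* quotientPtr) {
        int q = dividend / divisor;
        int r = dividend % divisor;
        if (r < 0) {
            // Truncation rounded the quotient up; step it back down by one.
            r += divisor;
            --q;
        }
        *quotientPtr = q;
        return r;
    }
    ```

    For example, floorMod(-1, 12, &q) yields 11 with q set to -1, which is what dateCode() relies on when it maps January and February into the previous March-based year.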

    A date code is simply an offset in days from a base date. I chose 1600-March-01 because it's the start of a 400-year Gregorian cycle that is early enough so that all the dates we are likely to encounter will result in a date code that is a positive integer. However, there's nothing incorrect about date codes before the base date. Since we're using a stable/portable modulo operation, all the math works well for negative date codes.

    Some don't like my non-standard base date, so I decided to adopt the standard Unix epoch, which begins 1970-January-01. I defined unixEpochBeginsOnDay to bias the date code to start on the desired date. If you want to use a different base date, you would replace this value with one you prefer.
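
    If you do pick another base date, the bias has to be recomputed. As an illustration, here is one way the value 135080 could be checked by brute force. The function name is hypothetical and the loop is only meant as a sketch, not as part of the library.

    ```cpp
    // Days from 1600-03-01 up to, but not including, 1970-01-01.
    int daysFrom1600March1To1970Jan1() {
        int days = 0;
        // Whole March-based years starting 1600-03-01 through 1968-03-01.
        // The year starting in March of y ends with February of y + 1,
        // so its length depends on whether y + 1 is a leap year.
        for (int y = 1600; y < 1969; ++y) {
            int next = y + 1;
            bool leap = (next % 4 == 0 && next % 100 != 0) || next % 400 == 0;
            days += leap ? 366 : 365;
        }
        // 1969-03-01 through 1969-12-31: March to December is 306 days.
        days += 306;
        return days;
    }
    ```

    The result should match unixEpochBeginsOnDay in the enum above.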

    Calculating a date code is as simple as passing the month, dayOfMonth and year to dateCode():

    int dc = dateCode( 2, 21, 2001 );  // returns date code for 2001-Feb-21
    

    I've written dateCode so that it will accept values that are out of range for month and dayOfMonth. You can think of month as one plus the integer number of months after January of the given year. Here are a few tests to demonstrate:

    assert(dateCode( 14, 1, 2000 ) == dateCode( 2, 1, 2001 ));
    assert(dateCode( 5, 32, 2005 ) == dateCode( 6, 1, 2005 ));
    assert(dateCode( 0,  1, 2014 ) == dateCode(12, 1, 2013 ));
    

    Calling dateCode with non-canonical month and dayOfMonth values, then converting back with decodeDate, is an effective way to canonicalize dates. For example:

    int m, d, y;
    decodeDate( &m, &d, &y, dateCode( 8, 20 + 90, 2014 ));
    printf("90 days after 2014-08-20 is %4d-%02d-%02d\n", y, m, d);
    

    The output should be:

    90 days after 2014-08-20 is 2014-11-18

    decodeDate() always produces canonical values for month and dayOfMonth.

    dayOfWeek() simply returns the modulus 7 of the dateCode, but I had to bias dateCode by 3 since 1970-January-01 was Thursday. If you prefer to start your week on a different day than Monday, then fix the Weekdays enum and change the bias as desired.

    cardinalCode() provides an interesting application of these methods. The first parameter provides the week number of the month ("nth"), and the second parameter provides the weekday. So to find the fourth Saturday in August 2007, you would:

    int m, d, y;
    decodeDate( &m, &d, &y, cardinalCode( 3, eSaturday, 8, 2007 ) );
    printf( "%d/%02d/%d\n", m, d, y );
    

    Which produces the answer:

    8/25/2007

    Note that the nth parameter, 3, in the example above specifies the fourth Saturday. I debated whether this parameter should be zero-based or one-based. For whatever reason, I settled on: 0=first, 1=second, 2=third, etc. Even the shortest months have four occurrences of every weekday. The value 4 has a special meaning. One would expect it to return the fifth occurrence of the requested weekday; however, since the month may or may not have a fifth occurrence, I decided to return the last occurrence of the month.

    For example, to display the last Monday of each month in 2015:

    int i, m, d, y;
    for (i=1; i <= 12; ++i) {
        decodeDate( &m, &d, &y, cardinalCode( 4, eMonday, i, 2015 ) );
        printf( "%d/%02d/%d\n", m, d, y );
    }
    

    One final example, illustrating one use for cardinalCode(), displaying the number of days until the next general election:

    #include <stdio.h>
    #include <time.h> /* only needed for time() and localtime() calls */
    #include "datecode.h"
    
    int main(void)
    {
       int eYear, eday, dc;
       int eY, eM, eD;
       time_t now;
       struct tm bdtm;
    
       time(&now);
       if (localtime_r(&now, &bdtm) == NULL) {
           printf("Error\n");
           return 1;
       }
       eYear = bdtm.tm_year + 1900;
       dc = dateCode(bdtm.tm_mon + 1, bdtm.tm_mday, eYear);
       if ((eYear % 2) != 0) {
           ++eYear;
       }
       for(;;) {
           /* assuming the US rule: the Tuesday after the first Monday of November */
           eday = cardinalCode(0, eMonday, 11, eYear) + 1;
           if (eday >= dc) break;
           eYear += 2;    /* move to the next election! */
       }
       decodeDate(&eM, &eD, &eY, eday);
       printf("Today is %d/%02d/%d\neday is %d/%02d/%d, %d days from today.\n",
               bdtm.tm_mon + 1, bdtm.tm_mday, bdtm.tm_year + 1900,
               eM, eD, eY, eday - dc);
    }
    
  • 2021-02-05 17:25

    To speed up the calculation of the year, you could build a lookup table:

    const int YearStartDays[] =
    {
        0,                     // 1 AD
        365,                   // 2 AD
        365 + 365,             // 3 AD
        365 + 365 + 365,       // 4 AD
        365 + 365 + 365 + 366, // 5 AD (4 was a leap year)
        /* ... */
    };
    

    You can then do a binary search in this array, which is O(log N) instead of the O(N) of your current year finding algorithm.
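
    A minimal sketch of that idea, using std::upper_bound for the binary search. The table is generated rather than typed out, and buildYearStartDays and yearFromDay are illustrative names. It assumes day 0 is January 1 of firstYear and that n falls inside the range the table covers.

    ```cpp
    #include <algorithm>
    #include <vector>

    // Entry i holds the day number of January 1 of year firstYear + i,
    // where day 0 is January 1 of firstYear.
    std::vector<int> buildYearStartDays(int firstYear, int count) {
        std::vector<int> starts;
        starts.reserve(count);
        int day = 0;
        for (int i = 0; i < count; ++i) {
            starts.push_back(day);
            int y = firstYear + i;
            bool leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
            day += leap ? 366 : 365;
        }
        return starts;
    }

    // Year containing day n. Assumes 0 <= n and that n lies within the table.
    int yearFromDay(const std::vector<int>& starts, int firstYear, int n) {
        // upper_bound finds the first start day greater than n;
        // the year we want is the entry just before it.
        std::vector<int>::const_iterator it =
            std::upper_bound(starts.begin(), starts.end(), n);
        return firstYear + static_cast<int>(it - starts.begin()) - 1;
    }
    ```

    The table only needs to be built once, for example when the calendar is initialized.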

  • 2021-02-05 17:30

    Obviously, the bottleneck is the year calculation. I would suggest the following. When you initialize the calendar, approximate the year (very roughly) by dividing the number of days by 365. After that, build a list of all leap years before this estimate. This should be rather fast since you don't need to test every year, just step 4 years at a time. While doing so, count how many leap years there are. You could even count them in larger blocks (there are 97 leap years in every 400 years), but you will need to handle the leap year exceptions carefully so as not to miscount.

    At the end of this, you will have a rough estimate of the year and the number of leap years before it. Now you can compute the number of days that precede the estimated year very easily, without iterating, and compare it with N to correct the estimate:

    leapYearCount * 366 + (lastCalculatedYear - leapYearCount) * 365
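
    Here is a rough sketch of this approach. Instead of building a list of leap years, it swaps in the closed-form count y/4 - y/100 + y/400, which is a different technique from the one described above. It assumes day 0 is January 1 of year 1 in the proleptic Gregorian calendar, and all function names are made up for illustration.

    ```cpp
    // Number of leap years in the range [1, y] under the Gregorian rule.
    int leapYearsThrough(int y) {
        return y / 4 - y / 100 + y / 400;
    }

    // Days from January 1 of year 1 to January 1 of year y (y >= 1).
    int daysBeforeYear(int y) {
        int leapYearCount = leapYearsThrough(y - 1);
        return leapYearCount * 366 + (y - 1 - leapYearCount) * 365;
    }

    // Year containing day n, where day 0 is January 1 of year 1 (n >= 0).
    int yearFromDayNumber(int n) {
        // Dividing by 365 ignores leap days, so the estimate is never too low.
        int y = n / 365 + 1;
        // Correct the overshoot; this takes only a few steps for realistic dates.
        while (daysBeforeYear(y) > n) {
            --y;
        }
        return y;
    }
    ```

    The day within the year is then n - daysBeforeYear(y).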
    
  • 2021-02-05 17:31

    Why are you reinventing dates?

    Date math is well understood. The standard C library (that's right, C, not just C++) has had date functions for many years.
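
    As one example of leaning on the standard library, the sketch below converts a day count into a calendar date with gmtime. It assumes the reference date is 1970-01-01 and that time_t counts seconds since that epoch, which POSIX guarantees but the C standard does not. The wrapper name dateFromUnixDays is hypothetical.

    ```cpp
    #include <ctime>

    // Convert days since 1970-01-01 (UTC) into year, month and day.
    // Assumes time_t is a count of seconds since the Unix epoch and that
    // n * 86400 fits in time_t. Returns false if the conversion fails.
    bool dateFromUnixDays(long n, int& year, int& month, int& day) {
        std::time_t t = static_cast<std::time_t>(n) * 86400;
        // gmtime returns a pointer to shared storage, so it is not thread-safe.
        const std::tm* utc = std::gmtime(&t);
        if (utc == nullptr) {
            return false;
        }
        year = utc->tm_year + 1900;  // tm_year counts from 1900
        month = utc->tm_mon + 1;     // tm_mon is 0-based
        day = utc->tm_mday;
        return true;
    }
    ```

    Dates before 1970 need a negative time_t, which not every platform's gmtime accepts, so this is only a starting point for non-negative day counts.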

    As others have indicated, the Boost date classes are also well designed and easy to use.

    When searching for an answer, the first question should be: is the problem already solved? This problem has been solved for many years.
