Is there a more efficient way of doing this below? I want to have the difference in years between two dates as a single scalar. Any suggestions are welcome.
Here's a spin off of what Kostyantyn posted in his "age2" function. It's slightly shorter/cleaner and uses the traditional/colloquial meaning of an "age" or difference in years as well:
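import datetime

def ageInYears(d):
    today = datetime.date.today()
    # Note: raises ValueError if d is Feb 29 and this year is not a leap year.
    currentYrAnniversary = datetime.date(today.year, d.month, d.day)
    # Subtract one year if this year's anniversary has not happened yet.
    return (today.year - d.year) - (1 if today < currentYrAnniversary else 0)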
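One caveat with building the anniversary from `d.month` and `d.day`: a Feb 29 birth date raises `ValueError` in any non-leap year. Here is a rough sketch of one way around that. The name `age_in_years`, the optional `today` parameter, and the choice to treat Mar 1 as the anniversary in non-leap years are my own assumptions, not part of the function above; some people would prefer Feb 28.

```python
import datetime

def age_in_years(born, today=None):
    """Whole years between `born` and `today` (defaults to the current date)."""
    if today is None:
        today = datetime.date.today()
    try:
        anniversary = born.replace(year=today.year)
    except ValueError:
        # born on Feb 29 and today.year is not a leap year.
        # Assumption: count the anniversary as Mar 1 in such years.
        anniversary = datetime.date(today.year, 3, 1)
    # Subtract one if this year's anniversary is still ahead of us.
    return today.year - born.year - (1 if today < anniversary else 0)
```

Passing `today` explicitly also makes the function easy to check against known dates.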
I use one of these to calculate a person's age:
import datetime

dob = datetime.date(1980, 10, 10)

def age():
    today = datetime.date.today()
    years = today.year - dob.year
    # This year's birthday has not been reached yet: one year less.
    if today.month < dob.month or (today.month == dob.month and today.day < dob.day):
        years -= 1
    return years

def age2():
    today = datetime.date.today()
    this_year_birthday = datetime.date(today.year, dob.month, dob.day)
    # Use <= so that the birthday itself counts as a completed year.
    if this_year_birthday <= today:
        years = today.year - dob.year
    else:
        years = today.year - dob.year - 1
    return years
To make sense of leap years, you are almost forced to break this into two parts: an integral number of years, and a fractional part. Both need to deal with leap years, but in different ways - the integral needs to deal with a starting date of February 29, and the fractional must deal with the differing number of days in a year. You want the fractional part to increment in equal amounts until it equals 1.0 at the next anniversary date, so it should be based on the number of days in the year after the end date.
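A minimal sketch of that two-part idea, as I read it: count whole anniversaries, then express the remainder as a fraction of the span between the last anniversary and the next one. The names `years_between` and `_anniversary` are hypothetical, the Mar 1 fallback for a Feb 29 start is an assumption, and the code assumes Python 3 (where dividing two `timedelta` objects gives a float) and `end >= start`.

```python
from datetime import datetime

def _anniversary(start, year):
    """`start` moved to `year`; a Feb 29 start falls back to Mar 1 (an assumption)."""
    try:
        return start.replace(year=year)
    except ValueError:
        return start.replace(year=year, month=3, day=1)

def years_between(start, end):
    """Whole anniversaries passed plus the fraction of the current anniversary year."""
    years = end.year - start.year
    last = _anniversary(start, start.year + years)
    if last > end:
        # The anniversary in end's calendar year has not been reached yet.
        years -= 1
        last = _anniversary(start, start.year + years)
    nxt = _anniversary(start, start.year + years + 1)
    # The fraction grows evenly from 0.0 at `last` to 1.0 at `nxt`.
    return years + (end - last) / (nxt - last)
```

With this, `years_between(datetime(2010, 1, 1), datetime(2010, 7, 2, 12))` is 182.5 days out of a 365-day span, that is 0.5.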
Do you want your date range to include 1900 or 2100? Things get a little easier if you don't.
What's the difference between 2008-02-28 and 2009-02-28? Most people would agree that it should be exactly 1.0 years. How about the difference between 2008-03-01 and 2009-03-01? Again, most people would agree that it should be exactly 1.0 years. If you choose to represent a date as a year plus a fraction of a year based on the day, it is impossible to make both of these statements true. This is the case for your original code which assumed a day was 1/365.2425 of a year, or indeed for any code which assumes a constant fraction of a year per day, even if the size of a day accounts for the years which are leap years.
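A quick check of why no constant day size can work: the two spans contain different numbers of days, so dividing both by the same constant cannot give 1.0 twice. The divisors tried below are just illustrative choices.

```python
from datetime import date

span_a = (date(2009, 2, 28) - date(2008, 2, 28)).days  # crosses Feb 29, 2008
span_b = (date(2009, 3, 1) - date(2008, 3, 1)).days    # no leap day in between

print(span_a, span_b)  # 366 365

# Whatever constant you divide by, the two results differ.
for days_per_year in (365, 365.2425, 366):
    print(days_per_year, span_a / days_per_year, span_b / days_per_year)
```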
My assertion that you needed to break this down into integral years and fractional years was an attempt to get around this problem. If you treat each of the previous conditions as an integral year, all you have to do is decide on which fraction to assign to any number of days left over. The problem with this scheme is that you still can't make sense of (date2-date1)+date3, because the fraction can't be resolved back to a day with any consistency.
Thus I am proposing yet another encoding, based on each year containing 366 days whether it is a leap year or not. The anomalies will firstly be that there can't be a date which is exactly a year (or 2 or 3) from Feb. 29 - "Sorry Johnny, you don't get a birthday this year, there's no Feb. 29" isn't always acceptable. Second is that if you try to coerce such a number back to a date, you'll have to account for non-leap years and check for the special case of Feb. 29 and convert it, probably to Mar. 1.
from datetime import datetime
from datetime import timedelta
from calendar import isleap

size_of_day = 1. / 366.
size_of_second = size_of_day / (24. * 60. * 60.)

def date_as_float(dt):
    days_from_jan1 = dt - datetime(dt.year, 1, 1)
    # In a non-leap year, skip the slot reserved for Feb 29.
    if not isleap(dt.year) and days_from_jan1.days >= 31+28:
        days_from_jan1 += timedelta(1)
    return dt.year + days_from_jan1.days * size_of_day + days_from_jan1.seconds * size_of_second

start_date = datetime(2010,4,28,12,33)
end_date = datetime(2010,5,5,23,14)
difference_in_years = date_as_float(end_date) - date_as_float(start_date)
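To see how the 366-day encoding treats the two examples from earlier, here is a small self-contained check. The function is repeated so the snippet runs on its own; the helper `diff` and the uppercase constant names are mine, added only for illustration.

```python
from calendar import isleap
from datetime import datetime, timedelta

SIZE_OF_DAY = 1.0 / 366.0
SIZE_OF_SECOND = SIZE_OF_DAY / (24.0 * 60.0 * 60.0)

def date_as_float(dt):
    # Same encoding as above: every year is treated as having 366 day slots.
    days_from_jan1 = dt - datetime(dt.year, 1, 1)
    if not isleap(dt.year) and days_from_jan1.days >= 31 + 28:
        days_from_jan1 += timedelta(1)
    return (dt.year + days_from_jan1.days * SIZE_OF_DAY
            + days_from_jan1.seconds * SIZE_OF_SECOND)

def diff(a, b):
    return date_as_float(b) - date_as_float(a)

# Both anniversaries come out as one year.
print(diff(datetime(2008, 2, 28), datetime(2009, 2, 28)))
print(diff(datetime(2008, 3, 1), datetime(2009, 3, 1)))

# The price: Feb 28 -> Mar 1 in a non-leap year spans two slots, not one day.
print(diff(datetime(2009, 2, 28), datetime(2009, 3, 1)) / SIZE_OF_DAY)
```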
I'm not suggesting that this is the solution, because I don't think a perfect solution is possible. But it has some desirable properties:
Since we're coming to the end of 2018...
from dateutil import parser
from dateutil.relativedelta import relativedelta

rip = [
    ["Tim Bergling\t\t", " 8 Sep 1989", "20 Apr 2018"],  # Avicii Swedish musician
    ["Stephen Hillenburg\t", "21 Aug 1961", "26 Nov 2018"],  # Creator of Spongebob
    ["Stephen Hawking\t\t", " 8 Jan 1942", "14 Mar 2018"],  # Theoretical physicist
    ["Stan Lee\t\t", "28 Dec 1922", "12 Nov 2018"],  # American comic book writer
    ["Stefán Karl Stefánsson\t", "10 Jul 1975", "21 Aug 2018"]  # Robbie Rotten from LazyTown
]

for name,born,died in rip:
    print("%s %s\t %s\t died at %i"%(name,born,died,relativedelta(parser.parse(died),parser.parse(born)).years))
Output:
Tim Bergling 8 Sep 1989 20 Apr 2018 died at 28
Stephen Hillenburg 21 Aug 1961 26 Nov 2018 died at 57
Stephen Hawking 8 Jan 1942 14 Mar 2018 died at 76
Stan Lee 28 Dec 1922 12 Nov 2018 died at 95
Stefán Karl Stefánsson 10 Jul 1975 21 Aug 2018 died at 43
More efficient? No, but more correct, probably. But it depends on how correct you want to be. Dates are not trivial things.
Years do not have a constant length. Do you want the difference in leap years or normal years? :-) The way you calculate it, you are always going to get a slightly incorrect answer. And how long is a day in years? You say 1/365.2425. Well, yeah, averaged over the 400-year Gregorian cycle, yeah. But otherwise not.
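A quick way to check where 365.2425 comes from, using only the standard library: count the days in 400 consecutive Gregorian years and divide. The particular range of years below is an arbitrary choice.

```python
from calendar import isleap

years = range(2001, 2401)  # any 400 consecutive Gregorian years will do
total_days = sum(366 if isleap(y) else 365 for y in years)

print(total_days, total_days / 400)  # 146097 365.2425
```

No single year is actually 365.2425 days long; each one is either 365 or 366.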
So the question doesn't really make much sense.
To be correct you have to do this:
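from datetime import datetime
from calendar import isleap

start_date = datetime(2005,4,28,12,33)
end_date = datetime(2010,5,5,23,14)

# Whole calendar years between the two dates.
diffyears = end_date.year - start_date.year
# Remainder, measured from the start date moved into the end year.
# (replace() raises ValueError for a Feb 29 start and a non-leap end year.)
difference = end_date - start_date.replace(end_date.year)
days_in_year = 366 if isleap(end_date.year) else 365
difference_in_years = diffyears + (difference.days + difference.seconds/86400.0)/days_in_year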
In this case that's a difference of 0.0012322917425568528 years, or 0.662 days, considering that this is not a leap year.
(And even then we are ignoring microseconds. Heh.)
If you mean efficient in terms of code space then no, that's about the most efficient way to do that.