Question
I've got an array of dates that can contain multiple date ranges in it.
dates = [
    '2020-01-01',
    '2020-01-02',
    '2020-01-03',
    '2020-01-06',
    '2020-01-07',
    '2020-01-08'
]
In this example, the list contains 2 separate consecutive date ranges (2020-01-01 to 2020-01-03 and 2020-01-06 to 2020-01-08).
I'm trying to figure out how to loop through this list and find all the consecutive date ranges.
One of the articles I'm looking at (How to detect if dates are consecutive in Python?) seems to have a good approach; however, I'm struggling to implement this logic in my use case.
Answer 1:
more-itertools has a function called consecutive_groups that does this for you.
Or you can view the source code and copy its approach:
from datetime import datetime
from itertools import groupby
from operator import itemgetter

def consecutive_groups(iterable, ordering=lambda x: x):
    for k, g in groupby(enumerate(iterable), key=lambda x: x[0] - ordering(x[1])):
        yield map(itemgetter(1), g)

for g in consecutive_groups(dates, lambda x: datetime.strptime(x, '%Y-%m-%d').toordinal()):
    print(list(g))
['2020-01-01', '2020-01-02', '2020-01-03']
['2020-01-06', '2020-01-07', '2020-01-08']
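To see why grouping on "index minus ordinal" works, here is a small sketch of the key values for the question's list. Inside a run of consecutive days the index and the ordinal both grow by one, so their difference stays constant; a gap changes it. This assumes the list is sorted and has no duplicates, and the names keys and relative_keys are only for illustration.

```python
from datetime import date

dates = [
    '2020-01-01', '2020-01-02', '2020-01-03',
    '2020-01-06', '2020-01-07', '2020-01-08',
]

# Grouping key: position in the list minus the date's ordinal day number.
keys = [i - date.fromisoformat(d).toordinal() for i, d in enumerate(dates)]

# Shift the keys so the first run maps to 0; only the relative values matter.
relative_keys = [k - keys[0] for k in keys]
print(relative_keys)  # [0, 0, 0, -2, -2, -2]
```

Each distinct key value corresponds to one group that groupby yields.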
Answer 2:
This assumes that single-date "ranges" are still represented by 2 dates:
from datetime import datetime

def makedate(s):
    return datetime.strptime(s, "%Y-%m-%d")

def splitIntoRanges(dates):
    ranges = []
    start_s = last_s = dates[0]
    last = makedate(start_s)
    for curr_s in dates[1:]:
        curr = makedate(curr_s)
        if (curr - last).days > 1:
            # Gap found: close the current range and start a new one.
            ranges.append((start_s, last_s))
            start_s = curr_s
        last_s = curr_s
        last = curr
    # Close the final range.
    return ranges + [(start_s, last_s)]
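As a rough usage sketch, here is the same idea run on a list that also contains an isolated date, so you can see the single-date "range" come out as a pair. The name split_into_ranges, the use of date.fromisoformat, and the empty-list guard are my own assumptions for this sketch, not part of the answer above; the input is assumed to be sorted.

```python
from datetime import date

def split_into_ranges(dates):
    """Return (start, end) string pairs for each run of consecutive days.

    Assumes `dates` is sorted ascending and holds ISO 'YYYY-MM-DD' strings.
    """
    if not dates:  # guard added for this sketch only
        return []
    ranges = []
    start_s = last_s = dates[0]
    last = date.fromisoformat(start_s)
    for curr_s in dates[1:]:
        curr = date.fromisoformat(curr_s)
        if (curr - last).days > 1:  # gap found: close the current range
            ranges.append((start_s, last_s))
            start_s = curr_s
        last_s, last = curr_s, curr
    ranges.append((start_s, last_s))  # close the final range
    return ranges

result = split_into_ranges([
    '2020-01-01', '2020-01-02', '2020-01-03',
    '2020-01-06',
    '2020-01-10', '2020-01-11',
])
print(result)
# [('2020-01-01', '2020-01-03'), ('2020-01-06', '2020-01-06'), ('2020-01-10', '2020-01-11')]
```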
Answer 3:
I took a similar, though definitely not quite as elegant, approach to @Scott's:
from datetime import datetime

ranges = []
dates = [datetime.strptime(date, '%Y-%m-%d') for date in dates]
start = dates[0]
for i in range(1, len(dates)):
    # A gap of more than one day closes the current range.
    if (dates[i] - dates[i - 1]).days > 1:
        end = dates[i - 1]
        ranges.append(f'{start.date()} to {end.date()}')
        start = dates[i]
# Close the final range after the loop, so a trailing single date is not lost.
ranges.append(f'{start.date()} to {dates[-1].date()}')
Answer 4:
I found the key to my solution in a second post and pieced it together.
There are two parts to my issue:
- How do I represent a list of dates in an effective manner?
Answer: https://stackoverflow.com/a/9589929/2150673
import datetime

pto = [
    '2020-01-03',
    '2020-01-08',
    '2020-01-02',
    '2020-01-07',
    '2020-01-01',
    '2020-01-06'
]
ordinal_dates = [datetime.datetime.strptime(i, '%Y-%m-%d').toordinal() for i in pto]
- Once you have a list of dates in integer representation, you can simply look for consecutive integers and get the upper and lower bounds of each range, and then convert back to yyyy-mm-dd format.
Answer: https://stackoverflow.com/a/48106843
def ranges(nums):
    nums = sorted(set(nums))
    gaps = [[s, e] for s, e in zip(nums, nums[1:]) if s+1 < e]
    edges = iter(nums[:1] + sum(gaps, []) + nums[-1:])
    return list(zip(edges, edges))
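To make the gaps/edges trick easier to follow, here is a small trace on plain integers. The variable names ordered, flat_edges, and pairs are just for this illustration; the logic is the same as in ranges above.

```python
nums = [3, 1, 2, 7, 6, 8, 8]

# Step 1: deduplicate and sort.
ordered = sorted(set(nums))  # [1, 2, 3, 6, 7, 8]

# Step 2: keep each neighbouring pair that is more than 1 apart.
gaps = [[s, e] for s, e in zip(ordered, ordered[1:]) if s + 1 < e]  # [[3, 6]]

# Step 3: first value + flattened gaps + last value gives the range bounds in order.
flat_edges = ordered[:1] + sum(gaps, []) + ordered[-1:]  # [1, 3, 6, 8]

# Step 4: zipping one iterator with itself pairs the bounds up two at a time.
edges = iter(flat_edges)
pairs = list(zip(edges, edges))
print(pairs)  # [(1, 3), (6, 8)]
```

Because the bounds always come in start/end pairs, a single isolated number ends up as a pair like (5, 5).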
My complete function:
import datetime

def get_date_ranges(pto_list: list) -> list:
    # Convert each 'YYYY-MM-DD' string to its ordinal day number.
    pto_dates = [datetime.datetime.strptime(i, '%Y-%m-%d').toordinal() for i in pto_list]
    nums = sorted(set(pto_dates))
    gaps = [[s, e] for s, e in zip(nums, nums[1:]) if s + 1 < e]
    edges = iter(nums[:1] + sum(gaps, []) + nums[-1:])
    ordinal_ranges = list(zip(edges, edges))
    # Convert the ordinal bounds back to 'YYYY-MM-DD' strings.
    date_bounds = []
    for start, end in ordinal_ranges:
        date_bounds.append((
            datetime.datetime.fromordinal(start).strftime('%Y-%m-%d'),
            datetime.datetime.fromordinal(end).strftime('%Y-%m-%d')
        ))
    return date_bounds
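The function above leans on the fact that ordinals are plain day counts, so "next day" is always "plus one", even across a month boundary. A quick sketch of that round trip, using date rather than datetime purely for brevity (that swap is my choice, not something from the linked answers):

```python
from datetime import date

last_of_jan = date.fromisoformat('2020-01-31')
n = last_of_jan.toordinal()

# Adding 1 to the ordinal moves to the next calendar day, rolling over the month.
next_day = date.fromordinal(n + 1).isoformat()
print(next_day)  # 2020-02-01

# Converting to an ordinal and back returns the original date.
round_trip = date.fromordinal(n).isoformat()
print(round_trip)  # 2020-01-31
```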
Answer 5:
You can find all the consecutive date ranges, append them to a list of lists, and access your ranges by index, but I prefer using keys in a dictionary for readability.
Here is how (please read the comments):
from datetime import datetime

dates = [datetime.strptime(d, "%Y-%m-%d") for d in dates]  # parse each string into a datetime
date_ints = [d.toordinal() for d in dates]  # toordinal() returns the day number, where 0001-01-01 is day 1
ranges = {}; arange = []; prev = 0; index = 0; j = 1
for i in date_ints:  # iterate through the date integers
    if i - 1 == prev:  # this date directly follows the previous one
        arange.append(dates[index].strftime("%Y-%m-%d"))
    elif prev == 0:  # first date: append it, since 'prev' has not been updated yet
        arange.append(dates[index].strftime("%Y-%m-%d"))
    else:
        ranges.update({f'Range{j}': tuple(arange)})  # integers are no longer in sequence: store the finished range
        arange = []; j += 1  # clear 'arange' and start appending to a new range
        arange.append(dates[index].strftime("%Y-%m-%d"))
    index += 1; prev = i
ranges.update({f'Range{j}': tuple(arange)})  # store the last range
print(ranges)
print(ranges['Range1'])  # access a range by its key
print(ranges['Range2'])
Output:
{'Range1': ('2020-01-01', '2020-01-02', '2020-01-03'), 'Range2': ('2020-01-06', '2020-01-07', '2020-01-08')}
('2020-01-01', '2020-01-02', '2020-01-03')
('2020-01-06', '2020-01-07', '2020-01-08')
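For comparison, here is a minimal sketch of the list-of-lists variant mentioned at the top of this answer, where ranges are accessed by index instead of by key. The function name group_consecutive is hypothetical, and the sketch assumes the input is sorted and free of duplicates.

```python
from datetime import date

def group_consecutive(dates):
    """Group sorted ISO date strings into lists of consecutive days."""
    groups = []
    prev = None
    for s in dates:
        current = date.fromisoformat(s)
        if prev is not None and (current - prev).days == 1:
            groups[-1].append(s)  # extends the current run
        else:
            groups.append([s])  # first date, or a gap: start a new run
        prev = current
    return groups

groups = group_consecutive([
    '2020-01-01', '2020-01-02', '2020-01-03',
    '2020-01-06', '2020-01-07', '2020-01-08',
])
print(groups[0])  # ['2020-01-01', '2020-01-02', '2020-01-03']
print(groups[1])  # ['2020-01-06', '2020-01-07', '2020-01-08']
```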
Source: https://stackoverflow.com/questions/59774541/split-a-list-of-dates-into-subsets-of-consecutive-dates