What is an efficient way to get the max concurrency in a list of tuples?

问题

I have been trying to solve this problem in an efficient way. The problem is:

Problem Statement

Given a list of tuples in the form [(start1, end1), (start2, end2), (start3, end3)....(startn, endn)] where start and end are positive integers. Each tuple is to represent a time window, for example: [(1, 3), (73, 80)...]. Find the time (integer) where max concurrency occurs and get the tuples where max concurrency occurs.

Constraints:

start and end are integers of time and are between 0 to n
For all cases start < end
start is inclusive but end is exclusive
For the time (integer) where maximum concurrency occurs, we can get only one if there are multiple cases

For example the schedule below will have max_concurrency at time 2 and the tuples are (0,3), (2,3), (1, 200) that have it.

schedule = [
            (0, 3),
            (3, 5),
            (2, 3),
            (6, 8),
            (10, 12),
            (73, 92),
            (1, 200),
            ]

My Code

For time at where maximum concurrency occurs. Correct me if I'm wrong but I think this runs in O(n^2) time.

from collections import defaultdict

schedule_dict = defaultdict(lambda: 0)

for start, end in schedule:
    for time in range(start, end):
            schedule_dict[time] += 1

max_concurrency = max(schedule_dict, key=schedule_dict.get)
print(f"Time where max concurrency happens is : {max_concurrency}")

Output

Time where max concurrency happens is : 2

For the sessions where maximum concurrency occurs, I think this runs in O(n) time.

My Code

for start, end in schedule:
    if start <= max_concurrency < end:
        print(f"{(start, end)}")

Output

(0, 3)
(2, 3)
(1, 200)

Finally My Question

What is a more efficient way to do this to reduce time and space complexity?

回答1:

The number of intervals overlapping any time instant T is the number of interval start times less than or equal to T, minus the number of interval end times less than or equal to T.

Put the start times and end times in separate lists, and sort them.
Initialize a depth counter to 0
Walk through the lists in order (like merge sort), adding 1 for each start time, and subtracting 1 for each end time
Remember when the counter reaches a maximum -- that's the time of maximum overlap.

Here's an implementation in python:

schedule = [
  (0, 3),  (3, 5), (2, 3), (6, 8),
  (10, 12), (73, 92), (1, 200),
  ]

starts = [x[0] for x in schedule]
ends = [x[1] for x in schedule]

starts.sort()
ends.sort()

endpos = 0
depth = 0
maxdepth = 0
maxdepthtime = -1

for time in starts:
    depth+=1
    while endpos < len(ends) and ends[endpos]<= time:
        depth -= 1
        endpos += 1
    if depth > maxdepth:
        maxdepth = depth
        maxdepthtime = time

overlappers = [x for x in schedule
    if (x[0] <= maxdepthtime and x[1] > maxdepthtime)]

print ("Max overlap at time: ", maxdepthtime, " depth ", maxdepth)
print ("Intervals: ", overlappers)

来源：https://stackoverflow.com/questions/60127612/what-is-an-efficient-way-to-get-the-max-concurrency-in-a-list-of-tuples

标签

algorithm

language-agnostic

time-complexity

schedule

space-complexity