Find the longest arithmetic progression inside a sequence

一笑奈何 submitted on 2021-02-10 05:07:37


Suppose I have a sequence of increasing numbers, and I want to find the length of the longest arithmetic progression within the sequence. An arithmetic progression here means an increasing subsequence with a common difference, such as [2, 4, 6, 8] or [3, 6, 9, 12].

For example, for [5, 10, 14, 15, 17], [5, 10, 15] is the longest arithmetic progression, with length 3;

for [10, 12, 13, 20, 22, 23, 30], [10, 20, 30] is the longest arithmetic progression with length 3;

for [7, 10, 12, 13, 15, 20, 21], [10, 15, 20] or [7, 10, 13] are the longest arithmetic progressions with length 3.

This site offers some insight into the problem: fix a middle element j and consider every triple of elements around it. I want to use this algorithm in Python, and my code is as follows:

def length_of_AP(L):
    n = len(L)
    Table = [[0 for _ in range(n)] for _ in range(n)]
    length_of_AP = 2

    # initialise the last column of the table as all i and (n-1) pairs have length 2
    for i in range(n):
        Table[i][n-1] = 2

    # loop around the list and find i, k such that L[i] + L[k] = 2 * L[j]
    for j in range(n - 2, 0, -1):
        i = j - 1
        k = j + 1
        while i >= 0 and k < n:
            difference = (L[i] + L[k]) - 2 * L[j]
            if difference < 0:
                k = k + 1
            else:
                if difference > 0:
                    i = i - 1
                else:
                    Table[i][j] = Table[j][k] + 1
                    length_of_AP = max(length_of_AP, Table[i][j])
                    k = k + 1
                    i = i - 1
    return length_of_AP

This function works fine with [1, 3, 4, 5, 7, 8, 9], but it doesn't work for [5, 10, 14, 15, 20, 25, 26, 27, 28, 30, 31], where I should get 6 but get 4 instead. I suspect that the run 25, 26, 27, 28 inside the list is throwing my function off. How do I change my function so that it gives the desired result?

Any help would be appreciated.


Following your link and running your second sample, it looks like the code actually finds the correct longest arithmetic progression (LAP)

5, 10, 15, 20, 25, 30,

but fails to find the correct length. I didn't spend too much time analyzing the code, but the piece

    // Any 2-letter series is an AP
    // Here we initialize only for the last column of lookup because
    // all i and (n-1) pairs form an AP of size 2  
    for (int i=0; i<n; i++)
        lookup[i][n-1] = 2;

looks suspicious to me. It seems that you need to initialize the whole lookup table with 2 instead of just the last column, because every (i, j) pair already forms a progression of length 2. If I do so, it starts to get the correct length on your sample as well.

So get rid of the "initialise" loop and change your 3rd line to the following code:

# initialise whole table with 2 as all (i, j) pairs have length 2    
Table = [[2 for _ in range(n)] for _ in range(n)]
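
Put together, the change might look like the sketch below. It is only my reading of the function from the question with the table pre-filled with 2; the name length_of_ap, the snake_case variables and the early return for inputs shorter than three elements are my own additions, and it assumes the input is strictly increasing.

```python
def length_of_ap(seq):
    """Length of the longest arithmetic progression in a strictly increasing list."""
    n = len(seq)
    if n <= 2:
        return n  # assumption: zero, one or two numbers count as a progression
    # Every pair (i, j) with i < j is already a progression of length 2.
    table = [[2] * n for _ in range(n)]
    best = 2
    # Treat seq[j] as the middle term and look for seq[i] + seq[k] == 2 * seq[j].
    for j in range(n - 2, 0, -1):
        i, k = j - 1, j + 1
        while i >= 0 and k < n:
            difference = seq[i] + seq[k] - 2 * seq[j]
            if difference < 0:
                k += 1  # sum too small: move the right pointer up
            elif difference > 0:
                i -= 1  # sum too large: move the left pointer down
            else:
                # seq[i], seq[j], seq[k] are evenly spaced: extend the run from (j, k)
                table[i][j] = table[j][k] + 1
                best = max(best, table[i][j])
                i -= 1
                k += 1
    return best


print(length_of_ap([5, 10, 14, 15, 20, 25, 26, 27, 28, 30, 31]))
```

With the whole table starting at 2, a lookup of table[j][k] for a pair that never got extended no longer reads a 0, which is what was shortening the count in the question's version.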

Moreover, their

Sample Execution:
Max AP length = 6
3, 5, 7, 9, 11, 13, 15, 17,

contains this bug as well and prints the correct sequence only by sheer luck. If I modify the sortedArr to

int sortedArr[] = new int[] {3, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17, 18,  112, 113, 114, 115, 116, 117, 118};

I get the following output:

Max AP length = 7
112, 113, 114, 115, 116, 117, 118,

which is obviously wrong, as the original 8-item sequence 3, 5, 7, 9, 11, 13, 15, 17 is still there.
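
A claim like this can be checked independently of the linked code. The sketch below is not the site's algorithm: it is a dictionary-based dynamic programme that I am swapping in for the purpose, and the name longest_ap is hypothetical. It assumes a strictly increasing list and returns the progression itself rather than only its length.

```python
def longest_ap(values):
    """Return one longest arithmetic progression from a strictly increasing list."""
    n = len(values)
    if n <= 2:
        return list(values)
    # best[j][d] = length of the longest progression ending at index j with step d
    best = [dict() for _ in range(n)]
    top_len, top_end, top_diff = 2, 1, values[1] - values[0]
    for j in range(n):
        for i in range(j):
            d = values[j] - values[i]
            # extend the run ending at i with the same step, or start a new pair
            length = best[i].get(d, 1) + 1
            best[j][d] = length
            if length > top_len:
                top_len, top_end, top_diff = length, j, d
    # rebuild the progression backwards from its last element
    last = values[top_end]
    return [last - top_diff * step for step in range(top_len - 1, -1, -1)]


sorted_arr = [3, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17, 18,
              112, 113, 114, 115, 116, 117, 118]
print(longest_ap(sorted_arr))  # [3, 5, 7, 9, 11, 13, 15, 17]
```

This uses O(n^2) time and memory like the table approach, so it is meant as a cross-check on small inputs rather than a replacement.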


Did you try it?
Here's a quick brute-force implementation; for small datasets it should run fast enough:

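import itertools as it

def gen(seq):
    # every pair (a, b) with a < b gives a candidate start a and step d = b - a
    diff = ((b-a, a) for a, b in it.combinations(sorted(seq), 2))
    for d, n in diff:
        k = []
        # walk forward from the start for as long as the next term is present
        while n in seq:
            k.append(n)
            n += d
        yield (d, k)

def arith(seq):
    # return (difference, progression) for the longest progression found
    return max(gen(seq), key=lambda x: len(x[1]))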

In [1]: arith([7, 10, 12, 13, 15, 20, 21])
Out[1]: (3, [7, 10, 13])
In [2]: %timeit arith([7, 10, 12, 13, 15, 20, 21])
10000 loops, best of 3: 23.6 µs per loop
In [3]: seq = {random.randrange(1000) for _ in range(100)}
In [4]: arith(seq)
Out[4]: (171, [229, 400, 571, 742, 913])
In [5]: %timeit arith(seq)
100 loops, best of 3: 3.79 ms per loop
In [6]: seq = {random.randrange(1000000) for _ in range(1000)}
In [7]: arith(seq)
Out[7]: (81261, [821349, 902610, 983871])
In [8]: %timeit arith(seq)
1 loop, best of 3: 434 ms per loop
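
One detail worth noting about the timings: In [1] passes a list, so each `n in seq` test scans the list, whereas In [3] and In [6] pass sets, where membership is a hash lookup. A variant that builds the set once, whatever the input type, might look like the sketch below. The name arith_set is hypothetical, and returning (0, []) for fewer than two distinct values is my own choice, not something from the code above.

```python
import itertools as it


def arith_set(seq):
    """Brute-force search as above, but with membership tested against a set."""
    members = set(seq)
    best = (0, [])  # assumption: result when fewer than two distinct values exist
    for a, b in it.combinations(sorted(members), 2):
        d = b - a
        run = []
        n = a
        # walk forward from a in steps of d while the terms are present
        while n in members:
            run.append(n)
            n += d
        # keep the first longest run, as max() does in arith()
        if len(run) > len(best[1]):
            best = (d, run)
    return best


print(arith_set([7, 10, 12, 13, 15, 20, 21]))  # (3, [7, 10, 13])
```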

