Recurrence with numpy

Asked by 广开言路, 2020-12-20 03:14

Is there any way to compute a recurrence without using for loops in numpy?

Using np.add with the out keyword does the trick with dtype="int".

4 Answers
  • 2020-12-20 03:32

    Since linear recurrences have closed-form solutions (Binet's formula, in the Fibonacci case), another fast method is:

    import math
    import numpy as np

    def fib_scipy2(N):
        r5 = math.sqrt(5)
        phi = (1 + r5) / 2                 # golden ratio
        a = (-phi * np.ones(N)).cumprod()  # a[k-1] = (-phi)**k for k = 1..N
        # Binet's formula: F_k = (phi**k - psi**k)/sqrt(5) with psi = -1/phi,
        # so phi**k = abs(a) and psi**k = 1/a
        return (np.abs(a) - 1 / a) / r5
    

    Runs:

    In [413]: fib_scipy2(8)
    Out[413]: array([  1.,   1.,   2.,   3.,   5.,   8.,  13.,  21.])
    In [414]: %timeit fib_scipy2(10**4)
    103 µs ± 888 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    

    You can adapt it to any linear recurrence, as sketched below.
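
    As a sketch of that generalization (my own addition, not from the original answer; the helper name linrec is made up), any second-order linear recurrence x[n] = c1*x[n-1] + c2*x[n-2] is a combination of powers of the roots of its characteristic polynomial:

    import numpy as np

    def linrec(c1, c2, x0, x1, N):
        # roots of the characteristic polynomial t**2 - c1*t - c2
        r1, r2 = np.roots([1, -c1, -c2])
        # fix the coefficients from the first two terms:
        # A + B = x0,  A*r1 + B*r2 = x1
        A, B = np.linalg.solve([[1, 1], [r1, r2]], [x0, x1])
        n = np.arange(N)
        return (A * r1**n + B * r2**n).real  # imaginary parts cancel for real recurrences

    # Fibonacci: c1 = c2 = 1, starting with 1, 1
    print(np.round(linrec(1, 1, 1, 1, 8)))  # [ 1.  1.  2.  3.  5.  8. 13. 21.]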

  • 2020-12-20 03:34

    Here is a reasonably efficient method using a scipy filter:

    import numpy as np
    from scipy import signal

    def scipy_fib(n):
        x = np.zeros(n, dtype=np.uint8)  # unit impulse as input
        x[0] = 1

        # transfer function 1/(1 - z**-1 - z**-2), i.e. y[n] = x[n] + y[n-1] + y[n-2],
        # converted to second-order sections form
        sos = signal.tf2sos([1], [1, -1, -1])
        return signal.sosfilt(sos, x)
    

    (Note that we can't use lfilter directly: in signal-processing terms this is an unstable filter, which made the Python interpreter crash, so we have to convert to second-order sections form.)
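
    As a quick sanity check (my addition, not part of the original answer), the filter's impulse response reproduces the exact sequence for small n:

    print(scipy_fib(10))
    # [ 1.  1.  2.  3.  5.  8. 13. 21. 34. 55.]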

    The problem with the filtering approach is that I'm not sure if it can be generalized to the actual problem of solving an ODE.

    Another solution is to simply write the loop in Python and accelerate it with a JIT compiler:

    import numpy as np
    import numba as nb

    @nb.jit(nopython=True)
    def numba_fib(n):
        y = np.empty(n)
        y[:2] = 1

        # plain Python loop, compiled to machine code by numba
        for i in range(2, n):
            y[i] = y[i-1] + y[i-2]

        return y
    

    The advantage of the JIT approach is that you can implement all kinds of fancy stuff, but it works best if you stick to simple loops and branches without calling any (non-JITted) Python functions.

    Timing results:

    # Baseline without JIT:
    %timeit numba_fib(10000)  # 100 loops, best of 3: 5.47 ms per loop
    
    %timeit scipy_fib(10000)  # 1000 loops, best of 3: 719 µs per loop
    %timeit numba_fib(10000)  # 10000 loops, best of 3: 33.8 µs per loop
    
    # For fun, this is how Paul Panzer's solve_banded approach compares:
    %timeit banded_fib(10000)  # 1000 loops, best of 3: 1.33 ms per loop
    

    The scipy filter approach is faster than the pure Python loop but slower than the JITted loop. I guess the filter function is relatively generic and does stuff we don't need here, while the JITted loop compiles down to a very small loop.

  • 2020-12-20 03:45

    One maybe not super-efficient, but fun, solution would be to abuse scipy.linalg.solve_banded like so:

    import numpy as np
    from scipy import linalg

    N = 50
    # banded storage: row 0 is the main diagonal (all 1), rows 1 and 2 are
    # the first and second sub-diagonals (all -1)
    a = np.zeros((N,)) + np.array([[1, -1, -1]]).T
    b = np.zeros((N,))
    b[0] = 1
    linalg.solve_banded((2, 0), a, b)  # (2 sub-diagonals, 0 super-diagonals)
    # array([  1.00000000e+00,   1.00000000e+00,   2.00000000e+00,
    #          3.00000000e+00,   5.00000000e+00,   8.00000000e+00,
    #          1.30000000e+01,   2.10000000e+01,   3.40000000e+01,
    #          5.50000000e+01,   8.90000000e+01,   1.44000000e+02,
    #          2.33000000e+02,   3.77000000e+02,   6.10000000e+02,
    #          9.87000000e+02,   1.59700000e+03,   2.58400000e+03,
    #          4.18100000e+03,   6.76500000e+03,   1.09460000e+04,
    #          1.77110000e+04,   2.86570000e+04,   4.63680000e+04,
    #          7.50250000e+04,   1.21393000e+05,   1.96418000e+05,
    #          3.17811000e+05,   5.14229000e+05,   8.32040000e+05,
    #          1.34626900e+06,   2.17830900e+06,   3.52457800e+06,
    #          5.70288700e+06,   9.22746500e+06,   1.49303520e+07,
    #          2.41578170e+07,   3.90881690e+07,   6.32459860e+07,
    #          1.02334155e+08,   1.65580141e+08,   2.67914296e+08,
    #          4.33494437e+08,   7.01408733e+08,   1.13490317e+09,
    #          1.83631190e+09,   2.97121507e+09,   4.80752698e+09,
    #          7.77874205e+09,   1.25862690e+10])
    

    How this works is that we write F_0, F_1, F_2, ... as one vector F and the identities -F_{i-1} - F_i + F_{i+1} = 0 as a matrix which is nicely banded, and then solve for F. Note that this can be adapted to other similar recurrences.
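
    For instance (my own extension, not from the original answer), a third-order recurrence such as Tribonacci, T_{i+1} = T_i + T_{i-1} + T_{i-2}, just needs one more sub-diagonal, i.e. bands (3, 0):

    import numpy as np
    from scipy import linalg

    N = 10
    # main diagonal of 1, then three sub-diagonals of -1
    ab = np.zeros((4, N)) + np.array([[1, -1, -1, -1]]).T
    b = np.zeros(N)
    b[0] = 1
    print(linalg.solve_banded((3, 0), ab, b))
    # [  1.   1.   2.   4.   7.  13.  24.  44.  81. 149.]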

  • 2020-12-20 03:50

    Your add works for this calculation, but it has to be applied repeatedly so that the nonzero values propagate. I don't see how your calculation generated [ 1. 1. 2. 1. 3. 1. 4. 1. 5. 1.].

    In [619]: fib=np.zeros(10,int);fib[:2]=1
    In [620]: for _ in range(10):
         ...:     np.add(fib[:-2], fib[1:-1], out=fib[2:])
         ...:     print(fib)
         ...: 
    [1 1 2 1 0 0 0 0 0 0]   # **
    [1 1 2 3 3 1 0 0 0 0]
    [1 1 2 3 5 6 4 1 0 0]
    [ 1  1  2  3  5  8 11 10  5  1]
    [ 1  1  2  3  5  8 13 19 21 15]
     ...
    

    (Edit: note that the first np.add acts as though it is fully buffered. Compare the result at ** with both of your object and float arrays. I'm using numpy 1.13.1 on Python 3.)

    Or, to highlight the good values at each stage:

    In [623]: fib=np.zeros(20,int);fib[:2]=1
    In [624]: for i in range(10):
         ...:     np.add(fib[:-2], fib[1:-1], out=fib[2:])
         ...:     print(fib[:(i+2)])
         ...: 
    [1 1]
    [1 1 2]
    [1 1 2 3]
    [1 1 2 3 5]
    [1 1 2 3 5 8]
    [ 1  1  2  3  5  8 13]
    [ 1  1  2  3  5  8 13 21]
    [ 1  1  2  3  5  8 13 21 34]
    [ 1  1  2  3  5  8 13 21 34 55]
    [ 1  1  2  3  5  8 13 21 34 55 89]
    

    fib[2:] = fib[:-2]+fib[1:-1] does the same thing.

    As discussed in the documentation for ufunc.at, operations like np.add use buffering even with the out parameter. So while they do iterate in C-level code, they don't accumulate; that is, results from the ith step aren't used at the (i+1)th step.

    add.at can be used to perform an unbuffered a[i] += b. That's handy when the indexes contain duplicates, but it doesn't allow for feedback from the changed a values to b, so it isn't useful here.
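
    A small illustration of the buffering difference (my example, not from the original answer):

    import numpy as np

    a = np.zeros(3, int)
    idx = np.array([0, 0, 1])
    a[idx] += 1           # buffered: the duplicated index 0 only counts once
    print(a)              # [1 1 0]

    a = np.zeros(3, int)
    np.add.at(a, idx, 1)  # unbuffered: each occurrence accumulates
    print(a)              # [2 1 0]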

    There is also add.accumulate (aka np.cumsum), which can be handy for certain iterative definitions, but it's hard to apply in general cases.
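
    For instance (my example), the recurrence y[i] = y[i-1] + x[i] with y[-1] = 0 is exactly a cumulative sum:

    import numpy as np

    x = np.arange(1, 6)
    print(np.add.accumulate(x))  # [ 1  3  6 10 15], same as np.cumsum(x)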

    cumprod (i.e. multiply.accumulate) can work with the qmatrix approach.

    (Related: "Numpy's matrix_power function giving wrong results for large exponents".)

    Use np.matrix to define the qmatrix, so that * means matrix product (as opposed to elementwise multiplication):

    In [647]: qmatrix = np.matrix([[1, 1], [1, 0]])
    

    Populate an object array with copies of this matrix (actually pointers to the same matrix):

    In [648]: M = np.empty(10, object)
    In [649]: M[:] = [qmatrix for _ in range(10)]
    

    Apply cumprod, and extract one element from each matrix.

    In [650]: m1=np.cumprod(M)
    In [651]: m1
    Out[651]: 
    array([matrix([[1, 1],
            [1, 0]]), matrix([[2, 1],
            [1, 1]]),
           matrix([[3, 2],
            [2, 1]]), matrix([[5, 3],
            [3, 2]]),
           matrix([[8, 5],
            [5, 3]]),
           matrix([[13,  8],
            [ 8,  5]]),
           matrix([[21, 13],
            [13,  8]]),
           matrix([[34, 21],
            [21, 13]]),
           matrix([[55, 34],
            [34, 21]]),
           matrix([[89, 55],
            [55, 34]])], dtype=object)
    In [652]: [x[0,1] for x in m1]
    Out[652]: [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
    

    The linked answer uses numpy.linalg.matrix_power, which also does repeated matrix products. For a single power, matrix_power is faster, but for a whole range of powers the cumprod is faster.
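
    For comparison (my addition), a single power via matrix_power uses O(log n) matrix products instead of n - 1:

    import numpy as np

    q = np.array([[1, 1], [1, 0]])
    # int64 overflows past F_92, which is the point of the linked question
    print(np.linalg.matrix_power(q, 10)[0, 1])  # 55 == F_10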
