Question
I'm trying to evaluate a polynomial (3rd degree) using numpy, and I found that doing it with simpler plain-Python code is much more efficient.
import numpy as np
import timeit
m = [3,7,1,2]
f = lambda m,x: m[0]*x**3 + m[1]*x**2 + m[2]*x + m[3]
np_poly = np.poly1d(m)
np_polyval = lambda m,x: np.polyval(m,x)
np_pow = lambda m,x: np.power(x,[3,2,1,0]).dot(m)
print 'result={}, timeit={}'.format(f(m,12),timeit.Timer('f(m,12)', 'from __main__ import f,m').timeit(10000))
result=6206, timeit=0.0036780834198
print 'result={}, timeit={}'.format(np_poly(12),timeit.Timer('np_poly(12)', 'from __main__ import np_poly').timeit(10000))
result=6206, timeit=0.180546045303
print 'result={}, timeit={}'.format(np_polyval(m,12),timeit.Timer('np_polyval(m,12)', 'from __main__ import np_polyval,m').timeit(10000))
result=6206, timeit=0.227771043777
print 'result={}, timeit={}'.format(np_pow(m,12),timeit.Timer('np_pow(m,12)', 'from __main__ import np_pow,m').timeit(10000))
result=6206, timeit=0.168987989426
Did I miss something?
Is there another way in numpy to evaluate a polynomial?
Answer 1:
Well, looking at the implementation of polyval (which is the function ultimately called when you evaluate a poly1d), it seems odd that the implementer decided to include an explicit Python loop... From the source of numpy 1.6.2:
def polyval(p, x):
    p = NX.asarray(p)
    if isinstance(x, poly1d):
        y = 0
    else:
        x = NX.asarray(x)
        y = NX.zeros_like(x)
    for i in range(len(p)):
        y = x * y + p[i]
    return y
On one hand, avoiding the power operation should be advantageous speed-wise; on the other hand, the Python-level loop pretty much screws things up.
Here's an alternative numpy-ish implementation:
POW = np.arange(100)[::-1]  # descending exponents: 99, 98, ..., 0

def g(m, x):
    # keep only the last m.size exponents (m.size-1 down to 0) so shapes match
    return np.dot(m, x ** POW[-m.size:])
For speed, I avoid recreating the power array on each call. Also, to be fair when benchmarking against numpy, you should start with numpy arrays rather than lists, to avoid the penalty of converting the list to a numpy array on each call. So, after adding m = np.array(m), my g above runs only about 50% slower than your f.
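As a quick check, here is a self-contained version of the power-based approach (a sketch; note that the slice must take the last m.size exponents so the shapes line up with the coefficients):

```python
import numpy as np

POW = np.arange(100)[::-1]  # descending exponents: 99, 98, ..., 0

def g(m, x):
    # keep only the last m.size exponents (m.size-1 down to 0)
    return np.dot(m, x ** POW[-m.size:])

m = np.array([3, 7, 1, 2])
print(g(m, 12))  # 6206
```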
Despite being slower on the example you posted: for evaluating a low-degree polynomial at a scalar x, you really can't do much better than an explicit implementation like your f (of course you can, but probably not by much without resorting to lower-level code). However, for higher degrees (where you have to replace the explicit expression with some sort of loop), the numpy approach (e.g. g) proves much faster as the degree increases, and also for vectorized evaluation, i.e. when x is a vector.
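To illustrate the vectorized case, np.polyval accepts an array of points directly, so one call evaluates the polynomial everywhere (a quick sketch):

```python
import numpy as np

m = [3, 7, 1, 2]
xs = np.array([0, 1, 12])
# evaluates 3x^3 + 7x^2 + x + 2 at each point: 2, 13, 6206
ys = np.polyval(m, xs)
print(ys)
```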
Answer 2:
Something like 23 years ago I checked out a copy of Press et al.'s Numerical Recipes in C from the university library. There was a lot of cool stuff in that book, but one passage (page 173) has stuck with me over the years:
We assume that you know enough never to evaluate a polynomial this way:
p=c[0]+c[1]*x+c[2]*x*x+c[3]*x*x*x+c[4]*x*x*x*x;
or (even worse!),
p=c[0]+c[1]*x+c[2]*pow(x,2.0)+c[3]*pow(x,3.0)+c[4]*pow(x,4.0);
Come the (computer) revolution, all persons found guilty of such criminal behavior will be summarily executed, and their programs won't be! It is a matter of taste, however, whether to write
p = c[0]+x*(c[1]+x*(c[2]+x*(c[3]+x*c[4])));
or
p = (((c[4]*x+c[3])*x+c[2])*x+c[1])*x+c[0];
So if you are really worried about performance, you want to try that; the differences will be huge for higher-degree polynomials:
In [24]: fast_f = lambda m, x: m[3] + x*(m[2] + x*(m[1] + x*m[0]))
In [25]: %timeit f(m, 12)
1000000 loops, best of 3: 478 ns per loop
In [26]: %timeit fast_f(m, 12)
1000000 loops, best of 3: 374 ns per loop
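For arbitrary degree, the nested form generalizes to a simple loop over the coefficients (plain-Python Horner's method, a sketch):

```python
def horner(coeffs, x):
    # coeffs in descending order: coeffs[0]*x**n + ... + coeffs[-1]
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc

print(horner([3, 7, 1, 2], 12))  # 6206
```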
If you want to stick with numpy, there is a newer polynomial class that runs 2x faster than poly1d on my system, but is still much slower than the plain lambdas above:
In [27]: np_fast_poly = np.polynomial.polynomial.Polynomial(m[::-1])
In [28]: %timeit np_poly(12)
100000 loops, best of 3: 15.4 us per loop
In [29]: %timeit np_fast_poly(12)
100000 loops, best of 3: 8.01 us per loop
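Note that np.polynomial.polynomial.Polynomial expects its coefficients in ascending order, which is why the construction above reverses m; a minimal check:

```python
import numpy as np

# ascending order: 2 + 1*x + 7*x^2 + 3*x^3
p = np.polynomial.polynomial.Polynomial([2, 1, 7, 3])
print(p(12))  # 6206.0
```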
Source: https://stackoverflow.com/questions/24065904/numpy-calculate-polynom-efficiently