问题
I have a symbolic array that can be expressed as:
from sympy import lambdify, Matrix
g_sympy = Matrix([[ x, 2*x, 3*x, 4*x, 5*x, 6*x, 7*x, 8*x, 9*x, 10*x],
[x**2, x**3, x**4, x**5, x**6, x**7, x**8, x**9, x**10, x**11]])
g = lambdify( (x), g_sympy )
So that for each x
I get a different matrix:
g(1.) # matrix([[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
# [ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])
g(2.) # matrix([[ 2.00e+00, 4.00e+00, 6.00e+00, 8.00e+00, 1.00e+01, 1.20e+01, 1.40e+01, 1.60e+01, 1.80e+01, 2.00e+01],
# [ 4.00e+00, 8.00e+00, 1.60e+01, 3.20e+01, 6.40e+01, 1.28e+02, 2.56e+02, 5.12e+02, 1.02e+03, 2.05e+03]])
and so on...
I need to numerically integrate g
over x
, say from 0. to 100.
(in the real case the integral does not have an exact solution) and in my current approach I have to lambdify
each element in g
and integrate it individually. I am using quad
to do an element-wise integration like:
ans = np.zeros( g_sympy.shape )
for (i,j), func_sympy in ndenumerate(g_sympy):
func = lambdify( (x), func_sympy)
ans[i,j] = quad( func, 0., 100. )
There are two problems here: 1) lambdify used many times and 2) for loop; and I believe the first one is the bottleneck, because the g_sympy
matrix has at most 10000 terms (which is not a big deal to a for loop).
As shown above lambdify
allows the evaluation of the whole matrix, so I thought: "Is there a way to integrate the whole matrix?"
scipy.integrate.quadrature
has a parameter vec_func
which gave me hope. I was expecting something like:
g_int = quadrature( g, x1, x2 )
to get the fully integrated matrix, but it gives the ValueError:
matrix must be 2-dimensional
EDIT: What I am trying to do can apparently be done in Matlab using quadv
and has already been discussed for SciPy
The real case has been made available here.
To run it you will need:
- numpy
- scipy
- matplotlib
- sympy
Just run: "python curved_beam_mrs.py"
.
You will see that the procedure is already slow, mainly because of the integration, indicated by the TODO
in file curved_beam.py
.
It will go much slower if you remove the comment indicated after the TODO
in file curved_beam_mrs.py
.
The matrix of functions which is integrated is showed in the print.txt
file.
Thank you!
回答1:
The first argument to either quad
or quadrature
must be a callable. The vec_func
argument of the quadrature
refers to whether the argument of this callable is a (possibly multidimensional) vector. Technically, you can vectorize
the quad
itself:
>>> from math import sin, cos, pi
>>> from scipy.integrate import quad
>>> from numpy import vectorize
>>> a = [sin, cos]
>>> vectorize(quad)(a, 0, pi)
(array([ 2.00000000e+00, 4.92255263e-17]), array([ 2.22044605e-14, 2.21022394e-14]))
But that's just equivalent to explicit looping over the elements of a
. Specifically, it'll not give you any performance gains, if that's what you're after. So, all in all, the question is why and what exactly you are trying to achieve here.
回答2:
Vectorizing trapezoidal and simpson's integral rules. Trapezoidal is just copy and pasted from another project that used logspace instead of linspace so that it can utilize non-uniform grids.
def trap(func,a,b,num):
xlinear=np.linspace(a,b,num)
slicevol=np.diff(xlinear)
output=integrand(xlinear)
output=output[:,:-1]+output[:,1:]
return np.dot(output,slicevol)/2
def simpson(func,a,b,num):
a=float(a)
b=float(b)
h=(b-a)/num
output=4*np.sum(integrand(a+h*np.arange(1,num,2)),axis=1)
output+=2*np.sum(integrand(a+h*np.arange(2,num-1,2)),axis=1)
output+=np.sum(integrand(b),axis=1)
output+=np.sum(integrand(a),axis=1)
return output*h/3
def integrand(rlin):
first=np.arange(1,11)[:,None]
second=np.arange(2,12)[:,None]
return np.vstack((rlin*first,np.power(rlin,second)))
Examine trapazoidal and simpsons rule cumulative relative errors:
b=float(100)
first=np.arange(1,11)*(b**2)/2
second=np.power(b,np.arange(3,13))/np.arange(3,13)
exact=np.vstack((first,second))
for x in range(3):
num=x*100+100
tr=trap(integrand,0,b,num).reshape(2,-1)
si=simpson(integrand,0,b,num).reshape(2,-1)
rel_trap=np.sum(abs((tr-exact)/exact))*100
rel_simp=np.sum(abs((si-exact)/exact))*100
print 'Number of points:',num,'Trap Rel',round(rel_trap,6),'Simp Rel',round(rel_simp,6)
Number of points: 100 Trap Rel 0.4846 Simp Rel 0.000171
Number of points: 200 Trap Rel 0.119944 Simp Rel 1.1e-05
Number of points: 300 Trap Rel 0.053131 Simp Rel 2e-06
Timeit. Note that both trapezoidal rules use 200 points while simpsons is timed at only 100 based on the above convergence. Sorry I dont have sympy:
s="""
import numpy as np
from scipy.integrate import trapz
def integrand(rlin):
first=np.arange(1,11)[:,None]
second=np.arange(2,12)[:,None]
return np.vstack((rlin*first,np.power(rlin,second)))
def trap(func,a,b,num):
xlinear=np.linspace(a,b,num)
slicevol=np.diff(xlinear)
output=integrand(xlinear)
output=output[:,:-1]+output[:,1:]
return np.dot(output,slicevol)/2
def simpson(func,a,b,num):
a=float(a)
b=float(b)
h=(b-a)/num
output=4*np.sum(integrand(a+h*np.arange(1,num,2)),axis=1)
output+=2*np.sum(integrand(a+h*np.arange(2,num-1,2)),axis=1)
output+=np.sum(integrand(b),axis=1)
output+=np.sum(integrand(a),axis=1)
return output*h/3
def simpson2(func,a,b,num):
a=float(a)
b=float(b)
h=(b-a)/num
p1=a+h*np.arange(1,num,2)
p2=a+h*np.arange(2,num-1,2)
points=np.hstack((p1,p2,a,b))
mult=np.hstack((np.repeat(4,p1.shape[0]),np.repeat(2,p2.shape[0]),1,1))
return np.dot(integrand(points),mult)*h/3
def x2(x):
return x**2
def x3(x):
return x**3
def x4(x):
return x**4
def x5(x):
return x**5
def x5(x):
return x**5
def x6(x):
return x**6
def x7(x):
return x**7
def x8(x):
return x**8
def x9(x):
return x**9
def x10(x):
return x**10
def x11(x):
return x**11
def xt5(x):
return 5*x
"""
zhenya="""
a=[xt5,xt5,xt5,xt5,xt5,xt5,xt5,xt5,xt5,xt5,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11]
vectorize(quad)(a, 0, 100)
"""
usethedeathstar="""
g=lambda x: np.array([[x,2*x,3*x,4*x,5*x,6*x,7*x,8*x,9*x,10*x],[x**2,x**3,x**4,x**5,x**6,x**7,x**8,x**9,x**10,x**11]])
xv=np.linspace(0,100,200)
trapz(g(xv))
"""
vectrap="""
trap(integrand,0,100,200)
"""
vecsimp="""
simpson(integrand,0,100,100)
"""
vecsimp2="""
simpson2(integrand,0,100,100)
"""
print 'zhenya took',timeit.timeit(zhenya,setup=s,number=100),'seconds.'
print 'usethedeathstar took',timeit.timeit(usethedeathstar,setup=s,number=100),'seconds.'
print 'vectrap took',timeit.timeit(vectrap,setup=s,number=100),'seconds.'
print 'vecsimp took',timeit.timeit(vecsimp,setup=s,number=100),'seconds.'
print 'vecsimp2 took',timeit.timeit(vecsimp2,setup=s,number=100),'seconds.'
Results:
zhenya took 0.0500509738922 seconds.
usethedeathstar took 0.109386920929 seconds.
vectrap took 0.041011095047 seconds.
vecsimp took 0.0376999378204 seconds.
vecsimp2 took 0.0311458110809 seconds.
Something to point out in the timings is zhenya's answer should be much more accurate. I believe everything is correct, please let me know if changes are required.
If you provide the functions and range that you will be using I can probably whip up something better for your system. Also would you be interested in utilizing additional cores/nodes?
回答3:
In the real case the integral does not have an exact solution, do you mean singularities? Could you be more precise on it, as well as on the size of the matrix that you wish to integrate. I have to admit that sympy is dreadfully slow when it comes to some things (not sure if integration is part of it, but i prefer to stay away from sympy and stick to numpy solution). Do you want to get a more elegant solution, by doing it with a matrix or a faster one?
-note: apparently i cant add comment to your post to ask for this stuff, so i had to post this as answer, maybe this is because i dont have enough reputation or so?-
edit: something like this?
import numpy
from scipy.integrate import trapz
g=lambda x: numpy.array([[x,2*x,3*x],[x**2,x**3,x**4]])
xv=numpy.linspace(0,100,200)
print trapz(g(xv))
having seen that you want to integrate stuff like sum(a*sin(bx+c)^n*cos(dx+e)^m), for different coefficients for the a,b,c,d,e,m,n, i suggest doing all of those analytically. (should have some formula for that since you can just rewrite sin to complex exponentials
Another thing i noted when checking those functions a bit better, is that sin(a*x+pi/2) and sin(a*x+pi) and stuff like that can be rewritten to cos or sin in a way that removes the pi/2 or pi. Also what i see is that just by looking at the first element in your matrix of functions:
a*sin(bx+c)^2+d*cos(bx+c)^2 = a*(sin^2+cos^2)+(d-a)*cos(bx+c)^2 = a+(d-a)*cos(bx+c)^2
which also simplifies the calculations. If you had the formulas in a way which didnt involve a massive txtfile or so, id check what the most general formula is that you need to integrate, but i guess its something like a*sin^n(bx+c)*cos^m(dx+e), with m and n being 0 1 or 2, and those things can be simplified into something which can be analytically integrated. So if you find out the most general analytical function you got, you can easily make something like
f=lambda x: [[s1(x),s2(x)],[s3(x),s4(x)]]
res=f(x2)-f(x1)
where s1(x) etc are just the analytically integrated versions of your functions?
(not really planning on going through your entire code to see what all the rest does, but is it just integrating those functions in the txt file from a to b or something like that? or is there somewhere something like that you take the square of each function or whatever thing that might mess up the possibility of doing it analytically?)
this should simplify your integrals i guess?
first integral and: second one
hmm, that second link doesnt work, but you get the idea from the first one i guess
edit, since you do not want analytical solutions: the improvement remains in getting rid of sympy:
from sympy import sin as SIN
from numpy import sin as SIN2
from scipy.integrate import trapz
import time
import numpy as np
def integrand(rlin):
first=np.arange(1,11)[:,None]
second=np.arange(2,12)[:,None]
return np.vstack((rlin*first,np.power(rlin,second)))
def simpson2(func,a,b,num):
a=float(a)
b=float(b)
h=(b-a)/num
p1=a+h*np.arange(1,num,2)
p2=a+h*np.arange(2,num-1,2)
points=np.hstack((p1,p2,a,b))
mult=np.hstack((np.repeat(4,p1.shape[0]),np.repeat(2,p2.shape[0]),1,1))
return np.dot(integrand(points),mult)*h/3
A=np.linspace(0,100.,200)
B=lambda x: SIN(x)
C=lambda x: SIN2(x)
t0=time.time()
D=simpson2(B,0,100.,200)
print time.time()-t0
t1=time.time()
E=trapz(C(A))
print time.time()-t1
t2=time.time()
F=simpson2(C,0,100.,200)
print time.time()-t2
results in:
0.000764131546021 sec for the faster method, but when using sympy
7.58171081543e-05 sec for my slower method, but which uses numpy
0.000519037246704 sec for the faster method, when using numpy,
conclusion: use numpy, ditch sympy, (my slower numpy method is actually faster in this case, because in this example i only tried it on one sin-function, instead of on a ndarray of them, but the point of ditching sympy still remains when comparing the time of the numpy version of the faster method to the one of the sympy version of the faster method)
回答4:
I might have found some interesting way of doing this, at the expense of defining different symbols for the matrix g_symp
:
import numpy as np
from scipy.integrate import quad
import sympy as sy
@np.vectorize
def vec_lambdify(var, expr, *args, **kw):
return sy.lambdify(var, expr, *args, **kw)
@np.vectorize
def vec_quad(f, a, b, *args, **kw):
return quad(f, a, b, *args, **kw)[0]
Y = sy.symbols("y1:11")
x = sy.symbols("x")
mul_x = [y.subs(y,x*(i+1)) for (i,y) in enumerate(Y)]
pow_x = [y.subs(y,x**(i+1)) for (i,y) in enumerate(Y)]
g_sympy = np.array(mul_x + pow_x).reshape((2,10))
X = x*np.ones_like(g_sympy)
G = vec_lambdify(X, g_sympy)
I = vec_quad(G, 0, 100)
print(I)
with results:
[[ 5.00000000e+03 1.00000000e+04 1.50000000e+04 2.00000000e+04
2.50000000e+04 3.00000000e+04 3.50000000e+04 4.00000000e+04
4.50000000e+04 5.00000000e+04]
[ 5.00000000e+03 3.33333333e+05 2.50000000e+07 2.00000000e+09
1.66666667e+11 1.42857143e+13 1.25000000e+15 1.11111111e+17
1.00000000e+19 9.09090909e+20]]
and using the ipython magic%timeit vec_quad(G,0,100)
I got
1000 loops, best of 3: 527 µs per loop
I think this approach is somewhat more clean, despite the juggling with symbols.
回答5:
quadpy (a project of mine) does vectorized quadrature. This
import numpy
import quadpy
def f(x):
return [
[k * x for k in range(2, 11)],
[x ** k for k in range(2, 11)],
]
sol, err = quadpy.quad(f, 0.0, 100.0, epsabs=numpy.inf, epsrel=1.0e-10)
print(sol)
print(err)
gives
[[1.00000000e+04 1.50000000e+04 2.00000000e+04 2.50000000e+04
3.00000000e+04 3.50000000e+04 4.00000000e+04 4.50000000e+04
5.00000000e+04]
[3.33333333e+05 2.50000000e+07 2.00000000e+09 1.66666667e+11
1.42857143e+13 1.25000000e+15 1.11111111e+17 1.00000000e+19
9.09090909e+20]]
[[5.11783704e-16 4.17869644e-16 1.02356741e-15 9.15506521e-16
8.35739289e-16 1.19125717e-15 2.04713482e-15 1.93005721e-15
1.83101304e-15]
[6.69117036e-14 9.26814751e-12 1.05290634e-09 1.12081237e-07
1.09966583e-05 1.09356156e-03 1.00722052e-01 9.31052614e+00
9.09545305e+02]]
Timings:
%timeit quadpy.quad(f, 0.0, 100.0, epsabs=numpy.inf, epsrel=1.0e-10)
904 µs ± 3.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
来源:https://stackoverflow.com/questions/17070333/using-scipy-to-integrate-a-function-that-returns-a-matrix-or-array