Question:
I'm trying to learn a bit of Julia. After reading the manual for several hours, I wrote the following piece of code:
    ie = 200;
    ez = zeros(ie + 1);
    hy = zeros(ie);

    fdtd1d(steps) =
        for n in 1:steps
            for i in 2:ie
                ez[i] += (hy[i] - hy[i-1])
            end
            ez[1] = sin(n/10)
            for i in 1:ie
                hy[i] += (ez[i+1] - ez[i])
            end
        end

    @time fdtd1d(10000);
elapsed time: 2.283153795 seconds (239659044 bytes allocated)
I believe it is poorly optimized, because it is much slower than the corresponding Mathematica version:

    ie = 200;
    ez = ConstantArray[0., {ie + 1}];
    hy = ConstantArray[0., {ie}];

    fdtd1d = Compile[{{steps}},
       Module[{ie = ie, ez = ez, hy = hy},
        Do[
         ez[[2 ;; ie]] += (hy[[2 ;; ie]] - hy[[1 ;; ie - 1]]);
         ez[[1]] = Sin[n/10];
         hy[[1 ;; ie]] += (ez[[2 ;; ie + 1]] - ez[[1 ;; ie]]),
         {n, steps}];
        Sow@ez; Sow@hy]];

    result = fdtd1d[10000]; // AbsoluteTiming
{0.1280000, Null}
So, how can I make the Julia version of fdtd1d faster?
Answer 1:
Two things:
First, the first time you run the function, the measured time includes the time needed to compile the code. If you want an apples-to-apples comparison with a compiled function in Mathematica, you should run the function twice and time the second run. With your code I get:
elapsed time: 1.156531976 seconds (447764964 bytes allocated)
for the first run which includes the compile time and
elapsed time: 1.135681299 seconds (447520048 bytes allocated)
for the second run when you don't need to compile.
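The warm-up effect can be sketched with a small stand-alone example. The function `sum_of_sines` below is a hypothetical toy, not part of the original code, and the snippet assumes a reasonably recent Julia (1.x). It only illustrates the "run twice, time the second call" advice; the actual numbers will vary by machine.

```julia
# Hypothetical toy function, used only to illustrate the warm-up effect.
function sum_of_sines(n)
    total = 0.0
    for k in 1:n
        total += sin(k / 10)
    end
    return total
end

# The first call compiles the function for an Int argument, then runs it.
first_run = @elapsed sum_of_sines(10_000)

# The second call reuses the compiled code, so only the run time is measured.
second_run = @elapsed sum_of_sines(10_000)

println("first run:  ", first_run, " s (includes compilation)")
println("second run: ", second_run, " s")
```

`@elapsed` returns the time in seconds as a number, which makes the two measurements easy to compare; `@time` could be used in the same way.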
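The second thing, and arguably the bigger one, is that you should avoid global variables in performance-critical code. Julia cannot assume a fixed type for a non-constant global, so every access to ez, hy and ie inside the loops is slow and allocates. This is the first tip in the performance tips section of the manual.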
Here is the same code using local variables:
    function fdtd1d_local(steps, ie = 200)
        ez = zeros(ie + 1);
        hy = zeros(ie);
        for n in 1:steps
            for i in 2:ie
                ez[i] += (hy[i] - hy[i-1])
            end
            ez[1] = sin(n/10)
            for i in 1:ie
                hy[i] += (ez[i+1] - ez[i])
            end
        end
        return (ez, hy)
    end

    fdtd1d_local(10000)
    @time fdtd1d_local(10000);

For comparison, your Mathematica code on my machine gives
{0.094005, Null}
while the result from @time for fdtd1d_local is:
elapsed time: 0.015188926 seconds (4176 bytes allocated)
Or about 6 times faster. Global variables make a big difference.
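If the arrays need to exist outside the function, for example to inspect them afterwards, one alternative is to pass them in as arguments. The sketch below assumes a recent Julia (1.x); the name `fdtd1d!` is hypothetical and follows the Julia convention of a trailing `!` for functions that modify their arguments.

```julia
# Hypothetical variant: the field arrays are arguments instead of globals,
# so the compiler knows their concrete types inside the loops.
function fdtd1d!(ez, hy, steps)
    ie = length(hy)
    for n in 1:steps
        for i in 2:ie
            ez[i] += hy[i] - hy[i-1]
        end
        ez[1] = sin(n / 10)
        for i in 1:ie
            hy[i] += ez[i+1] - ez[i]
        end
    end
    return ez, hy
end

ie = 200
ez = zeros(ie + 1)
hy = zeros(ie)

fdtd1d!(ez, hy, 10)                 # warm-up call so compilation is not timed
fill!(ez, 0.0); fill!(hy, 0.0)      # reset the fields after the warm-up
@time fdtd1d!(ez, hy, 10000);
```

Here `ez` and `hy` are still globals at the top level, but they are read only once, when the function is called, so the loops themselves work on typed local arguments.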
Answer 2:
I prefer to limit the number of explicit loops and to use them only when required; array expressions can often take their place. Not every loop can be avoided, but replacing some of them made this code faster. In the program above I replaced the two inner loops with slice expressions, which cut the running time roughly in half.
ORIGINAL CODE:

    ie = 200;
    ez = zeros(ie + 1);
    hy = zeros(ie);

    fdtd1d(steps) =
        for n in 1:steps
            for i in 2:ie
                ez[i] += (hy[i] - hy[i-1])
            end
            ez[1] = sin(n/10)
            for i in 1:ie
                hy[i] += (ez[i+1] - ez[i])
            end
        end

    @time fdtd1d(10000);
The output is
julia> elapsed time: 1.845615295 seconds (239687888 bytes allocated)
OPTIMIZED CODE:
    ie = 200;
    ez = zeros(ie + 1);
    hy = zeros(ie);

    fdtd1d(steps) =
        for n in 1:steps
            ez[2:ie] = ez[2:ie] + hy[2:ie] - hy[1:ie-1];
            ez[1] = sin(n/10);
            hy[1:ie] = hy[1:ie] + ez[2:end] - ez[1:end-1]
        end

    @time fdtd1d(10000);
OUTPUT
julia> elapsed time: 0.93926323 seconds (206977748 bytes allocated)
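Each slice such as `ez[2:ie]` on the right-hand side creates a temporary copy, which likely accounts for much of the allocation reported above. As a hedged sketch, assuming a recent Julia (1.x) with the `@views` macro and dot-broadcasting, the same slice style can be written to update the arrays in place. The name `fdtd1d_views` is hypothetical, and the arrays are kept local to the function.

```julia
# Hypothetical in-place variant of the slice-based update.
function fdtd1d_views(steps, ie = 200)
    ez = zeros(ie + 1)
    hy = zeros(ie)
    for n in 1:steps
        # @views makes the slices views instead of copies;
        # the dotted operators fuse into a single in-place update.
        @views ez[2:ie] .+= hy[2:ie] .- hy[1:ie-1]
        ez[1] = sin(n / 10)
        @views hy .+= ez[2:end] .- ez[1:end-1]
    end
    return ez, hy
end

fdtd1d_views(10)                      # warm-up call so compilation is not timed
ez, hy = @time fdtd1d_views(10000);
```

This keeps the loop-free style of the optimized code while avoiding most of the temporaries; how it compares with the explicit loops depends on the Julia version and the machine.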