Question: I would like to index into an array without triggering memory allocation, especially when passing the indexed elements into a function. From readi
Just use xs = sub(x, 1:N)
. Note that this is different from x = sub(x, [1:N])
; on julia 0.3 the latter will fail, and on julia 0.4-pre the latter will be considerably slower than the former. On julia 0.4-pre, sub(x, 1:N)
is just as fast as view
:
julia> N = 10000000;
julia> x = randn(N);
julia> xs = sub(x, 1:N);
julia> using ArrayViews
julia> xv = view(x, 1:N);
julia> mean(x)
-0.0002491126429772525
julia> mean(xs)
-0.0002491126429772525
julia> mean(xv)
-0.0002491126429772525
julia> @time mean(x);
elapsed time: 0.015345806 seconds (27 kB allocated)
julia> @time mean(xs);
elapsed time: 0.013815785 seconds (96 bytes allocated)
julia> @time mean(xv);
elapsed time: 0.015871052 seconds (96 bytes allocated)
There are several reasons why sub(x, inds)
is slower than sub(x, 1:N)
:
xs[i]
corresponds to x[inds[i]]
; we have to look up two memory locations rather than oneinds
is not in order, you will get poor cache behavior in accessing the elements of x
In this case, the latter is probably the most important effect. This is not a Julia limitation; the same thing would happen were you to write the equivalent code in C, Fortran, or assembly.
Note that it's still faster to say sum(sub(x, inds))
than sum(x[inds])
, (until the latter becomes the former, which it should by the time julia 0.4 is officially out). But if you have to do many operations with xs = sub(x, inds)
, in some circumstances it will be worth your while to make a copy, even though it allocates memory, just so you can take advantage of the optimizations possible when values are stored in contiguous memory.
EDIT: Read tholy's answer too to get a full picture!
When using an array of indices, the situation is not great right now on Julia 0.4-pre (start of Feb 2015):
julia> N = 10000000;
julia> x = randn(N);
julia> inds = [1:N];
julia> @time mean(x)
elapsed time: 0.010702729 seconds (96 bytes allocated)
elapsed time: 0.012167155 seconds (96 bytes allocated)
julia> @time mean(x[inds])
elapsed time: 0.088312275 seconds (76 MB allocated, 17.87% gc time in 1 pauses with 0 full sweep)
elapsed time: 0.073672734 seconds (76 MB allocated, 3.27% gc time in 1 pauses with 0 full sweep)
elapsed time: 0.071646757 seconds (76 MB allocated, 1.08% gc time in 1 pauses with 0 full sweep)
julia> xs = sub(x,inds); # Only works on 0.4
julia> @time mean(xs)
elapsed time: 0.057446177 seconds (96 bytes allocated)
elapsed time: 0.096983673 seconds (96 bytes allocated)
elapsed time: 0.096711312 seconds (96 bytes allocated)
julia> using ArrayViews
julia> xv = view(x, 1:N) # Note use of a range, not [1:N]!
julia> @time mean(xv)
elapsed time: 0.012919509 seconds (96 bytes allocated)
elapsed time: 0.013010655 seconds (96 bytes allocated)
elapsed time: 0.01288134 seconds (96 bytes allocated)
julia> xs = sub(x,1:N) # Works on 0.3 and 0.4
julia> @time mean(xs)
elapsed time: 0.014191482 seconds (96 bytes allocated)
elapsed time: 0.014023089 seconds (96 bytes allocated)
elapsed time: 0.01257188 seconds (96 bytes allocated)
sub
for this on 0.3, but you can on 0.4.mean
.I noticed that with a smaller number of indices used (instead of the whole range), the gap is much smaller, and the memory allocation is low, so sub
might be worth:
N = 100000000
x = randn(N)
inds = [1:div(N,10)]
@time mean(x)
@time mean(x)
@time mean(x)
@time mean(x[inds])
@time mean(x[inds])
@time mean(x[inds])
xi = sub(x,inds)
@time mean(xi)
@time mean(xi)
@time mean(xi)
gives
elapsed time: 0.092831612 seconds (985 kB allocated)
elapsed time: 0.067694917 seconds (96 bytes allocated)
elapsed time: 0.066209038 seconds (96 bytes allocated)
elapsed time: 0.066816927 seconds (76 MB allocated, 20.62% gc time in 1 pauses with 1 full sweep)
elapsed time: 0.057211528 seconds (76 MB allocated, 19.57% gc time in 1 pauses with 0 full sweep)
elapsed time: 0.046782848 seconds (76 MB allocated, 1.81% gc time in 1 pauses with 0 full sweep)
elapsed time: 0.186084807 seconds (4 MB allocated)
elapsed time: 0.057476269 seconds (96 bytes allocated)
elapsed time: 0.05733602 seconds (96 bytes allocated)