Question
I'm looking for some clarification of the bounds checking rules in Julia. Does this mean that if I put @inbounds at the beginning of a for loop,
@inbounds for ... end
then @inbounds propagates only "one layer", so that if there is a for loop nested inside it, @inbounds will not turn off the bounds checking in there? And if I use @propagate_inbounds, will it go inside the nested for loop?
And is it correct to say that @inbounds always wins over @boundscheck? The only exception is when the function is not inlined, but is that just a case of the previous "one layer" rule, so that @propagate_inbounds would turn off the bounds checking even in the non-inlined function call?
Answer 1:
When the manual speaks about @inbounds propagating through "one layer," it's specifically referring to function call boundaries. The fact that it can only affect functions that get inlined is a secondary requirement that makes this especially confusing and tough to test, so let's not worry about inlining until later.
The @inbounds macro annotates function calls such that they're able to elide bounds checks. In fact, the macro will do this for all function calls in the expression that is passed to it, including any number of nested for loops, begin blocks, if statements, etc. And, of course, indexing and indexed assignment are simply "sugars" that lower to function calls, so it affects those the same way. All this makes sense; as the author of the code that's wrapped by @inbounds, you're able to see the macro and ensure that it's safe to do so.
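To make the nested-loop case concrete, here is a minimal sketch. The function name nested_sum is hypothetical and not part of the original answer. It loops over axes(A, 1) and axes(A, 2), so every access is genuinely in bounds and the annotation is safe.

```julia
# Hypothetical helper (not from the original answer): sums a matrix with two
# nested loops under a single @inbounds annotation.
function nested_sum(A::AbstractMatrix)
    total = zero(eltype(A))
    # The macro covers the whole expression it wraps, so the A[i, j] call in
    # the inner loop is annotated as well, not just the outer loop header.
    @inbounds for j in axes(A, 2)
        for i in axes(A, 1)
            total += A[i, j]
        end
    end
    return total
end
```

Calling nested_sum([1 2; 3 4]) adds up all four entries. No @propagate_inbounds is involved here, because the inner loop is part of the same expression rather than a separate method.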
But the @inbounds macro tells Julia to do something funny. It changes the behavior of code that's written in a totally different place! For example, when you annotate the call:
julia> f() = @inbounds return getindex(4:5, 10);
julia> f()
13
The macro effectively reaches into the standard library and disables that @boundscheck block, allowing it to compute values outside of the range's valid region.
This is a spooky action at a distance… and if it's not carefully constrained, it could end up removing bounds-checks from library code where it's not intended or fully safe to do so. That's why there's the "one-layer" restriction; we only want to remove bounds checks when authors are explicitly aware that it might occur and opt-in to the removal.
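For reference, this is roughly what such an opt-in looks like from the library side. It is a sketch under assumptions: checked_double and try_double are hypothetical names, not functions from Base or from the original answer.

```julia
# Hypothetical function, written the way a library method might guard an access.
@inline function checked_double(v::Vector{Int}, i::Int)
    # This block is what a caller's @inbounds may skip once the method has
    # been inlined into the annotated call site.
    @boundscheck checkbounds(v, i)
    # Safe when the check above ran; otherwise the caller has vouched for i.
    @inbounds x = v[i]
    return 2x
end

# Hypothetical helper: traps the exception so the outcome can be inspected.
function try_double(v, i)
    try
        return checked_double(v, i)
    catch err
        return err
    end
end
```

Because try_double calls checked_double without @inbounds, the check runs and an out-of-range index produces a BoundsError rather than an unchecked read.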
Now, as a library author, there may be cases where you want to opt in to allow @inbounds to propagate through to all functions you call within the method. That's where Base.@propagate_inbounds is used. Unlike @inbounds, which annotates function calls, @propagate_inbounds annotates method definitions to allow the inbounds state that the method gets called with to propagate through to all function calls you make in the method's implementation. This is a bit tough to describe in the abstract, so let's look at a concrete example.
An Example
Let's create a toy custom vector that simply creates a shuffled view into the vector it wraps:
julia> module M
           using Random
           struct ShuffledVector{A,T} <: AbstractVector{T}
               data::A
               shuffle::Vector{Int}
           end
           ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
           Base.size(A::ShuffledVector) = size(A.data)
           Base.@inline function Base.getindex(A::ShuffledVector, i::Int)
               A.data[A.shuffle[i]]
           end
       end
This is pretty straightforward: we wrap any vector type, create a random permutation, and then upon indexing we just index into the original array using the permutation. And we know that all accesses into the subparts of the array should be okay based upon the outer constructor, so even though we aren't checking bounds ourselves, we can rely upon the inner indexing expressions throwing errors if we index out of bounds.
julia> s = M.ShuffledVector(1:4)
4-element Main.M.ShuffledVector{UnitRange{Int64},Int64}:
1
3
4
2
julia> s[5]
ERROR: BoundsError: attempt to access 4-element Array{Int64,1} at index [5]
Stacktrace:
[1] getindex at ./array.jl:728 [inlined]
[2] getindex(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[10]:10
[3] top-level scope at REPL[15]:1
Note how the bounds error is coming not from the indexing into the ShuffledVector, but rather from indexing into the permutation vector A.shuffle[5]. Now perhaps a user of our ShuffledVector wants its accesses to be faster, so they try turning off bounds checking with @inbounds:
julia> f(A, i) = @inbounds return A[i]
f (generic function with 1 method)
julia> f(s, 5)
ERROR: BoundsError: attempt to access 4-element Array{Int64,1} at index [5]
Stacktrace:
[1] getindex at ./array.jl:728 [inlined]
[2] getindex at ./REPL[10]:10 [inlined]
[3] f(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[16]:1
[4] top-level scope at REPL[17]:1
But they're still getting bounds errors! This is because the @inbounds annotation only tried to remove the @boundscheck blocks from the method we wrote above. It doesn't propagate through to the standard library to remove the bounds checking from either the A.shuffle array or the A.data range. That's quite a bit of overhead, even though they tried to remove bounds checks! So, we can instead write the above getindex method with a Base.@propagate_inbounds annotation, which allows this method to "inherit" its caller's in-bounds state:
julia> module M
           using Random
           struct ShuffledVector{A,T} <: AbstractVector{T}
               data::A
               shuffle::Vector{Int}
           end
           ShuffledVector(A::AbstractVector{T}) where {T} = ShuffledVector{typeof(A), T}(A, randperm(length(A)))
           Base.size(A::ShuffledVector) = size(A.data)
           Base.@propagate_inbounds function Base.getindex(A::ShuffledVector, i::Int)
               A.data[A.shuffle[i]]
           end
       end
WARNING: replacing module M.
Main.M
julia> s = M.ShuffledVector(1:4);
julia> s[5]
ERROR: BoundsError: attempt to access 4-element Array{Int64,1} at index [5]
Stacktrace:
[1] getindex at ./array.jl:728 [inlined]
[2] getindex(::Main.M.ShuffledVector{UnitRange{Int64},Int64}, ::Int64) at ./REPL[20]:10
[3] top-level scope at REPL[22]:1
julia> f(s, 5) # That @inbounds now affects the inner indexing calls, too!
0
You can verify that there are no branches with @code_llvm f(s, 5).
But, really, in this case I think it'd be much better to write this getindex method implementation with a @boundscheck block of its own:
@inline function Base.getindex(A::ShuffledVector, i::Int)
    @boundscheck checkbounds(A, i)
    @inbounds r = A.data[A.shuffle[i]]
    return r
end
It's a little more verbose, but now it'll actually throw the bounds error on the ShuffledVector type instead of leaking the implementation details in the error message.
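As a self-contained sketch of that last variant, the definition can be pasted into a fresh session as written below. The type is renamed to CheckedShuffledVector (a hypothetical name) only to avoid clashing with the module M defined earlier; the fields and constructor follow the example above.

```julia
using Random

# Hypothetical standalone copy of the wrapper type from the example above.
struct CheckedShuffledVector{A,T} <: AbstractVector{T}
    data::A
    shuffle::Vector{Int}
end

CheckedShuffledVector(A::AbstractVector{T}) where {T} =
    CheckedShuffledVector{typeof(A),T}(A, randperm(length(A)))

Base.size(A::CheckedShuffledVector) = size(A.data)

@inline function Base.getindex(A::CheckedShuffledVector, i::Int)
    # Check against the wrapper itself, so a failure reports the wrapper type.
    @boundscheck checkbounds(A, i)
    # Both inner accesses are valid whenever i is a valid index of A.
    @inbounds r = A.data[A.shuffle[i]]
    return r
end
```

With this definition, indexing past the end should raise a BoundsError whose reported array is the CheckedShuffledVector rather than the inner permutation vector.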
The effect of inlining
You'll notice that I don't test @inbounds in the global scope above, and instead use these little helper functions. That's because bounds check removal only works when the method gets inlined and compiled. So simply trying to remove bounds checks at the global scope isn't going to work, since it can't inline the function call into the interactive REPL:
julia> @inbounds getindex(4:5, 10)
ERROR: BoundsError: attempt to access 2-element UnitRange{Int64} at index [10]
Stacktrace:
[1] throw_boundserror(::UnitRange{Int64}, ::Int64) at ./abstractarray.jl:538
[2] getindex(::UnitRange{Int64}, ::Int64) at ./range.jl:617
[3] top-level scope at REPL[24]:1
There's no compilation or inlining occurring here at global scope, so Julia is unable to remove these bounds. Similarly, Julia isn't able to inline methods when there's a type instability (like when accessing a non-constant global), so it can't remove these bounds checks, either:
julia> r = 1:2;
julia> g() = @inbounds return r[3]
g (generic function with 1 method)
julia> g()
ERROR: BoundsError: attempt to access 2-element UnitRange{Int64} at index [3]
Stacktrace:
[1] throw_boundserror(::UnitRange{Int64}, ::Int64) at ./abstractarray.jl:538
[2] getindex(::UnitRange{Int64}, ::Int64) at ./range.jl:617
[3] g() at ./REPL[26]:1
[4] top-level scope at REPL[27]:1
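One common way to sidestep the non-constant global, sketched here as an assumption rather than as part of the original answer, is a function barrier: pass the global in as an argument so that the indexing code is compiled for a concrete type. The names data, second_element and second_of_global are hypothetical, and the example deliberately uses only in-bounds indices. Whether the checks are actually elided still depends on the compiler's inlining decisions, which @code_llvm can show.

```julia
# Hypothetical names; `data` plays the role of the non-constant global `r` above.
data = 1:2

# The range arrives as an argument, so its concrete type is known when this
# method is compiled and the indexing call can be inlined.
second_element(v) = @inbounds return v[2]

# Function barrier: one dynamic dispatch here, then type-stable code inside.
second_of_global() = second_element(data)
```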
In general, bounds-check removal should be the last optimization you make after ensuring everything else works, is well-tested, and follows the usual performance tips.
Source: https://stackoverflow.com/questions/38901275/inbounds-propagation-rules-in-julia