问题
In julia we can check if an array contains a value, like so:
> 6 in [4,6,5]
true
However this returns false, when attempting to check for a sub-array in a specific order:
> [4,6] in [4,6,5]
false
What is the correct syntax to verify if a specific sub-array exists in an array?
回答1:
For the third condition i.e. vector [4,6]
appears as a sub-vector of 4,6,5
the following function is suggested:
issubvec(v,big) =
any([v == slice(big,i:(i+length(v)-1)) for i=1:(length(big)-length(v)+1)])
For the second condition, that is, give a boolean for each element in els
vectors which appears in set
vector, the following is suggested:
function vecin(els,set)
res = zeros(Bool,size(els))
res[findin(els,set)]=true
res
end
With the vector in the OP, these result in:
julia> vecin([4,6],[4,6,5])
2-element Array{Bool,1}:
true
true
julia> issubvec([4,6],[4,6,5])
true
回答2:
It takes a little bit of code to make a function that performs well, but this is much faster than the issubvec
version above:
function subset2(x,y)
lenx = length(x)
first = x[1]
if lenx == 1
return findnext(y, first, 1) != 0
end
leny = length(y)
lim = length(y) - length(x) + 1
cur = 1
while (cur = findnext(y, first, cur)) != 0
cur > lim && break
beg = cur
@inbounds for i = 2:lenx
y[beg += 1] != x[i] && (beg = 0 ; break)
end
beg != 0 && return true
cur += 1
end
false
end
Note: it would also be much more useful if the function actually returned the position of the beginning of the subarray if found, or 0 if not, similarly to the findfirst/findnext functions.
Timing information (the second one is using my subset2 function):
0.005273 seconds (65.70 k allocations: 4.073 MB)
0.000086 seconds (4 allocations: 160 bytes)
回答3:
I think it is worth mentioning that in Julia 1.0 you have the function issubset
> issubset([4,6], [4,6,5])
true
You can also quite conveniently call it using the \subseteq
latex symbol
> [4,6] ⊆ [4,6,5]
true
This looks pretty optimized to me:
> using Random
> x, y = randperm(10^3)[1:10^2], randperm(10^3);
> @btime issubset(x, y);
16.153 μs (12 allocations: 45.96 KiB)
回答4:
I used this recently to find subsequences in arrays of integers. It's not as good or as fast as @scott's subset2(x,y)
... but it returns the indices.
function findsequence(arr::Array{Int64}, seq::Array{Int64})
indices = Int64[]
i = 1
n = length(seq)
if n == 1
while true
occurrence = findnext(arr, seq[1], i)
if occurrence == 0
break
else
push!(indices, occurrence)
i = occurrence +1
end
end
else
while true
occurrence = Base._searchindex(arr, seq, i)
if occurrence == 0
break
else
push!(indices, occurrence)
i = occurrence +1
end
end
end
return indices
end
julia> @time findsequence(rand(1:9, 1000), [2,3])
0.000036 seconds (29 allocations: 8.766 KB)
16-element Array{Int64,1}:
80
118
138
158
234
243
409
470
539
589
619
629
645
666
762
856
回答5:
note that you can now vectorize in
with a dot:
julia> in([4,6,5]).([4, 6])
2-element BitArray{1}:
true
true
and chain with all
to get your answer:
julia> all(in([4,6,5]).([4, 6]))
true
来源:https://stackoverflow.com/questions/36346005/julia-does-an-array-contain-a-specific-sub-array