Julia: does an Array contain a specific sub-array

Submitted by 夏天 on 2019-12-10 14:38:42

Question


In Julia we can check whether an array contains a value, like so:

> 6 in [4,6,5]
true

However, checking for a sub-array in a specific order this way returns false:

> [4,6] in [4,6,5]
false

What is the correct syntax to verify if a specific sub-array exists in an array?


Answer 1:


To check whether the vector [4,6] appears as a contiguous sub-vector of [4,6,5], the following function is suggested:

# slice is no longer in Base (Julia 1.x); view gives a non-copying window instead
issubvec(v,big) = 
  any(v == view(big,i:(i+length(v)-1)) for i=1:(length(big)-length(v)+1))

To get a boolean for each element of the els vector that says whether it appears in the set vector, the following is suggested:

function vecin(els,set)
  # findin was removed in Julia 1.0; test each element for membership directly
  res = [x in set for x in els]
  res
end

With the vector in the OP, these result in:

julia> vecin([4,6],[4,6,5])
2-element Array{Bool,1}:
 true
 true

julia> issubvec([4,6],[4,6,5])
true



Answer 2:


It takes a little bit of code to make a function that performs well, but this is much faster than the issubvec version above:

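function subset2(x,y)
    lenx = length(x)
    first = x[1]
    # Julia 1.x: findnext takes a predicate first and returns nothing when there is no match
    if lenx == 1
        return findnext(isequal(first), y, 1) !== nothing
    end
    lim = length(y) - lenx + 1   # last index at which a match could start
    cur = 1
    while (cur = findnext(isequal(first), y, cur)) !== nothing
        cur > lim && break
        beg = cur
        # compare the remaining elements; beg = 0 marks a mismatch
        @inbounds for i = 2:lenx
            y[beg += 1] != x[i] && (beg = 0 ; break)
        end
        beg != 0 && return true
        cur += 1
    end
    false
end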

Note: it would also be much more useful if the function actually returned the position of the beginning of the subarray if found, or 0 if not, similarly to the findfirst/findnext functions.

Timing information (the second one is using my subset2 function):

  0.005273 seconds (65.70 k allocations: 4.073 MB)
  0.000086 seconds (4 allocations: 160 bytes)
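The note above about returning the position can be sketched as follows. This is only an illustration, not the answer author's code: the name findsubvec is hypothetical, it assumes ordinary 1-based vectors, and it returns nothing rather than 0 when there is no match, which is how findfirst/findnext behave in Julia 1.x.

```julia
# Hypothetical helper (not from the original answer): returns the index in `y`
# at which the contiguous run `x` first starts, or `nothing` if it never occurs.
# Assumes 1-based vectors.
function findsubvec(x::AbstractVector, y::AbstractVector)
    lenx = length(x)
    lim = length(y) - lenx + 1          # last index at which a match could start
    for start in 1:lim
        # `all` over a generator stops at the first mismatching element
        if all(y[start + k - 1] == x[k] for k in 1:lenx)
            return start
        end
    end
    return nothing
end

findsubvec([4, 6], [4, 6, 5])   # 1
findsubvec([6, 5], [4, 6, 5])   # 2
findsubvec([6, 4], [4, 6, 5])   # nothing
```

A boolean test then falls out as findsubvec(x, y) !== nothing. This sketch favors readability; it makes no attempt to match the speed of subset2.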



Answer 3:


I think it is worth mentioning that in Julia 1.0 you have the function issubset. Note that it tests set membership only, so it ignores order and contiguity:

> issubset([4,6], [4,6,5])
true

You can also quite conveniently call it as the ⊆ operator, typed as \subseteq followed by Tab in the REPL:

> [4,6] ⊆ [4,6,5]
true
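To make the difference from the question's "specific order" requirement concrete, here is a small sketch comparing issubset with a windowed check. The helper contains_run is a hypothetical name introduced only for this illustration.

```julia
a = [4, 6, 5]

# issubset (⊆) only asks whether every element occurs somewhere in `a`
issubset([4, 6], a)   # true
issubset([6, 4], a)   # true, although 6,4 never appears in that order
issubset([4, 5], a)   # true, although 4 and 5 are not adjacent

# A contiguous, ordered check has to compare windows of `a` instead.
# `contains_run` is a hypothetical helper for this sketch.
contains_run(v, a) = any(view(a, i:i+length(v)-1) == v for i in 1:length(a)-length(v)+1)

contains_run([4, 6], a)   # true
contains_run([6, 4], a)   # false
contains_run([4, 5], a)   # false
```

So issubset fits when only membership matters, while a sub-array test in the sense of the question needs something like the windowed comparison.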

This looks pretty optimized to me:

> using Random, BenchmarkTools  # @btime is provided by the BenchmarkTools package

> x, y = randperm(10^3)[1:10^2], randperm(10^3);

> @btime issubset(x, y);
16.153 μs (12 allocations: 45.96 KiB)



Answer 4:


I used this recently to find subsequences in arrays of integers. It's not as good or as fast as @scott's subset2(x,y)... but it returns the indices.

function findsequence(arr::Array{Int64}, seq::Array{Int64})
    indices = Int64[]
    i = 1
    n = length(seq)
    if n == 1
        while true
            occurrence = findnext(arr, seq[1], i)
            if occurrence == 0
                break
            else
                push!(indices, occurrence)
                i = occurrence +1
            end
        end
    else
        while true
            occurrence = Base._searchindex(arr, seq, i)
            if occurrence == 0
                break
            else
                push!(indices, occurrence)
                i = occurrence +1
            end
        end
    end
    return indices
end

julia> @time findsequence(rand(1:9, 1000), [2,3])
    0.000036 seconds (29 allocations: 8.766 KB)
    16-element Array{Int64,1}:
   80
  118
  138
  158
  234
  243
  409
  470
  539
  589
  619
  629
  645
  666
  762
  856
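The function above was written for an older Julia: findnext(arr, value, i) and a 0 return value are pre-1.0 behavior, and Base._searchindex is an internal function whose methods in Julia 1.x, as far as I can tell, cover strings and byte vectors rather than Int64 arrays. A rough Julia 1.x sketch of the same idea, using only documented functions, might look like this (the name findsequence1x is hypothetical, and no timing claim is made for it):

```julia
# Sketch for Julia 1.x: collect every start index at which `seq` occurs
# contiguously in `arr`. The name `findsequence1x` is hypothetical.
function findsequence1x(arr::AbstractVector, seq::AbstractVector)
    indices = Int[]
    n = length(seq)
    n == 0 && return indices                  # nothing to search for
    last_start = length(arr) - n + 1          # last index at which a match could start
    i = findnext(isequal(seq[1]), arr, 1)     # first candidate start, or nothing
    while i !== nothing && i <= last_start
        # a candidate matches if the window starting at i equals seq
        if view(arr, i:i+n-1) == seq
            push!(indices, i)
        end
        i = findnext(isequal(seq[1]), arr, i + 1)
    end
    return indices
end

findsequence1x([1, 2, 3, 2, 3, 3], [2, 3])   # [2, 4]
findsequence1x([2, 2, 2], [2, 2])            # [1, 2], overlapping matches are reported
findsequence1x([1, 2, 3], [5])               # empty vector
```

Like the original, it advances by one position after each hit, so overlapping occurrences are all returned.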



Answer 5:


Note that you can now vectorize in with a dot (in(collection) returns a function that tests membership in that collection):

julia> in([4,6,5]).([4, 6])
2-element BitArray{1}:
 true
 true

and chain with all to get your answer:

julia> all(in([4,6,5]).([4, 6]))
true
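As a small illustration of what this expression does and does not check, the sketch below compares it with the all(f, itr) form, which performs the same membership test without building the intermediate Bool array. The variable names a, b and s are arbitrary.

```julia
a = [4, 6, 5]
b = [4, 6]

all(in(a).(b))       # true; builds a temporary Bool array first
all(in(a), b)        # true; same membership test, no temporary array
all(in(a), [6, 4])   # true as well: order is not checked
all(in(a), [4, 7])   # false: 7 is not in a

# For a large `a`, testing against a Set avoids rescanning the array per element.
s = Set(a)
all(in(s), b)        # true
```

Like issubset, this answers "are all of these values present?", not "does this exact run occur?".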


Source: https://stackoverflow.com/questions/36346005/julia-does-an-array-contain-a-specific-sub-array
