C# Array Subset fetching

清酒与你 2021-01-23 00:30

I have an array of bytes and I want to determine whether its contents exist within another, larger array as a contiguous sequence. What is the simplest way to do this?

3 answers
  • 2021-01-23 00:38

    Try adapting a string search algorithm; they work on byte arrays just as well as on characters. One of the fastest is Boyer-Moore, which is also fairly easy to implement. For binary data, the Knuth-Morris-Pratt (KMP) algorithm can also be very efficient.
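
    As a rough illustration of the Boyer-Moore idea, here is a minimal sketch of the simpler Boyer-Moore-Horspool variant, which keeps only the bad-character shift table. The class name HorspoolSearch and the choice to return 0 for an empty pattern are assumptions made for this example, not part of the original answer.

```csharp
using System;

// Hypothetical helper: Boyer-Moore-Horspool search over byte arrays.
public static class HorspoolSearch
{
    // Returns the index of the first occurrence of pattern in data, or -1.
    public static int IndexOf(byte[] data, byte[] pattern)
    {
        if (pattern.Length == 0) return 0;            // assumption: empty pattern matches at 0
        if (pattern.Length > data.Length) return -1;

        // shift[b] = how far the window may move when the byte aligned with
        // the pattern's last position is b.
        int[] shift = new int[256];
        for (int b = 0; b < 256; b++) shift[b] = pattern.Length;
        for (int k = 0; k < pattern.Length - 1; k++)
            shift[pattern[k]] = pattern.Length - 1 - k;

        int last = pattern.Length - 1;
        int i = 0; // start of the current window in data
        while (i <= data.Length - pattern.Length)
        {
            // Compare the window against the pattern from right to left.
            int j = last;
            while (j >= 0 && data[i + j] == pattern[j]) j--;
            if (j < 0) return i;

            // Skip ahead based on the byte under the window's last position.
            i += shift[data[i + last]];
        }
        return -1;
    }
}
```

    Because the alphabet is only 256 byte values, the shift table is a small fixed-size array, which is part of why this family of algorithms suits binary data.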

  • 2021-01-23 00:42

    The following is a 1:1 port of this answer: Searching for a sequence of Bytes in a Binary File with Java

    It uses the Knuth-Morris-Pratt algorithm, which is a very efficient way of doing so:

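    public static class KmpSearch {

        // Returns the index of the first occurrence of pattern in data, or -1 if absent.
        public static int IndexOf(byte[] data, byte[] pattern) {
            // Guard: without this, an empty pattern would index pattern[0] and throw.
            if (pattern.Length == 0) return 0;
            if (data.Length < pattern.Length) return -1;

            int[] failure = ComputeFailure(pattern);

            int j = 0; // number of pattern bytes matched so far
            for (int i = 0; i < data.Length; i++) {
                // On a mismatch, fall back to the longest prefix that is still a suffix.
                while (j > 0 && pattern[j] != data[i]) {
                    j = failure[j - 1];
                }
                if (pattern[j] == data[i]) { j++; }
                if (j == pattern.Length) {
                    return i - pattern.Length + 1;
                }
            }
            return -1;
        }

        // failure[i] is the length of the longest proper prefix of
        // pattern[0..i] that is also a suffix of it.
        private static int[] ComputeFailure(byte[] pattern) {
            int[] failure = new int[pattern.Length];

            int j = 0;
            for (int i = 1; i < pattern.Length; i++) {
                while (j > 0 && pattern[j] != pattern[i]) {
                    j = failure[j - 1];
                }
                if (pattern[j] == pattern[i]) {
                    j++;
                }
                failure[i] = j;
            }

            return failure;
        }
    }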
    
  • 2021-01-23 00:51

    The naive approach is:

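    // Requires: using System.Linq; (for Skip, Take and SequenceEqual)
    public static bool IsSubsetOf(byte[] set, byte[] subset) {
        // Try every start index at which subset could still fit inside set.
        for (int i = 0; i + subset.Length <= set.Length; ++i)
            if (set.Skip(i).Take(subset.Length).SequenceEqual(subset))
                return true;
        return false;
    }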
    

    For more efficient approaches, you might consider more advanced string matching algorithms like KMP.
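
    If the runtime provides Span<T> (built into .NET Core 2.1 and later; older targets would need the System.Memory package, which is an assumption about your setup), the same check can be sketched without LINQ by using the MemoryExtensions.IndexOf extension method. The names SpanSearch and ContainsSequence are made up for this example.

```csharp
using System;

// Hypothetical helper: contiguous-subsequence check via spans.
public static class SpanSearch
{
    public static bool ContainsSequence(byte[] data, byte[] pattern)
    {
        // Arrays convert implicitly to ReadOnlySpan<T>; no data is copied.
        ReadOnlySpan<byte> haystack = data;
        ReadOnlySpan<byte> needle = pattern;

        // IndexOf returns the first match position, or -1 if there is none.
        return haystack.IndexOf(needle) >= 0;
    }
}
```

    This avoids creating a new enumerator chain for every start index, which the Skip/Take version does.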
