KMP prefix table

说谎 2020-12-02 07:58

I am reading about the KMP (Knuth-Morris-Pratt) algorithm for string matching.
It requires preprocessing the pattern to build a prefix table.
For example for the string ababaca

7 answers
  • 2020-12-02 08:22
        String string = "abababca";
        int[] array = new int[string.length()];

        int i = 1; // position whose table entry is being computed
        int j = 0; // length of the prefix matched so far

        while (i < string.length()) {
            if (string.charAt(j) == string.charAt(i)) {
                // characters match: the matched prefix grows by one
                j++;
                array[i] = j;
                i++;
            } else if (j != 0) {
                // mismatch: fall back to array[j-1] and retry the same i
                j = array[j - 1];
            } else {
                // mismatch with nothing matched: the entry is 0
                array[i] = 0;
                i++;
            }
        }

        // prints: 0 0 1 2 3 4 0 1
        for (int k : array) {
            System.out.print(k + " ");
        }
    
  • 2020-12-02 08:26

    This may not be the shortest code, but its flow is easy to follow. Simple Java code for calculating the prefix array:

        String pattern = "ababaca";
        int i = 1, j = 0;
        int[] prefixArray = new int[pattern.length()];
        while (i < pattern.length()) {
            // on a mismatch, fall back through shorter matched prefixes
            while (j > 0 && pattern.charAt(i) != pattern.charAt(j)) {
                j = prefixArray[j - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(j)) {
                prefixArray[i] = j + 1;
                i++;
                j++;
            } else {
                prefixArray[i] = j; // j is 0 here
                i++;
            }
        }

        for (int k = 0; k < prefixArray.length; ++k) {
            System.out.print(prefixArray[k] + " ");
        }


    It produces the required output:

    0 0 1 2 3 0 1
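
    To make the fallback step `j = prefixArray[j - 1]` more concrete, here is a minimal Python sketch of the same loop that also records every fallback. The function name `prefix_table_with_trace` and the `(i, old_j, new_j)` tuple format are made up for this illustration.

```python
def prefix_table_with_trace(pattern):
    """Build the prefix table and log each fallback as (i, old_j, new_j).

    Hypothetical helper for illustration only.
    """
    table = [0] * len(pattern)
    fallbacks = []
    j = 0  # length of the prefix matched so far
    for i in range(1, len(pattern)):
        # On a mismatch, retry with the next shorter matched prefix.
        while j > 0 and pattern[i] != pattern[j]:
            fallbacks.append((i, j, table[j - 1]))
            j = table[j - 1]
        if pattern[i] == pattern[j]:
            j += 1
        table[i] = j
    return table, fallbacks


table, fallbacks = prefix_table_with_trace("ababaca")
print(table)      # [0, 0, 1, 2, 3, 0, 1]
print(fallbacks)  # [(5, 3, 1), (5, 1, 0)]
```

    For "ababaca" the only fallbacks happen at i = 5 (the character 'c'): j drops from 3 to 1 and then from 1 to 0, which is why the entry for that position is 0.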

  • 2020-12-02 08:26

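    string text = "ababbabbababbababbabb";
    static int val[30];  // zero-initialized; needs at least text.length() entries

    int j = 0;  // length of the prefix matched so far
    for (size_t i = 1; i < text.length(); i++)
    {
        // on a mismatch, fall back to the next shorter matched prefix
        while (j > 0 && text[j] != text[i])
            j = val[j - 1];
        if (text[j] == text[i])
            j++;
        val[i] = j;
    }


    The required output is stored in val[].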

  • 2020-12-02 08:27

    Each number corresponds to a prefix of the pattern ("a", "ab", "aba", ...) and gives the length of the longest suffix of that prefix that is also a prefix of it. We do not count the whole string as a suffix or prefix of itself here (in Russian the whole string is called a self-suffix and self-prefix; the usual English terms for the ones that remain are proper suffix and proper prefix).

    So we have the string "ababaca". KMP computes the prefix function for every non-empty prefix. Let s[0:i] be the prefix that ends at index i (inclusive) and p[i] the value of the prefix function for it. The matching prefix and suffix may overlap.

    +---+----------+-------+------------------------+
    | i |  s[0:i]  | p[i]  | Matching Prefix/Suffix |
    +---+----------+-------+------------------------+
    | 0 | a        |     0 |                        |
    | 1 | ab       |     0 |                        |
    | 2 | aba      |     1 | a                      |
    | 3 | abab     |     2 | ab                     |
    | 4 | ababa    |     3 | aba                    |
    | 5 | ababac   |     0 |                        |
    | 6 | ababaca  |     1 | a                      |
    +---+----------+-------+------------------------+
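
    The definition can be checked directly with a brute-force sketch in Python. It is quadratic or worse and only meant to reproduce the table above. The function name `prefix_function_bruteforce` is an assumption made for this example.

```python
def prefix_function_bruteforce(s):
    """For every i, the length of the longest proper prefix of s[:i+1]
    that is also a suffix of s[:i+1]. Hypothetical helper for illustration."""
    table = []
    for i in range(len(s)):
        prefix = s[: i + 1]
        best = 0
        # k < len(prefix) keeps both the prefix and the suffix proper.
        for k in range(1, len(prefix)):
            if prefix[:k] == prefix[-k:]:
                best = k
        table.append(best)
    return table


print(prefix_function_bruteforce("ababaca"))  # [0, 0, 1, 2, 3, 0, 1]
```

    The printed list matches the p[i] column of the table row by row.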
    

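    Simple C++ code that computes the prefix function of a string s:

    #include <string>
    #include <vector>
    using namespace std;

    vector<int> prefixFunction(const string& s) {
        vector<int> p(s.size());  // p[0] is 0 by definition
        int j = 0;                // length of the prefix matched so far
        for (int i = 1; i < (int)s.size(); i++) {
            // on a mismatch, fall back to the next shorter candidate
            while (j > 0 && s[j] != s[i])
                j = p[j-1];

            if (s[j] == s[i])
                j++;
            p[i] = j;
        }
        return p;
    }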
    
  • 2020-12-02 08:27

    String pattern = "ababaca";
    int i = 1, j = 0;
    int[] prefixArray = new int[pattern.length()];

    while (i < pattern.length()) {
        // on a mismatch, fall back through shorter matched prefixes
        while (j > 0 && pattern.charAt(i) != pattern.charAt(j)) {
            j = prefixArray[j - 1];
        }

        if (pattern.charAt(i) == pattern.charAt(j)) {
            prefixArray[i] = j + 1;
            i++;
            j++;
        } else {
            prefixArray[i] = j; // j is 0 here
            i++;
        }
    }

    for (int k = 0; k < prefixArray.length; ++k) {
        System.out.println(prefixArray[k]);
    }
    
  • 2020-12-02 08:35

    Python Implementation

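    p = 'ababaca'

    j = 0         # length of the prefix matched so far
    i = 1         # position in p being processed
    prefix = [0]  # a single character has no proper prefix

    while len(prefix) < len(p):
        if p[j] == p[i]:
            # match: the matched prefix grows by one character
            prefix.append(j + 1)
            i += 1
            j += 1
        elif j == 0:
            # mismatch with nothing matched: the entry is 0
            prefix.append(0)
            i += 1
        else:
            # mismatch: fall back and retry the same i
            j = prefix[j - 1]

    print(prefix)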
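
    As a usage sketch, the prefix table can drive the matching phase of KMP: on a mismatch the pattern index falls back through the table and the text index never moves backwards. The names `build_prefix` and `kmp_search` are assumptions made for this example, not part of any library.

```python
def build_prefix(p):
    """Prefix table of pattern p (hypothetical helper name)."""
    prefix = [0] * len(p)
    j = 0  # length of the prefix matched so far
    for i in range(1, len(p)):
        while j > 0 and p[i] != p[j]:
            j = prefix[j - 1]
        if p[i] == p[j]:
            j += 1
        prefix[i] = j
    return prefix


def kmp_search(text, p):
    """Return the start indices of all occurrences of p in text.

    Hypothetical helper; an empty pattern is treated as having no matches.
    """
    if not p:
        return []
    prefix = build_prefix(p)
    matches = []
    j = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        # On a mismatch, reuse the table instead of re-reading the text.
        while j > 0 and ch != p[j]:
            j = prefix[j - 1]
        if ch == p[j]:
            j += 1
        if j == len(p):
            matches.append(i - j + 1)
            j = prefix[j - 1]  # keep going to find overlapping matches
    return matches


print(kmp_search("abababacababaca", "ababaca"))  # [2, 8]
```

    The search visits each text character once, so after building the table the matching phase takes time linear in the length of the text.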
    