OCR: weighted Levenshtein distance

执念已碎 2021-02-09 04:23

I'm trying to create an optical character recognition system with a dictionary.

In fact, I don't have an implemented dictionary yet =)

I've heard that there a

3 answers
  •  忘掉有多难
    2021-02-09 05:08

    Here is an example (C#) in which the weight of the "replace character" operation depends on the distance between the character codes:

        static double WeightedLevenshtein(string b1, string b2) {
            // Compare case-insensitively.
            b1 = b1.ToUpper();
            b2 = b2.ToUpper();

            // matrix[i, j] = cost of turning the first i chars of b1 into the first j chars of b2.
            double[,] matrix = new double[b1.Length + 1, b2.Length + 1];

            // First column: remove i characters from b1.
            for (int i = 1; i <= b1.Length; i++) {
                matrix[i, 0] = i;
            }

            // First row: add j characters of b2.
            for (int j = 1; j <= b2.Length; j++) {
                matrix[0, j] = j;
            }

            for (int i = 1; i <= b1.Length; i++) {
                for (int j = 1; j <= b2.Length; j++) {
                    double distance_replace = matrix[i - 1, j - 1];
                    if (b1[i - 1] != b2[j - 1]) {
                        // Cost of replace: character-code distance scaled by the A..Z range,
                        // so it stays within (0, 1] for letters (it can exceed 1 for other characters).
                        distance_replace += Math.Abs((double)b1[i - 1] - b2[j - 1]) / ('Z' - 'A');
                    }

                    // Cost of remove = 1
                    double distance_remove = matrix[i - 1, j] + 1;
                    // Cost of add = 1
                    double distance_add = matrix[i, j - 1] + 1;

                    matrix[i, j] = Math.Min(distance_replace,
                                        Math.Min(distance_add, distance_remove));
                }
            }

            return matrix[b1.Length, b2.Length];
        }
    

    You can see how it works here: http://ideone.com/RblFK
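To tie this back to the dictionary idea in the question, here is a rough sketch of how such a distance might be used to pick the closest dictionary word for an OCR result. The class name `OcrDictionaryDemo`, the helper `ClosestWord`, and the three-word dictionary are made up for illustration; the distance function repeats the recurrence from the answer so that the snippet compiles on its own.

```csharp
using System;
using System.Linq;

static class OcrDictionaryDemo
{
    // Same recurrence as in the answer above, repeated so this sketch is self-contained.
    public static double WeightedLevenshtein(string a, string b)
    {
        a = a.ToUpperInvariant();
        b = b.ToUpperInvariant();
        var d = new double[a.Length + 1, b.Length + 1];
        for (int i = 1; i <= a.Length; i++) d[i, 0] = i;   // remove i characters
        for (int j = 1; j <= b.Length; j++) d[0, j] = j;   // add j characters

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                double replace = d[i - 1, j - 1];
                if (a[i - 1] != b[j - 1])
                {
                    // Replace cost: character-code distance scaled by the A..Z range.
                    replace += Math.Abs(a[i - 1] - b[j - 1]) / (double)('Z' - 'A');
                }
                double remove = d[i - 1, j] + 1;
                double add = d[i, j - 1] + 1;
                d[i, j] = Math.Min(replace, Math.Min(add, remove));
            }
        }
        return d[a.Length, b.Length];
    }

    // Hypothetical helper: the dictionary word with the smallest weighted distance.
    public static string ClosestWord(string ocrText, string[] dictionary)
    {
        return dictionary.OrderBy(w => WeightedLevenshtein(ocrText, w)).First();
    }

    static void Main()
    {
        string[] dictionary = { "HOUSE", "MOUSE", "HORSE" };   // made-up sample dictionary
        string ocrText = "HOUSF";                               // made-up OCR output

        foreach (var word in dictionary)
        {
            Console.WriteLine(word + ": " + WeightedLevenshtein(ocrText, word));
        }
        Console.WriteLine("closest: " + ClosestWord(ocrText, dictionary));
    }
}
```

With these inputs, "HOUSE" comes out closest, because replacing 'F' with 'E' costs only 1/25. Note that character-code distance is just a stand-in weight: for real OCR output you would probably want a cost based on how visually similar two glyphs are (for example 'O' and '0'), which this sketch does not attempt.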
