Levenshtein distance: how to better handle words swapping positions?

忘掉有多难 2021-01-30 02:22

I've had some success comparing strings using the PHP levenshtein function.

However, for two strings which contain substrings that have swapped positions, the algorithm

9 answers
  •  野趣味 (OP)
    2021-01-30 02:58

    Take this answer and make the following change:

    void match(trie t, char* w, string s, int budget){
      if (budget < 0) return;
      if (*w=='\0') print s;
      foreach (char c, subtrie t1 in t){
        /* try matching or replacing c */
        match(t1, w+1, s+c, (*w==c ? budget : budget-1));
        /* try deleting c */
        match(t1, w, s, budget-1);
      }
      /* try inserting *w */
      match(t, w+1, s + *w, budget-1);
      /* TRY SWAPPING FIRST TWO CHARACTERS */
      if (w[1]){
        swap(w[0], w[1]);
        match(t, w, s, budget-1);
        swap(w[0], w[1]);
      }
    }
    

    This is for dictionary search in a trie, but for matching to a single word, it's the same idea. You're doing branch-and-bound, and at any point, you can make any change you like, as long as you give it a cost.
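
For the single-word case, the same branch-and-bound idea can be sketched in Python roughly like this. It is an illustration only, not the code from the linked answer: the names `matches` and `distance` are made up here, every edit (replace, skip a character on either side, swap the first two characters) is assumed to cost 1, and the search is exponential, so it is only practical for short strings and small budgets.

```python
def matches(w: str, t: str, budget: int) -> bool:
    """Return True if w can be edited into t at a total cost <= budget.

    Hypothetical helper: all edit costs are assumed to be 1.
    """
    if budget < 0:
        return False              # bound: this branch is already too expensive
    if not w:
        return len(t) <= budget   # leftover target characters cost 1 each
    if not t:
        return len(w) <= budget   # leftover word characters cost 1 each

    # Match (free) or replace (cost 1) the first character.
    if matches(w[1:], t[1:], budget if w[0] == t[0] else budget - 1):
        return True
    # Skip the first character of the target.
    if matches(w, t[1:], budget - 1):
        return True
    # Skip the first character of the word.
    if matches(w[1:], t, budget - 1):
        return True
    # Swap the first two characters of the word and keep matching.
    if len(w) > 1 and w[0] != w[1]:
        if matches(w[1] + w[0] + w[2:], t, budget - 1):
            return True
    return False


def distance(w: str, t: str, max_budget: int):
    """Smallest budget in 0..max_budget for which matches() succeeds, else None."""
    for budget in range(max_budget + 1):
        if matches(w, t, budget):
            return budget
    return None


print(distance("abcd", "abdc", 3))
```

With the swap branch, "abcd" against "abdc" costs 1, where plain Levenshtein would charge 2. This only covers two adjacent characters trading places; a longer block changing position would need its own branch with its own cost, in the same spirit.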
