Search suggestion in strings

六眼飞鱼酱① submitted on 2019-12-11 04:01:27

Question


I have a text file containing: mariam amr sara john jessy salma mkkkkkaooooorllll

The user enters a word to search for, for example: maram

As you can see, it does not exist in my text file. I want to suggest similar words; for maram, the similar word is mariam.

I used the longest common subsequence, but it returns both mariam and mkkkkkaooooorllll, because both contain the longest common subsequence "mar".

I want to force the choice of mariam only. Any ideas?

Thanks in advance

/**
 * Java program implementing the Longest Common Subsequence algorithm.
 */

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;

public class LongestCommonSubsequence
{
    /** Returns the longest common subsequence of str1 and str2. */
    public String lcs(String str1, String str2)
    {
        int l1 = str1.length();
        int l2 = str2.length();

        // arr[i][j] holds the LCS length of str1.substring(i) and str2.substring(j)
        int[][] arr = new int[l1 + 1][l2 + 1];

        for (int i = l1 - 1; i >= 0; i--)
        {
            for (int j = l2 - 1; j >= 0; j--)
            {
                if (str1.charAt(i) == str2.charAt(j))
                    arr[i][j] = arr[i + 1][j + 1] + 1;
                else
                    arr[i][j] = Math.max(arr[i + 1][j], arr[i][j + 1]);
            }
        }

        // Walk the table from the top-left corner to rebuild the subsequence
        int i = 0, j = 0;
        StringBuilder sb = new StringBuilder();
        while (i < l1 && j < l2)
        {
            if (str1.charAt(i) == str2.charAt(j))
            {
                sb.append(str1.charAt(i));
                i++;
                j++;
            }
            else if (arr[i + 1][j] >= arr[i][j + 1])
                i++;
            else
                j++;
        }

        // TODO: read the text file; if a word contains sb.toString(), print it
        return sb.toString();
    }

    /** Main function */
    public static void main(String[] args) throws IOException
    {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Longest Common Subsequence Algorithm Test\n");

        System.out.println("\nEnter string 1");
        String str1 = br.readLine();

        System.out.println("\nEnter string 2");
        String str2 = br.readLine();

        LongestCommonSubsequence obj = new LongestCommonSubsequence();
        String result = obj.lcs(str1, str2);

        System.out.println("\nLongest Common Subsequence : " + result);
    }
}


Answer 1:


There are a few techniques for fuzzy matching like this. Apache Commons provides some excellent tools for comparing how similar two strings are to one another. Check out the Javadoc for the Levenshtein distance and Jaro-Winkler distance calculation methods.

With Levenshtein Distance, the lower the score, the more similar the strings are:

StringUtils.getLevenshteinDistance("frog", "fog") == 1
StringUtils.getLevenshteinDistance("fly", "ant") == 3
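To make the score less of a black box, here is a minimal plain-Java sketch of how a Levenshtein distance can be computed with dynamic programming. It is a hand-written stand-in, not the Apache Commons source; the class name `EditDistanceSketch` and the method name `levenshtein` are made up for illustration.

```java
// Hypothetical helper class; not part of Apache Commons.
class EditDistanceSketch {

    /** Returns the number of single-character inserts, deletes, or substitutions needed to turn a into b. */
    static int levenshtein(String a, String b) {
        // prev[j] = distance between the first (i - 1) chars of a and the first j chars of b
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building b's prefix from an empty string takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // reducing a's prefix to an empty string takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1], prev[j]) + 1;
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Swap the rows so that curr becomes prev for the next character of a
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("frog", "fog"));     // prints 1
        System.out.println(levenshtein("fly", "ant"));      // prints 3
        System.out.println(levenshtein("maram", "mariam")); // prints 1
    }
}
```

Unlike the longest common subsequence, this score grows with every extra character, so a long word such as mkkkkkaooooorllll ends up far from maram even though it contains "mar".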

You could also consider calculating the Double Metaphone for each string - this will allow you to determine how similar the strings 'sound' when spoken, even if they aren't necessarily spelt similarly.

Back to your question - using these tools, you could throw up suggestions if the user's search term is within a certain threshold of any of the strings in your text file.
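As a rough sketch of the threshold idea, the following plain-Java example filters the word list from the question by edit distance and sorts the hits so the closest word comes first. The class `SuggestionSketch`, the methods `distance` and `suggest`, and the `maxDistance` parameter are all hypothetical names; the distance function is written out by hand only so that the example compiles on its own, and a library call could be swapped in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; names and threshold values are assumptions, not from the answer.
class SuggestionSketch {

    /** Plain full-table Levenshtein distance between a and b. */
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j], d[i][j - 1]) + 1, d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    /** Returns the words within maxDistance edits of query, closest first. */
    static List<String> suggest(String query, List<String> words, int maxDistance) {
        List<String> hits = new ArrayList<>();
        for (String word : words) {
            if (distance(query, word) <= maxDistance) {
                hits.add(word);
            }
        }
        hits.sort(Comparator.comparingInt(word -> distance(query, word)));
        return hits;
    }

    public static void main(String[] args) {
        // Word list taken from the question's text file
        List<String> words = Arrays.asList(
                "mariam", "amr", "sara", "john", "jessy", "salma", "mkkkkkaooooorllll");
        System.out.println(suggest("maram", words, 1)); // prints [mariam]
    }
}
```

The threshold is a tuning knob: with a limit of 1 only mariam is suggested for maram, while a limit of 2 also lets sara through, so it may be worth scaling the limit with the length of the search term.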



Source: https://stackoverflow.com/questions/31159227/search-suggestion-in-strings
