Implementing a best match search in Java

北海茫月 2021-02-08 04:54

I am trying to get best-match string matching working using the existing Java data structures. It is quite slow, though; any suggestions for improving its performance are welcome.

2 Answers
  •  逝去的感伤
    2021-02-08 05:29

    I prefer the TreeMap answer, but for completeness, here is the same algorithm using binary search.

    import java.util.Arrays;
    import java.util.Comparator;

    String[][] data = {
            { "0060175559138", "VIP" },           // <-- found insert position
            { "00601755511", "International" },   // <-- skipped
            { "00601755510", "International" },   // <-- skipped
            { "006017555", "National" },          // <-- final find
            { "006017", "Local" },
            { "0060", "X" },
    };
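    // Typed comparator: with a raw Comparator the lambda parameters are Object and lhs[0] does not compile
    Comparator<String[]> comparator = (lhs, rhs) -> lhs[0].compareTo(rhs[0]);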
    Arrays.sort(data, comparator);
    
    String searchKey = "0060175552020";
    int ix = Arrays.binarySearch(data, new String[] { searchKey }, comparator);
    if (ix < 0) {
        ix = ~ix; // Not found, insert position
        --ix;
        while (ix >= 0) {
            if (searchKey.startsWith(data[ix][0])) {
                break;
            }
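            // Every entry before the insert position sorts before searchKey, so a
            // compareTo(...) < 0 check can never fire here; if nothing matches,
            // the loop simply ends with ix == -1.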
            --ix;
        }
    }
    if (ix == -1) {
        System.out.println("Not found");
    } else {
        System.out.printf("Found: %s - %s%n", data[ix][0], data[ix][1]);
    }
    

    This algorithm starts with a logarithmic binary search and then scans backwards in a loop. If no entries have to be skipped, the whole lookup takes logarithmic time, which is fine. So the question is how many entries need to be skipped.

    If every element stored a reference to its own longest prefix in the table (for example, { "00601755511", "International" } pointing back to { "006017555", "National" }), you would only need to follow those prefix back links instead of checking every skipped entry.
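    The back-link idea could be sketched roughly as follows. This is only an illustration under some assumptions: the class PrefixLinkSearch, the Entry type, its parent field, and the build and lookup methods are hypothetical names, and the prefixes in the table are assumed to be distinct. After sorting, each entry's longest stored prefix is always somewhere on the back-link chain of the entry just before it, so the links can be filled in with one pass.

```java
import java.util.Arrays;
import java.util.Comparator;

class PrefixLinkSearch {

    /** One table row plus the index of its longest stored proper prefix (-1 if none). */
    static final class Entry {
        final String prefix;
        final String label;
        int parent = -1;

        Entry(String prefix, String label) {
            this.prefix = prefix;
            this.label = label;
        }
    }

    static final Comparator<Entry> BY_PREFIX = Comparator.comparing(e -> e.prefix);

    /** Sorts the rows and computes every back link. Assumes distinct prefixes. */
    static Entry[] build(String[][] data) {
        Entry[] entries = new Entry[data.length];
        for (int i = 0; i < data.length; i++) {
            entries[i] = new Entry(data[i][0], data[i][1]);
        }
        Arrays.sort(entries, BY_PREFIX);
        for (int i = 1; i < entries.length; i++) {
            // Start at the previous entry and walk its chain until a prefix of entry i is found.
            int p = i - 1;
            while (p >= 0 && !entries[i].prefix.startsWith(entries[p].prefix)) {
                p = entries[p].parent;
            }
            entries[i].parent = p;
        }
        return entries;
    }

    /** Returns the label of the longest stored prefix of searchKey, or null if none matches. */
    static String lookup(Entry[] entries, String searchKey) {
        int ix = Arrays.binarySearch(entries, new Entry(searchKey, null), BY_PREFIX);
        if (ix < 0) {
            ix = ~ix - 1; // entry just before the insert position
        }
        // Follow back links instead of visiting every skipped entry.
        while (ix >= 0 && !searchKey.startsWith(entries[ix].prefix)) {
            ix = entries[ix].parent;
        }
        return ix >= 0 ? entries[ix].label : null;
    }

    public static void main(String[] args) {
        Entry[] table = build(new String[][] {
                { "0060175559138", "VIP" },
                { "00601755511", "International" },
                { "00601755510", "International" },
                { "006017555", "National" },
                { "006017", "Local" },
                { "0060", "X" },
        });
        System.out.println(lookup(table, "0060175552020")); // National
    }
}
```

    With the sample data, the lookup for "0060175552020" lands on "00601755511" and reaches "006017555" in a single hop. The number of hops is bounded by how deeply the stored prefixes nest inside each other, not by how many unrelated entries sit in between.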
