What is the best algorithm for matching two strings containing fewer than 10 words in Latin script?

刺人心 2021-02-04 09:35

I'm comparing song titles, mostly in Latin script (although not always). My aim is an algorithm that gives a high score if the two song titles seem to be the same title and a

5 answers
  •  慢半拍i (original poster)
    2021-02-04 10:27

    Interesting. Have you thought about a radix sort?

    http://en.wikipedia.org/wiki/Radix_sort

    Radix sort is a non-comparative integer sorting algorithm: it sorts data with integer keys by grouping the keys by their individual digits. If you convert each string into an array of character codes, each code is a number with no more than 3 digits (which holds for basic Latin script), so k = 3 (the maximum number of digits) and n = the number of strings to compare. This sorts on the first character of every string. You then have another factor, s = the length of the longest string. Your worst case for sorting would be 3*n*s and the best case would be (3 + n) * s. Check out some radix sort examples for strings here:

    http://algs4.cs.princeton.edu/51radix/LSD.java.html

    http://users.cis.fiu.edu/~weiss/dsaajava3/code/RadixSort.java
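
    The idea above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the links: the function name `lsd_radix_sort` is made up for this example, it does one counting pass per character position rather than one per decimal digit, and it assumes shorter titles should sort before longer ones that share the same prefix.

```python
def lsd_radix_sort(strings):
    """Sort strings by code point using LSD radix sort (hypothetical helper)."""
    if not strings:
        return []
    width = max(len(s) for s in strings)
    # One bucket per possible code point, plus bucket 0 for "no character here".
    radix = max((ord(c) for s in strings for c in s), default=0) + 2
    result = list(strings)
    # Process positions from last to first; each pass is a stable counting sort.
    for pos in range(width - 1, -1, -1):
        # Key 0 means the string is too short at this position, so it sorts first.
        keys = [ord(s[pos]) + 1 if pos < len(s) else 0 for s in result]
        counts = [0] * (radix + 1)
        for k in keys:
            counts[k + 1] += 1
        # Prefix sums turn counts into the starting index of each bucket.
        for r in range(radix):
            counts[r + 1] += counts[r]
        out = [None] * len(result)
        for s, k in zip(result, keys):
            out[counts[k]] = s
            counts[k] += 1
        result = out
    return result


titles = ["Let It Be", "Hey Jude", "Help", "Let It Bleed", "Hello"]
print(lsd_radix_sort(titles))
```

    Sorting this way puts identical titles, and titles sharing a prefix, next to each other. It does not by itself produce a similarity score.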
