soundex | 易学教程

Levenshtein distance based methods Vs Soundex

As per this comment in a related thread, I'd like to know why Levenshtein distance based methods are better than Soundex. Soundex is rather primitive: it was originally developed to be calculated by hand, and it produces a key that can be compared. Soundex works well with Western names, as it was originally developed for US census data, and it is intended for phonetic comparison. Levenshtein distance looks at two values and produces a number based on their similarity, counting inserted, missing, or substituted letters. Basically, Soundex is better for finding that "Schmidt" and "Smith" might be the same
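To make the contrast concrete, here is a minimal sketch of Levenshtein distance using the standard dynamic-programming recurrence. The function name `levenshtein` is an illustrative choice, not something taken from the thread, and the comparison is case-sensitive by assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("Schmidt", "Smith"))  # 4
print(levenshtein("Smith", "Smyth"))    # 1
```

"Schmidt" and "Smith" are four edits apart even though both produce the Soundex key S530, which is why an edit-distance threshold alone can miss names that sound alike.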

Finding similar sounding text in VBA [closed]

Question: My manager tells me that there is a way to evaluate names that are spelled differently but sound similar in the way they are pronounced. Ideally, we want to be able to evaluate a user-entered search name and return exact matches as well as "similar sounding" names. He called the process "Soundits" but I cannot

Soundex algorithm in Python (homework help request)

Question: The US Census Bureau uses a special encoding called “soundex” to locate information about a person. The soundex is an encoding of surnames (last names) based on the way a surname sounds rather than the way it is spelled. Surnames that sound the same but are spelled differently, like SMITH and SMYTH, have the same code and are filed together. The soundex coding system was developed so that you can find a surname even though it may have been recorded under various spellings. In this lab you
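The encoding described above can be sketched as follows. This is one reading of the commonly documented American Soundex rules (first letter kept, three digits, H and W not separating equal codes), not the solution the lab expects; the names `CODES` and `soundex` are illustrative, and the sketch assumes plain ASCII surnames.

```python
# Map each coded consonant to its Soundex digit; vowels, H, W, and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a four-character Soundex key, or "" if name has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0]                  # the first letter is kept as-is
    prev = CODES.get(letters[0], "")  # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                  # H and W do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code                   # a vowel resets prev, allowing a repeat
    return (key + "000")[:4]          # pad with zeros, then cut to 4 characters


print(soundex("SMITH"), soundex("SMYTH"))  # S530 S530
```

Because both spellings map to the same key, records can be grouped or looked up by `soundex(surname)` rather than by the exact spelling.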

Enabling soundex/metaphone for non-English characters

Question: I've been studying soundex, metaphone and other string search techniques the past few days, and in my understanding both algorithms work well in handling non-English words transliterated to English. However, my requirement is for such a search to work in the original, untransliterated languages, accommodating alphabets such as German, Norwegian, and even Cyrillic. Are there any search algorithms capable of handling these alphabets completely? Or am I better off using