Question
I am using the SOUNDEX and DIFFERENCE functions to analyze the data in a table, but they fail on data like the example below. ITEM TYPE and ITEM SIZE are completely different values, yet the functions treat them as the same.
SELECT SOUNDEX('ITEM TYPE'), SOUNDEX('ITEM SIZE')
Output:
I350 I350
For DIFFERENCE on the same pair the output is 4, the highest similarity score it can return.
I understand that not every analysis the human mind performs can be coded, but I would still like to ask: are there any other functions in SQL Server that would help me with the next level of analysis?
Answer 1:
You can use an edit-distance algorithm, such as Damerau–Levenshtein distance.
The Damerau–Levenshtein distance between two words is the minimum number of operations (consisting of insertions, deletions or substitutions of a single character, or transposition of two adjacent characters) required to change one word into the other.
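To make the definition above concrete, here is a minimal sketch of the idea. It is written in Python purely for illustration (it is not T-SQL and is not the linked implementation), and it uses the restricted variant usually called optimal string alignment, which counts the same four operations but never edits the same substring twice. The function name `osa_distance` is my own.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Illustrative sketch only; the name and structure are assumptions,
    not taken from any SQL Server or third-party implementation.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = number of edits needed to turn a[:i] into b[:j]
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # transposition of two adjacent characters
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


print(osa_distance("ITEM TYPE", "ITEM SIZE"))  # 3
```

Unlike SOUNDEX, which maps both strings to I350, this gives ITEM TYPE and ITEM SIZE a distance of 3 (three substitutions), so the two values can be told apart. What threshold counts as "similar" is something you would have to choose for your own data.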
There are T-SQL implementations, such as this one by Steve Hatchett. Alternatively, you can use an implementation in C#, compile it to a DLL and load it into SQL Server as a CLR assembly. The compiled version should be faster.
For more information on loading CLR assemblies into SQL Server, see "CLR Assembly C# inside SQL Server".
Source: https://stackoverflow.com/questions/43389034/beyond-soundex-difference-sql-server