ALGORITHM - String similarity score/hash

遇见更好的自我 2021-02-01 10:16

Is there a method to calculate something like a general "similarity score" of a string? That is, I am not comparing two strings with each other, but rather I get some number/score

8 answers
  •  -上瘾入骨i
    2021-02-01 10:38

    You can always use the Levenshtein distance; there is also an existing implementation of it: http://code.google.com/p/pylevenshtein/
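
If you would rather not install anything, the Levenshtein distance can be sketched in plain Python. This is only a minimal illustration of the classic dynamic-programming approach, not the pylevenshtein code; the names `levenshtein` and `similarity` are made up for this example, and dividing by the longer length is just one possible way to get a score between 0 and 1. Note that it is still a pairwise measure.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 for identical strings, 0.0 when every
    position has to change."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

For example, `levenshtein("kitten", "sitting")` is 3: two substitutions and one insertion.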

    But, for simplicity, you can use the built-in difflib module:

    >>> import difflib
    >>> l = {'Hello Earth', 'Hello World!', 'Foo Bar!', 'Foo world!', 'Foo bar', 'Hello World', 'FooBarbar'}
    >>> difflib.get_close_matches("Foo World", l)
    ['Foo world!', 'Hello World', 'Hello World!']
    

    http://docs.python.org/library/difflib.html#difflib.get_close_matches
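
If you want the actual numbers rather than just the list of matches, `difflib.SequenceMatcher(...).ratio()` returns a float between 0 and 1, and `get_close_matches` uses that same ratio internally with a default cutoff of 0.6. A small sketch, assuming the same example strings as above; the helper names `ratio` and `rank` are invented for illustration:

```python
import difflib


def ratio(a: str, b: str) -> float:
    """Similarity of two strings in [0, 1], as computed by difflib."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def rank(word: str, candidates) -> list:
    """Return (score, candidate) pairs, best match first."""
    return sorted(((ratio(c, word), c) for c in candidates), reverse=True)


candidates = ['Hello Earth', 'Hello World!', 'Foo Bar!', 'Foo world!',
              'Foo bar', 'Hello World', 'FooBarbar']

for score, candidate in rank("Foo World", candidates):
    print(f"{score:.3f}  {candidate}")
```

Like the Levenshtein distance, this scores a pair of strings, so you still need a reference string to compare against.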
