find duplicate addresses in database, stop users entering them early?

前端 未结 15 1268
长发绾君心
长发绾君心 2021-02-04 05:13

How do I find duplicate addresses in a database, or better stop people already when filling in the form ? I guess the earlier the better?

Is there any good way of abstra

15条回答
  •  南方客
    南方客 (楼主)
    2021-02-04 05:56

    Machine learning and AI has algorithms to find string similarities and duplicate measures.

    Record linkage or the task of matching equivalent records that differ syntactically—was first explored in the late 1950s and 1960s.

    You can represent every pair of records using a vector of features that describe the similarity between individual record fields.

    For example, Adaptive Duplicate Detection Using Learnable String Similarity Measures. for example, read this doc

    1. You can use generic or manually tuned distance metrics for estimating the similarity of potential duplicates.

    2. You can use adaptive name matching algorithms, like Jaro metric, which is based on the number and order of common characters between two strings.

    3. Token-based and hybrid distance. In such cases, we can convert the strings s and t to token multisets (where each token is a word) and consider similarity metrics on these multisets.

提交回复
热议问题