Find duplicate addresses in a database, or stop users from entering them early?

长发绾君心 2021-02-04 05:13

How do I find duplicate addresses in a database, or better, stop people from entering them while they are filling in the form? I guess the earlier the better?

Is there any good way of abstra

15 answers
  • 2021-02-04 05:42

    I realize that the original post is specific to German addresses, but this is a good question for addresses in general.

    In the United States, there is a part of an address called a delivery point barcode. It's a unique 12-digit number that identifies a single point of delivery and can serve as the unique identifier of an address. To get this value you'll want to use an address verification or address standardization web service API, which can cost about $20/mo depending upon the volume of requests you make to it.

    In the interest of full disclosure, I'm the founder of SmartyStreets. We offer just such an address validation web service API called LiveAddress. You're more than welcome to contact me personally with any questions you have.

  • 2021-02-04 05:43

    You could use the Google Geocoding API.

    It does in fact give results for both of your examples; I just tried it. That way you get structured results that you can save in your database. If the lookup fails, ask the user to write the address in another way.
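
    A minimal sketch of how a geocoding result could serve as a duplicate key, assuming the response has already been fetched and parsed from JSON. The function name `dedupe_key_from_geocode` is hypothetical; the fields `status`, `results`, `place_id` and `formatted_address` are the ones the Geocoding API returns, and the sample values used to try it out are made up.

```python
from typing import Optional


def dedupe_key_from_geocode(response: dict) -> Optional[str]:
    """Return a stable key for the first geocoding result.

    Returns None when the lookup failed, in which case the user
    should be asked to write the address in another way.
    """
    if response.get("status") != "OK" or not response.get("results"):
        return None
    first = response["results"][0]
    # Prefer the place identifier; fall back to the formatted address.
    return first.get("place_id") or first.get("formatted_address")
```

    Two spellings that geocode to the same result then produce the same key, so the key can be stored in a column and compared on every new entry.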

  • 2021-02-04 05:43

    Often you use constraints in a database to ensure that data is "unique" in the database sense.

    Regarding "isomorphisms", I think you are on your own, i.e. you have to write the code yourself. If you do it in the database, you could use a trigger.
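
    As a rough illustration of the constraint approach, here is a sketch using SQLite from Python. The table layout and the helpers `normalize`, `make_db` and `add_address` are assumptions made for this example, and the normalizer is deliberately crude (lowercase, collapsed whitespace).

```python
import sqlite3


def normalize(address: str) -> str:
    """Hypothetical normalizer: lowercase and collapse whitespace."""
    return " ".join(address.lower().split())


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with a UNIQUE constraint on the normalized form."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE addresses ("
        " id INTEGER PRIMARY KEY,"
        " raw TEXT NOT NULL,"
        " normalized TEXT NOT NULL UNIQUE)"
    )
    return conn


def add_address(conn: sqlite3.Connection, raw: str) -> bool:
    """Insert an address; return False if the database rejects it as a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO addresses (raw, normalized) VALUES (?, ?)",
                (raw, normalize(raw)),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

    The constraint only catches entries whose normalized form is identical, so "Quellenstr. 66a-11" would still be accepted next to "Quellenstrasse 66/11". How much equivalence the normalizer encodes is the part you have to write yourself.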

  • 2021-02-04 05:47

    I'm looking for an answer addressing United States addresses

    The issue in question is how to prevent users from entering duplicates like

    Quellenstrasse 66/11 and Quellenstr. 66a-11

    This happens when you let your users enter the complete address in a single input box.

    There are some methods you can use to prevent this.

    1. Uniform formatting using RegEx

    • You can prompt users to enter the details in a uniform format.
    • That also makes querying very efficient.
    • Test the value the user entered against some regular expressions and, if it fails, ask the user to correct it.

    2. Use a map API like Google Maps and ask the user to select the details from it.

    • If you choose Google Maps, you can achieve this using reverse geocoding.

    From Google Developer's guide,

    The term geocoding generally refers to translating a human-readable address into a location on a map. The process of doing the opposite, translating a location on the map into a human-readable address, is known as reverse geocoding.

    3. Allow heterogeneous data as shown in the question and compare it in different formats.

    • In the question, the OP allows addresses in different formats.
    • In that case, you can convert the entered address into the different forms and check each of them against the database.
    • This may take more time, and the time depends entirely on the number of test cases.

    4. Split the address into separate parts, store them separately in the database, and give the user a matching form.

    • That is, provide separate fields for street, city, state, etc. in the database.
    • Also provide separate input fields for the user to enter street, city, state, etc. in top-down order.
    • When the user enters the state, narrow the duplicate query to that state only.
    • When the user enters the city, narrow it to that city only.
    • When the user enters the street, narrow it to that street.

    And finally

    • When the user enters the address, convert it to the different formats and test it against the database.

    This is efficient: even though the number of test cases may be high, the number of entries you test against will be very small, so it takes very little time.
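
    A minimal sketch of approach 4 combined with a normalizing comparison, under stated assumptions: the `Address` fields, the `street_key` rules (expanding "str." and treating "/", "-" and other punctuation as separators) and the in-memory list standing in for the database are all illustrative, not a complete set of rules for any country.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Address:
    state: str
    city: str
    street: str


def street_key(street: str) -> str:
    """Reduce a street line to a comparable form (illustrative rules only)."""
    s = street.lower().replace("ß", "ss")
    s = re.sub(r"str\.", "strasse", s)  # expand the abbreviated suffix "str."
    s = re.sub(r"[\W_]+", " ", s)       # "/", "-", "," and similar become separators
    return " ".join(s.split())


def find_duplicates(existing: List[Address], candidate: Address) -> List[Address]:
    """Narrow by state and city first, then compare normalized street lines."""
    same_area = [
        a for a in existing
        if a.state.casefold() == candidate.state.casefold()
        and a.city.casefold() == candidate.city.casefold()
    ]
    key = street_key(candidate.street)
    return [a for a in same_area if street_key(a.street) == key]
```

    With these rules "Quellenstr. 66-11" matches a stored "Quellenstrasse 66/11", while "Quellenstr. 66a-11" does not, because "66a" is kept as a distinct house number. Whether such a suffix should be ignored is a policy decision rather than something the code can decide.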

  • 2021-02-04 05:51

    In the USA, you can use USPS Address Standardization Web Tool. It verifies and normalizes addresses for you. This way, you can normalize the address before checking if it already exists in the database. If all the addresses in the database are already normalized, you'll be able to spot duplicates easily.

    Sample URL:

    https://production.shippingapis.com/ShippingAPI.dll?API=Verify&XML=insert_request_XML_here

    Sample request:

    <AddressValidateRequest USERID="XXXXX">
      <IncludeOptionalElements>true</IncludeOptionalElements>
      <ReturnCarrierRoute>true</ReturnCarrierRoute>
      <Address ID="0">  
        <FirmName />   
        <Address1 />   
        <Address2>205 bagwell ave</Address2>   
        <City>nutter fort</City>   
        <State>wv</State>   
        <Zip5></Zip5>   
        <Zip4></Zip4> 
      </Address>      
    </AddressValidateRequest>
    

    Sample response:

    <AddressValidateResponse>
      <Address ID="0">
        <Address2>205 BAGWELL AVE</Address2>
        <City>NUTTER FORT</City>
        <State>WV</State>
        <Zip5>26301</Zip5>
        <Zip4>4322</Zip4>
        <DeliveryPoint>05</DeliveryPoint>
        <CarrierRoute>C025</CarrierRoute>
      </Address>
    </AddressValidateResponse>
    

    Other countries might have their own APIs. Other answers mention third-party APIs that support multiple countries, which might be useful in some cases.
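
    As a sketch of the "normalize, then compare" step, the sample response above could be reduced to a single comparison key by joining Zip5, Zip4 and DeliveryPoint. The function name `delivery_point_key` is hypothetical, and the code assumes the response has the shape shown in the sample; error responses are simply mapped to `None`.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def delivery_point_key(response_xml: str, address_id: str = "0") -> Optional[str]:
    """Build an 11-digit key (Zip5 + Zip4 + DeliveryPoint) from a verify response.

    Returns None if the address is missing or any part is not the
    expected number of digits.
    """
    root = ET.fromstring(response_xml)
    for address in root.findall("Address"):
        if address.get("ID") != address_id:
            continue
        parts = [
            address.findtext(tag, default="").strip()
            for tag in ("Zip5", "Zip4", "DeliveryPoint")
        ]
        lengths_ok = [len(p) for p in parts] == [5, 4, 2]
        if lengths_ok and all(p.isdigit() for p in parts):
            return "".join(parts)
        return None
    return None
```

    For the sample response this yields `26301432205`. Storing that key in a column with a unique constraint lets the database reject a second entry for the same delivery point.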

  • 2021-02-04 05:51

    To add an answer to my own question:

    A different way of doing it is to ask users for their mobile phone number and send them a text message for verification. This stops most people from messing with duplicate addresses.

    I'm talking from personal experience. (thanks pigsback !) They introduced confirmation through mobile phone. That stopped me having 2 accounts! :-)
