I'm looking for an answer addressing United States addresses
The issue in question is how to prevent users from entering duplicates like
Quellenstrasse 66/11
and
Quellenstr. 66a-11
This happens when you let the user enter the complete address in a single input box.
There are several methods you can use to prevent it.
1. Uniform formatting using RegEx
- You can prompt users to enter the details in a uniform format.
- A uniform format also makes querying very efficient.
- Test the user-entered value against some regular expressions and, if it fails, ask the user to correct it.
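
As a rough sketch of method 1, the check could look like the following. The format itself ("<house number> <street name> <suffix>[, Apt <unit>]"), the short suffix list and the name `validate_street_line` are assumptions made up for illustration; real US addresses need a much broader pattern.

```python
import re

# Hypothetical uniform format: "<house number> <street name> <suffix>[, Apt <unit>]".
# The suffix list is deliberately tiny; extend it for real use.
STREET_LINE = re.compile(
    r"^(?P<number>\d+)\s+"
    r"(?P<name>[A-Za-z0-9 ]+?)\s+"
    r"(?P<suffix>St|Ave|Blvd|Rd|Dr|Ln|Ct)"
    r"(?:,\s*Apt\s+(?P<unit>[A-Za-z0-9]+))?$"
)

def validate_street_line(text):
    """Return the parsed parts if text matches the uniform format, else None."""
    match = STREET_LINE.match(text.strip())
    return match.groupdict() if match else None
```

If the function returns None, show the expected format to the user and ask them to correct the input; otherwise store the parsed parts.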
2. Use a map API like Google Maps and ask the user to select the details from it.
- If you choose Google Maps, you can achieve this using Reverse Geocoding.
From the Google Developers guide:
The term geocoding generally refers to translating a human-readable address into a location on a map. The process of doing the opposite, translating a location on the map into a human-readable address, is known as reverse geocoding.
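
A minimal sketch of what a reverse geocoding request could look like, assuming the Google Geocoding web service with its `latlng` and `key` parameters and the `formatted_address` field in the JSON response. The helper names are hypothetical, the API key is a placeholder, and the actual HTTP call is left out so the snippet runs without network access.

```python
from urllib.parse import urlencode

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"

def reverse_geocode_url(lat, lng, api_key):
    """Build the request URL for the point the user picked on the map."""
    query = urlencode({"latlng": f"{lat},{lng}", "key": api_key})
    return f"{GEOCODE_ENDPOINT}?{query}"

def first_formatted_address(response):
    """Pull the first human-readable address out of the parsed JSON response."""
    results = response.get("results", [])
    return results[0]["formatted_address"] if results else None
```

Storing the address string returned by the service, instead of what the user typed, means every user who picks the same location gets the same spelling.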
3. Allow heterogeneous data as shown in the question and compare it using different formats.
- In the question, the OP allows addresses in different formats.
- In that case, you can convert the entered address into the different forms and check each one against the database to find a match.
- This may take more time, and the time depends entirely on the number of test cases.
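
One way to sketch method 3 is to reduce every variant to a single canonical key instead of generating each alternative spelling. This is a swapped-in simplification of "change it to different forms", and the abbreviation table and function names are assumptions for illustration only.

```python
import re

# Assumed abbreviation table; a real one would be much longer.
ABBREVIATIONS = {
    "st": "street",
    "ave": "avenue",
    "rd": "road",
    "apt": "apartment",
}

def normalize(address):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.sub(r"[^a-z0-9]+", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

def is_duplicate(new_address, existing_addresses):
    """True if the new address matches any existing one after normalization."""
    known = {normalize(address) for address in existing_addresses}
    return normalize(new_address) in known
```

Note that a plain lookup table cannot tell "St" (street) from "St" (saint), so this only catches the simpler variants.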
4. Split the address into different parts, store them separately in the DB, and provide a matching form to the user.
- That is, provide separate fields to store street, city, state, etc. in the database.
- Also provide separate input fields for the user to enter street, city, state, etc. in top-down order.
- When the user enters the state, narrow the duplicate query to that state only.
- When the user enters the city, narrow it to that city only.
- When the user enters the street, narrow it to that street.
And finally
- When the user enters the address, convert it to the different formats and test it against the database.
This is efficient: even though the number of test cases may be high, the number of entries you test against will be very small, so it will take very little time.
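
The narrowing in method 4 could be sketched like this with an in-memory SQLite database. The table layout, the sample rows and the `find_duplicates` helper are all hypothetical, and `normalize` is a simplified stand-in for whatever format comparison you actually use.

```python
import re
import sqlite3

def normalize(street):
    """Simplified stand-in: lowercase, drop punctuation, collapse spaces."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", street.lower()).split())

def find_duplicates(conn, state, city, street):
    """Compare the street only against rows in the same state and city."""
    rows = conn.execute(
        "SELECT id, street FROM addresses WHERE state = ? AND city = ?",
        (state, city),
    ).fetchall()
    target = normalize(street)
    return [row_id for row_id, stored in rows if normalize(stored) == target]

# Hypothetical schema and sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses ("
    "id INTEGER PRIMARY KEY, state TEXT, city TEXT, street TEXT)"
)
conn.executemany(
    "INSERT INTO addresses (state, city, street) VALUES (?, ?, ?)",
    [
        ("NY", "Albany", "12 Oak St."),
        ("NY", "Buffalo", "12 Oak St"),
        ("CA", "Fresno", "12 Oak St"),
    ],
)
```

The exact-match `WHERE` clause does the cheap filtering in the database, so the more expensive format comparison only runs on the handful of rows that share the same state and city.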