How do I find duplicate addresses in a database, or better yet, stop people from entering them when they fill in the form? I guess the earlier the better?
Is there any good way of abstra
Before you start searching for duplicate addresses in your database, you should first make sure you store the addresses in a standard format.
Most countries have a standard way of formatting addresses; in the US, it's the USPS CASS (Coding Accuracy Support System) program: http://www.usps.com/ncsc/addressservices/certprograms/cass.htm
Most other countries have a similar service or standard. This site lists more international formats: http://bitboost.com/ref/international-address-formats.html
This not only helps in finding duplicates, but also saves you money when mailing your customers (the postal service often charges less if the address is in a standard format).
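To illustrate why a standard format makes duplicates easy to spot, here is a minimal sketch in Python. It is not CASS and not a substitute for a certified address service: the abbreviation table and the function names (`normalize_address`, `find_duplicates`) are made up for this example, and it only handles a handful of English street words.

```python
import re

# Hypothetical, deliberately tiny abbreviation table (an assumption for this
# sketch); a real address service uses a far larger, official list.
ABBREVIATIONS = {
    "WEST": "W", "EAST": "E", "NORTH": "N", "SOUTH": "S",
    "STREET": "ST", "AVENUE": "AVE", "ROAD": "RD",
    "APARTMENT": "APT", "SUITE": "STE",
}

def normalize_address(raw: str) -> str:
    """Return a rough comparison key for an address (not a certified result)."""
    text = raw.upper()
    text = re.sub(r"[^A-Z0-9\s]", " ", text)  # replace punctuation with spaces
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)

def find_duplicates(addresses):
    """Group raw addresses by normalized key; keep groups with more than one entry."""
    groups = {}
    for raw in addresses:
        groups.setdefault(normalize_address(raw), []).append(raw)
    return {key: raws for key, raws in groups.items() if len(raws) > 1}
```

With this, "62 West 91st Street, Apartment 4D" and "62 w. 91st st. apt 4d" end up with the same key and are reported as one group. Spelled-out numbers such as "Ninety First", typos, and wrong ZIP codes are not handled here; that is the part a real standardization service does for you.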
Depending on your application, you might want to store a "vanity" address record as well as the standard address record. This keeps your VIP customers happy. A "vanity" address might be something like:
62 West Ninety First Street
Apartment 4D
Manhattan, New York, NY 10001
While the standard address might look like this:
62 W 91ST ST APT 4D
NEW YORK NY 10024-1414
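One way to keep both records and still block duplicates at form-submission time is to put a uniqueness constraint on the standard form only. The following is a rough sketch using Python's built-in `sqlite3` module; the table name, column names, and the `add_address` helper are assumptions for illustration, and it expects the standard address to have been produced already by whatever standardization service you use.

```python
import sqlite3

# In-memory database for the example; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_address (
        id INTEGER PRIMARY KEY,
        vanity_address TEXT NOT NULL,          -- shown back to the customer as entered
        standard_address TEXT NOT NULL UNIQUE  -- used for matching and mailing
    )
    """
)

def add_address(vanity: str, standard: str) -> bool:
    """Insert the pair; return False if the standard form is already stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customer_address (vanity_address, standard_address) "
                "VALUES (?, ?)",
                (vanity, standard),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint on standard_address rejected a duplicate.
        return False

standard = "62 W 91ST ST APT 4D\nNEW YORK NY 10024-1414"
first = add_address(
    "62 West Ninety First Street\nApartment 4D\nManhattan, New York, NY 10001",
    standard,
)
second = add_address("62 W. 91st St., Apt. 4D\nNew York, NY", standard)
```

Here `first` is `True` and `second` is `False`, so the form handler can tell the user the address is already on file while the original vanity record stays untouched.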