Given the task of storing international geographic addresses in a relational table, what is the most flexible schema? Should every part of the address be broken out into the
That depends on what you want to do with it.
I've always found it easier to use addresses for other purposes (such as verification against USPS data or getting shipping rates from UPS/FedEx) if they're separated.
Here's what I typically use for addresses:
In response to the edit: For most situations I don't see the use. The table I listed above has enough fields (and is generic enough) for most countries' addresses.
Comment on Ben Alabaster's answer: To format addresses based on country, you could use a formatting table that has the ordering of the columns for each country as separate rows.
The field order can be coded to use complex grid layouts also.
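A minimal sketch of such a formatting table, using Python's built-in sqlite3 module. The table name address_format, its columns (line_no, position, field_name), the field names, and the two sample country layouts are all assumptions made for illustration; they are simplified and not taken from any postal standard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address_format (
    country_code TEXT NOT NULL,
    line_no      INTEGER NOT NULL,  -- output line the field is printed on
    position     INTEGER NOT NULL,  -- order of the field within that line
    field_name   TEXT NOT NULL,     -- key of the address record to print
    PRIMARY KEY (country_code, line_no, position)
);
-- Hypothetical, simplified layouts: one row per field per country.
INSERT INTO address_format VALUES
    ('US', 1, 1, 'street_address'),
    ('US', 2, 1, 'city'),
    ('US', 2, 2, 'province'),
    ('US', 2, 3, 'postal_code'),
    ('DE', 1, 1, 'street_address'),
    ('DE', 2, 1, 'postal_code'),
    ('DE', 2, 2, 'city');
""")


def format_address(conn, country_code, address):
    """Lay out an address dict using the field order stored for the country."""
    rows = conn.execute(
        "SELECT line_no, field_name FROM address_format "
        "WHERE country_code = ? ORDER BY line_no, position",
        (country_code,),
    ).fetchall()
    lines = {}
    for line_no, field_name in rows:
        value = address.get(field_name)
        if value:  # skip fields the address does not have
            lines.setdefault(line_no, []).append(value)
    return "\n".join(" ".join(parts) for _, parts in sorted(lines.items()))
```

The line_no and position pair is what would let the same rows drive a grid layout in a form; adding a country means adding rows, not changing the address table.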
There is no point in separating addresses by country. This will become chaotic as the number of countries increases, and you will land in trouble if you want to find all the addresses of, say, an international client. Having an Address Type as suggested by Ben could also lead to ambiguities when you have an address that has both a building number and an apartment number. I could be in an apartment complex where each building has a different name. This is very common in India.
Be careful not to over-analyze address formats. When you do, you're quite likely to end up with a specification most users will need to work around, effectively forcing them to use the wrong fields, or only filling the primary fields and ignoring the extra fields.
Keep things simple.
A StreetType field like the one BenAlabaster mentions will cause problems once you start working with languages other than isolating languages like English or Spanish.
To show you how bad things can get in the wild: the "Henriette Roland Holststraat" in Amsterdam, built up from "Henriette" + "Roland Holst" + "straat", which can be abbreviated as the "Roland Holststraat", or "Roland Holststr.", or misspelled as "H.R.Holststr." or "Henriette Roland-Holst straat", depending on the weather. Unless you've got an up-to-date street register for each country on earth, you'll be going nowhere.
And finally, be aware that in some multilingual countries, names can differ from one language to another! For instance in Brussels, many streets have both a French and a Dutch name: "Avenue du Port" and "Havenlaan", depending on the addressee's preferred language. (Google Maps shows both names alternately, just to be on the safe side.)
You can try to devise all kinds of clever tricks here, but are the sales reps going to understand this?
I use https://github.com/commerceguys/addressing library to format international addresses and they use these elements:
Country
Administrative area
Locality (City)
Dependent Locality (in: BR, CN, IR, MY, MX, NZ, PH, KR, ZA, TH)
Postal code
Sorting code
Address line 1
Address line 2
Organization
Recipient
This doesn't help if you want to parse the street (name, house number, ...).
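As a rough illustration of that element list, here is a plain Python dataclass with one field per element. The class name PostalAddress and the field names are my own; this is not the API of the commerceguys/addressing library (which is PHP), only a sketch of the shape of the data.

```python
from dataclasses import dataclass


@dataclass
class PostalAddress:
    """Hypothetical record mirroring the element list above."""
    country_code: str
    administrative_area: str = ""
    locality: str = ""            # city
    dependent_locality: str = ""  # only used by some countries
    postal_code: str = ""
    sorting_code: str = ""
    address_line1: str = ""       # street and house number stay together
    address_line2: str = ""
    organization: str = ""
    recipient: str = ""


# Illustrative values only; the house number is made up.
addr = PostalAddress(
    country_code="NL",
    locality="Amsterdam",
    address_line1="Henriette Roland Holststraat 1",
)
```

Note that the street name and house number live in a single free-text line, which is why this structure cannot tell you which part is which.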
Btw. if you are looking for a multilanguage country list: https://github.com/umpirsky/country-list
The only way is to split them into:
Name varchar,
Title varchar,
StreetAddress varchar,
StreetAddressLine2 varchar,
zipCode varchar,
City varchar,
Province varchar,
Country lookup
since almost every country has its own standard for address data, and every country has a different postal code format.
You can find a small sample of the problems in my post on a similar question.
It does not make sense to separate addresses by country, since a single country can have several address conventions. A common one is that small villages have no street names, only a village name and a house number, while addresses in larger cities do include streets. I have learned that in Budapest, Hungary's capital, a few streets share the same name (you distinguish them by the district number), while other cities have no such addresses (someone from Hungary may be able to confirm whether this is true). So the total number of address formats would be number_of_countries multiplied by the number of address formats within each country. It can be done with different tables, but it would be horrible work.
As a polar opposite to the excellent answer @BenAlabaster has provided, you could simply have:
address TEXT(300)
postal_code VARCHAR(15)
country_code VARCHAR(2)
Your client-side form layouts can still be as complex as you see fit (or use a multi-line input where the user can manually type their address). You can then add the line breaks in the address where necessary.
Your country table would look as follows:
country_code VARCHAR(2)
country_name VARCHAR(255)
Additionally, you could have one of the following:
postal_code_required TINYINT(1)
postal_code_regex VARCHAR(255) NULL DEFAULT NULL
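A minimal sketch of how these columns could fit together, using Python's built-in sqlite3 module. It includes both optional columns for illustration even though either one may be enough. The id column, the helper postal_code_ok, the two sample countries, and the US pattern are assumptions added for the example.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (
    country_code         VARCHAR(2) PRIMARY KEY,
    country_name         VARCHAR(255) NOT NULL,
    postal_code_required TINYINT(1) NOT NULL DEFAULT 0,
    postal_code_regex    VARCHAR(255) DEFAULT NULL
);
CREATE TABLE address (
    id           INTEGER PRIMARY KEY,  -- surrogate key, an assumption
    address      TEXT NOT NULL,        -- free text, line breaks allowed
    postal_code  VARCHAR(15),
    country_code VARCHAR(2) NOT NULL REFERENCES country(country_code)
);
""")
# Sample rows; the patterns are illustrative, not an authoritative list.
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?, ?)",
    [
        ("US", "United States", 1, r"^\d{5}(-\d{4})?$"),
        ("HK", "Hong Kong", 0, None),  # no postal codes in use
    ],
)


def postal_code_ok(conn, country_code, postal_code):
    """Check a postal code against the rules stored for its country."""
    row = conn.execute(
        "SELECT postal_code_required, postal_code_regex "
        "FROM country WHERE country_code = ?",
        (country_code,),
    ).fetchone()
    if row is None:
        return False  # unknown country code
    required, pattern = row
    if not postal_code:
        return not required  # an empty code is fine only when optional
    if pattern is None:
        return True  # no pattern stored, accept anything
    return re.fullmatch(pattern, postal_code) is not None
```

Validation stays data-driven this way: supporting another country means inserting a row, and a country with no stored pattern simply accepts any postal code.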
Then use the following lists to design your country table: