I\'m involved with a SQL / .NET project that will be searching through a list of names. I\'m looking for a way to return some results on similar first names of people. If se
Another commercial name matching database is: http://www.basistech.com/name-indexer/
It looks quite professional (though potentially expensive).
They claim to support the following languages:
Arabic, Chinese (Simplified), Chinese (Traditional), Persian (Farsi / Dari), English, Japanese, Korean, Pashto, Russian, Urdu
Here is a github repo with csv of related names, and you can contribute back:
The first few lines show the format:
aaron,ron
abel,abe
abednego,bedney
abijah,ab,bige
abigail,ab,abbie,abby,gail
abner,ab,abbie,abby
abraham,abe,abram,bram
absalom,ab,abbie,app
Similar format as Stan James's csv, but folded two ways for lookups: Name to nickname: https://github.com/MrCsabaToth/SOEMPI/blob/master/openempi/conf/name_to_nick.csv Nickname to name: https://github.com/MrCsabaToth/SOEMPI/blob/master/openempi/conf/nick_to_name.csv
To select similar sounding name use: (see MSDN)
SELECT SOUNDEX ('Tom')
A google search on "Database of Nicknames" turned up pdNickName (for pay).
In addition, I think you only need a single table for this job, not two, with NameID, Name, and MasterNameID. All the nicknames go into the Name column. One name is considered the "canonical" one. All the nickname records use the MasterNameID column to point back to that record, with the canonical name pointing to itself.
Your two table schema contains no additional information and, depending on how you fill in the nickname table, you might need extra code to handle the canonical cases.
There is a database out there called pdNicknames (found at http://www.peacockdata2.com/products/pdnickname/). It contains everything you need, at a cost of $500.