Database of common name aliases / nicknames of people

31,281

Solution 1

A Google search for "Database of Nicknames" turned up pdNickName (a paid product).

In addition, I think you only need a single table for this job, not two, with NameID, Name, and MasterNameID. All the nicknames go into the Name column. One name is considered the "canonical" one. All the nickname records use the MasterNameID column to point back to that record, with the canonical name pointing to itself.

Your two-table schema carries no additional information and, depending on how you fill in the nickname table, you might need extra code to handle the canonical names.
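A minimal sketch of the single-table idea, using Python's built-in sqlite3 module so it runs anywhere. The column names NameID, Name, and MasterNameID come from the answer; the table name Names, the helper names, and the sample groups are assumptions for illustration only. A SQL Server version would differ mainly in the DDL.

```python
import sqlite3


def build_names_db(groups):
    """Create an in-memory Names table.

    `groups` is a list of name lists; the first entry of each list is
    treated as the canonical name (an assumption of this sketch).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Names ("
        " NameID INTEGER PRIMARY KEY,"
        " Name TEXT NOT NULL,"
        " MasterNameID INTEGER REFERENCES Names(NameID))"
    )
    for names in groups:
        # Insert the canonical name first, then make it point to itself.
        cur = conn.execute("INSERT INTO Names (Name) VALUES (?)", (names[0],))
        master_id = cur.lastrowid
        conn.execute(
            "UPDATE Names SET MasterNameID = ? WHERE NameID = ?",
            (master_id, master_id),
        )
        # Every nickname row points back to the canonical row.
        conn.executemany(
            "INSERT INTO Names (Name, MasterNameID) VALUES (?, ?)",
            [(nick, master_id) for nick in names[1:]],
        )
    return conn


def related_names(conn, name):
    """Return every name sharing a canonical record with `name`."""
    rows = conn.execute(
        "SELECT DISTINCT other.Name FROM Names AS hit"
        " JOIN Names AS other ON other.MasterNameID = hit.MasterNameID"
        " WHERE hit.Name = ? COLLATE NOCASE"
        " ORDER BY other.Name",
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_names_db([["Thomas", "Tom", "Thom"], ["Robert", "Bob", "Rob"]])
print(related_names(conn, "tom"))  # ['Thom', 'Thomas', 'Tom']
```

Because the canonical row points to itself, the same self-join works whether the search term is a nickname or the canonical name, which is the "no extra code" benefit described above.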

Solution 2

I'm adding another source for anyone who comes across this question via Google. This project provides a very good lookup for this purpose.

https://github.com/carltonnorthern/nickname-and-diminutive-names-lookup

It's somewhat simpler and less complete than pdNickName, but on the other hand it's free and easy to use.

Solution 3

I just found this site.

It looks like you could script it pretty easily.

http://www.behindthename.com/php/extra.php?terms=steve&extra=r&gender=m

I just wish I could automatically narrow the results to English.

Solution 4

Another commercial name-matching database is: http://www.basistech.com/name-indexer/

It looks quite professional (though potentially expensive).

They claim to support the following languages:
Arabic, Chinese (Simplified), Chinese (Traditional), Persian (Farsi / Dari), English, Japanese, Korean, Pashto, Russian, Urdu

Solution 5

Here is a GitHub repo with a CSV of related names, and you can contribute back:

The first few lines show the format:

aaron,ron
abel,abe
abednego,bedney
abijah,ab,bige
abigail,ab,abbie,abby,gail
abner,ab,abbie,abby
abraham,abe,abram,bram
absalom,ab,abbie,app
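A rough sketch of how rows in this format could be loaded into a two-way lookup. It assumes the first column of each row is the formal name and the remaining columns are its nicknames, which is inferred from the sample rows rather than from any documentation; the function names are hypothetical.

```python
import csv
from collections import defaultdict

# The sample rows quoted above, inlined so the sketch is self-contained.
SAMPLE = """\
aaron,ron
abel,abe
abednego,bedney
abijah,ab,bige
abigail,ab,abbie,abby,gail
abner,ab,abbie,abby
abraham,abe,abram,bram
absalom,ab,abbie,app
"""


def load_nicknames(lines):
    """Build formal->nicknames and nickname->formals maps from CSV rows."""
    formal_to_nick = defaultdict(set)
    nick_to_formal = defaultdict(set)
    for row in csv.reader(lines):
        names = [cell.strip().lower() for cell in row if cell.strip()]
        if len(names) < 2:
            continue  # skip blank or malformed rows
        formal, *nicks = names
        formal_to_nick[formal].update(nicks)
        for nick in nicks:
            nick_to_formal[nick].add(formal)
    return formal_to_nick, nick_to_formal


def related(name, formal_to_nick, nick_to_formal):
    """Return names linked to `name`, whether it is formal or a nickname."""
    name = name.strip().lower()
    result = set(formal_to_nick.get(name, ()))
    for formal in nick_to_formal.get(name, ()):
        result.add(formal)
        result |= formal_to_nick[formal]
    result.discard(name)
    return sorted(result)


formal_to_nick, nick_to_formal = load_nicknames(SAMPLE.splitlines())
print(related("abe", formal_to_nick, nick_to_formal))
# ['abel', 'abraham', 'abram', 'bram']
```

Note that a short nickname such as "ab" maps to several formal names, so a nickname search fans out across all of them; whether that is desirable depends on how loose the matching should be.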
Author: Tom Willwerth

I'm currently a SQL DBA in the cloud software industry. Previously I've been in Consultant, Data Architect, and SaaS Operations roles. I can be found contributing on various topics such as T-SQL & .NET development, Integration & Reporting services and more.

Updated on July 09, 2022

Comments

  • Tom Willwerth
    Tom Willwerth almost 2 years

    I'm involved with a SQL / .NET project that will search through a list of names. I'm looking for a way to return results for similar first names of people. If searching for "Tom", the results would include Thom, Thomas, etc. It is not important whether the source is a file or a web service. Example design:

    Table "Names" has Name and NameID
    Table "Nicknames" has Nickname, NicknameID and NameID
    

    Example output:

    You searched for "John Smith"
    You show results Jon Smith, Jonathan Smith, Johnny Smith, ...
    

    Are there any databases out there (public or paid) suited to this type of task to populate a relationship between nicknames and names?

  • Doug McClean
    Doug McClean over 14 years
    Soundex isn't really meant for first names. And beyond that, (SOUNDEX("Robert") = 'R163') != (SOUNDEX("Bob") = 'B100'), etc.
  • Tom Willwerth
    Tom Willwerth over 14 years
    Doug's point is critical here. Soundex works for Thom to Tom but not for Robert to Bob.
  • Tom Willwerth
    Tom Willwerth over 14 years
    How would you go about getting all the possible patterns? Take the Robert to Bob example: I can't use LIKE '%ob%' because that will match too many names.
  • Christopher Richa
    Christopher Richa over 14 years
    In that case you would need a separate table holding an ID for each nickname, to link the real names and nicknames together.
  • Nightwolf
    Nightwolf over 14 years
    Or Margaret to Peggy. A lookup is necessary.
  • Tom Willwerth
    Tom Willwerth over 14 years
    Yes, that is my question: is there a public source of data that I could use to populate the relation between name and nickname?
  • Dustin Laine
    Dustin Laine over 14 years
    Well, Robert to Bob is a good catch, but Margaret to Peggy? Come on, look at his question: he asked for "similar". How is that similar? And a down vote for it. I don't think that is justified, as my answer would work for his question.
  • Christopher Richa
    Christopher Richa over 14 years
    Well, I have found this database: peacockdata2.com/products/pdnickname. It is not free ($500), but the sample download includes an Excel sheet showing a sample of the database contents.
  • Tom Willwerth
    Tom Willwerth over 14 years
    @durilai I tried to clarify as soon as possible for you, and your answer made me think, so I'm not the one down voting you. By "similar" I did not mean "similar sounding"; I meant "the same" or "related".
  • Nightwolf
    Nightwolf over 14 years
    How about Edward/Ted/Theo? How about Henry/Hank? Richard/Dick? The point is that there are a lot of common nicknames that don't work by "sound", and the OP knows that so he asked for a database. If I knew of one, I would suggest one because we looked for the same thing last year.
  • Tom Willwerth
    Tom Willwerth over 14 years
    This link looks promising; you should make it a new answer.
  • Tom Willwerth
    Tom Willwerth over 14 years
    Also thanks to Christopher Richa for finding this product in the comment thread below.
  • Dustin Laine
    Dustin Laine over 14 years
    @Tom Willwerth, no worries. I just thought it fit your need before the update.
  • John Mellor
    John Mellor almost 12 years
    Interesting, and they offer their database for commercial licensing, or via a free (rate-limited) API. The name detail pages clearly distinguish variants, diminutives, alternate genders, and other languages; I don't know whether the API provides the same level of detail. They seem to have better international coverage than pdNickname, though the variants seem most comprehensive for European names.
  • cowsay
    cowsay over 7 years
    Thank you. Came across this question on Google 5 years later, just as you had planned for. :)
  • C8H10N4O2
    C8H10N4O2 about 7 years
    @JohnMellor The API documentation at your link states that the function to list synonyms for a name is "not currently available".
  • C8H10N4O2
    C8H10N4O2 about 7 years
    Some of these entries are pretty questionable. For example, AARON = ERIN and BILLY = FRED
  • Bill
    Bill almost 7 years
    I used this source recently and can attest to its usefulness. Based on git commit history, the names CSV file gets updated somewhat regularly (and of course you can't beat the price).
  • user3932000
    user3932000 almost 5 years
    Unfortunately, after 9 years, the link doesn't point to the database anymore.
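
The Soundex point raised in the comments can be checked with a small sketch. This implements the standard American Soundex rules in plain Python; it is not the database engine's own SOUNDEX function, which may differ in edge cases, but it reproduces the codes quoted in the comments.

```python
# Map each consonant to its Soundex digit; vowels, h, w, and y have no code.
CODES = {}
for digit, letters in {
    "1": "bfpv",
    "2": "cgjkqsxz",
    "3": "dt",
    "4": "l",
    "5": "mn",
    "6": "r",
}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(name):
    """Return the 4-character American Soundex code for `name`."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


for a, b in [("Thom", "Tom"), ("Robert", "Bob"), ("Margaret", "Peggy")]:
    print(a, soundex(a), b, soundex(b), soundex(a) == soundex(b))
```

Thom and Tom both encode to T500, while Robert (R163) and Bob (B100), or Margaret (M626) and Peggy (P200), share nothing, which is why a lookup table is needed for nicknames.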