Best way to handle large (UUID) as a MySQL table primary key

36,901

Solution 1

For most cases it is best to store UUIDs/GUIDs as BINARY(16). A UUID is 128 bits, so the 32-character hex string packs into exactly 16 bytes, half the size of a CHAR(32) column.

The conversion can (and probably should) be done in MySQL instead of PHP, for example with UNHEX() on the way in and HEX() on the way out, so whether you're using a 32-bit or 64-bit PHP client doesn't matter a bit (pun intended :P).
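
A minimal sketch of what that could look like, assuming a hypothetical `item` table (the table and column names are made up for illustration, and the second UUID is just a sample value). `UNHEX()`, `HEX()`, `REPLACE()` and `LOWER()` are standard MySQL functions:

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE item (
    id   BINARY(16)   NOT NULL PRIMARY KEY,  -- 128-bit UUID, packed into 16 bytes
    name VARCHAR(100) NOT NULL
);

-- UNHEX() packs the 32 hex characters into 16 bytes on the way in.
INSERT INTO item (id, name)
VALUES (UNHEX('a822ff2bff02461db45ddcd10a2de0c2'), 'example');

-- If the incoming UUID still contains dashes, strip them before packing.
INSERT INTO item (id, name)
VALUES (UNHEX(REPLACE('b933ff2b-ff02-461d-b45d-dcd10a2de0c3', '-', '')), 'dashed');

-- HEX() returns uppercase hex; LOWER() restores the original casing.
SELECT LOWER(HEX(id)) AS uuid, name
FROM item
WHERE id = UNHEX('a822ff2bff02461db45ddcd10a2de0c2');
```

Since the application only ever sends and receives the hex string, nothing on the PHP side has to fit the value into an integer.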

Solution 2

Use a string type, not an integer. Better is only better if it solves a problem.

If you're really concerned about lookup speed, use a synthetic (auto increment) primary key. You can place a unique constraint on the UUID column, and use it only once to look up the synthetic key which is subsequently used for your joins etc.
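
Roughly, that layout might look like the sketch below. The `customer` and `purchase` tables and their columns are hypothetical, chosen only to show the pattern:

```sql
-- Hypothetical schema; names are illustrative only.
CREATE TABLE customer (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- synthetic key
    uuid CHAR(32)     NOT NULL,                             -- external UUID, dashes stripped
    name VARCHAR(100) NOT NULL,
    UNIQUE KEY uq_customer_uuid (uuid)
);

CREATE TABLE purchase (
    id          INT UNSIGNED  NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer_id INT UNSIGNED  NOT NULL,                     -- joins use the small integer key
    total       DECIMAL(10,2) NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customer (id)
);

-- Look up the synthetic key once, via the unique index on uuid...
SET @customer_id = (
    SELECT id FROM customer
    WHERE uuid = 'a822ff2bff02461db45ddcd10a2de0c2'
);

-- ...then use only the integer key for everything that follows.
SELECT p.id, p.total
FROM purchase AS p
WHERE p.customer_id = @customer_id;
```

The trade-off is an extra column and index on `customer` in exchange for 4-byte keys in every referencing table.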

Author by

Aaron Murray

Computer technician by day, closet php developer by night.

Updated on March 03, 2020

Comments

  • Aaron Murray
    Aaron Murray about 4 years

    I have a UUID string that I want to use as my MySQL table's primary key. The UUID is a 32-character hexadecimal string (after the '-' characters are stripped). Because it is supposedly better to use a numeric column (int) as the primary key in a database, I would like to convert this to an integer, but I am not sure of the best way to handle it.

    1. Due to the size of the string (i.e. uuid='a822ff2bff02461db45ddcd10a2de0c2'), do I need to break this into multiple 'substrings'?
    2. I am running PHP on a 32-bit architecture at the moment, so converting it within PHP will not work due to the PHP_INT_MAX size (max 0x7FFFFFFF). And I suspect that would be the same restriction for MySQL.
    3. I do not like the idea of multiple primary keys as a fix for this, I would rather use a string representation even though that's not the preferred method.

    I might be thinking about this all wrong, and am not against reading documentation, so either examples or suggested reading as a response would be acceptable.

    • Aaron Murray
      Aaron Murray about 11 years
      Also to note, this id field would be used for both joins and selects.
  • Aaron Murray
    Aaron Murray about 11 years
    I have used strings in past projects, and I believe that in production we would only be talking about several million records at the high end, so I would not expect strings to be a large performance concern. I am not quite sure at what size the performance hit of strings versus numeric fields comes into play. At that scale (say 10 million records tops), I am not entirely sure a synthetic primary key would be advantageous.
  • Aaron Murray
    Aaron Murray about 11 years
    So what you are saying is that there is no easy way to pack/store this in a numeric representation, and my options are strings or strings with synthetic auto-increment keys?
  • PaulProgrammer
    PaulProgrammer about 11 years
    It's not a problem until it's a problem. Production-quality databases (MySQL included) have very clever indexing algorithms for handling string queries. If you're not concerned about performance yet, then why are you bashing your head against the max integer size wall in your original question?
  • PaulProgrammer
    PaulProgrammer about 11 years
    Sure, you can pack/store this as an integer, probably by using a multiple-field PK (which you said you don't like, not sure why). But why go to the trouble and incur the maintenance overhead going forward if it doesn't solve a real-world problem?
  • PaulProgrammer
    PaulProgrammer about 11 years
    I like the answer @Hazzit gives better. Use that.
  • Aaron Murray
    Aaron Murray about 11 years
    I didn't say that I wasn't concerned about performance; that was ultimately my biggest reason for the question. With a synthetic key (or multiple primary keys), my biggest concern is the additional programming overhead required on the back end. I want to build an optimized database structure from the beginning so that I don't have to worry about performance issues down the road.
  • Aaron Murray
    Aaron Murray about 11 years
    If 10M records with relatively simple queries and joins are not going to take a performance hit with a CHAR(32), then I'm good with that. If there is a better way, i.e. BINARY(16), then I would rather do it right the first time than fix the problem later.
  • Aaron Murray
    Aaron Murray about 11 years
    That sounds like exactly what I was looking for. The GUID/UUID is not generated by me; it is pulled from a third-party source. It is unique and serves well as a primary key. @PaulProgrammer I do appreciate your speedy replies too, and I have +1'ed you both. Thank you for the help in these answers.
  • nickdnk
    nickdnk over 9 years
    @Hazzit - do you know if it's legal to bind the UUID as a string with the bind_param method in MySQLi and just point it at a BINARY(16) field? I'm having trouble figuring out if I need to do any conversions first.