Is Mysql UUID_SHORT() comparable to UUID()

12,941

Solution 1

uuid_short() produces a bitwise conglomeration of the server ID, a fairly static time component, and a sequentially increasing 24 bit integer. These bits are stuffed into an 8 byte integer. The time component is based on the server's boot time.

uuid() produces hex string that represents a 16 byte version1 UUID. Version 1 UUIDs are a bitwise conglomeration of the server ID, the current timestamp, a few bytes that come into play if you generation IDs at hyperspeed, and a few utility bits.

To answer your question: does uuid_short provide time and space uniqueness that rivals uuid? The answer is no. Case in point, the server ID in a uuid_short is only one byte. So if you have 256 or more servers, at least a few of them will have the same node id, which means you lose space uniqueness. For comparison, the server ID in version 1 UUID is 6 bytes long, effectively killing the chance of duplicates for all but the largest of corporate server farms :)

A better question is whether uuid_short is good enough. You could see ID collisions if you :

  1. Generate more than 16 million IDS from the same server in little time. ***
  2. Boot servers with the same server ID all at exactly the same time, and share data between them.
  3. Fiddle with the system clock and then reboot your server.

The second issue seems unlikely for most people, but the first one is worth investigating before you commit to making uuid_short the basis of your keys.

*** Based on the mysql docs for uuid_short, it seems like you would see collisions if you generated more than 16 million IDs during the uptime of a single server. But that would be silly. The mysql docs go on to say that you are fine as long as you don't generate 16 million IDs per second. That implies that they must bump some of the bits in timestamp if you exhaust the 16 million sequential IDs. I have not tested this.

Solution 2

Your key question was, does UUID_SHORT() create values that are unique within time and space as with UUID(). The short answer is yes, as long as you obey the special conditions MySQL requires.

The long answer is yes, but why would you want to use it? The only apparent downsides to UUID() is its representation is less storage-efficient (generates a 36-character string rather than a 64-bit integer), and can't be used with statement-based replication. But UUID() has the big upside of never having to think about the special conditions MySQL requires for UUID_SHORT(). If you're certain the conditions will never be a problem for you, and you're eager to save all of 224 bits per record, UUID_SHORT()is OK to use. But if you have any concerns about the special conditions, then it's probably best to avoid it.

The degree of concern you would have about the special conditions depends a lot on your operational environment. The requirement to never set the system clock backwards between mysqld restarts is a big concern for me. Servers are often configured to have their clocks auto-synched with some other time source (e.g. ntp in unix, Time Service in Windows), and if this behavior isn't carried out to your expectations, then you may not be able to guarantee that condition is met consistently.


Share:
12,941

Related videos on Youtube

jiraiya
Author by

jiraiya

database and software developer.

Updated on September 14, 2022

Comments

  • jiraiya
    jiraiya over 1 year

    A quick question or opinion if you will.

    I need to generate some UUID's for a database table.

    Auto incrementing keys won't cut it as I need the key to be unique across databases and systems also. UUID works fine however its output is too long for some of the systems that the rows will be exported to. UUID_SHORT() does the job fine and I have read MYSQL's conditions on guaranteeing its uniqueness.

    ButI just want to double check that if I use UUID_SHORT() to generate UUID's for rows from time to time that they will indeed be unique in time and space as with UUID().

    Cheers.

    • N.B.
      N.B. over 11 years
      If you create compound primary key such as PRIMARY KEY(id, server_id) and then change the default value of server_id to any integer number on the machines / systems in question - you can achieve uniqueness without using GUIDs and still retain auto_increments.