Inserting and selecting UUIDs as binary(16)

103,730

Solution 1

So, as a response to comments. The correct way to store a 36-char UUID as binary(16) is to perform the insert in a manner like:

INSERT INTO sometable (UUID) VALUES
       (UNHEX(REPLACE("3f06af63-a93c-11e4-9797-00505690773f", "-","")))

UNHEX because an UUID is already a hexed value. We trim (REPLACE) the dashes in the statement to bring the length down to 32 characters (our 16 bytes represented as HEX). You can do this at any point before storing it, obviously, so it doesn't have to be handled by the database.

You may retrieve the UUID like this:

SELECT HEX(UUID) FROM sometable;

Just in case someone comes across this thread and is unsure how this works.

And remember: If you're selecting a row using the UUID, use UNHEX() on the condition:

SELECT * FROM sometable WHERE UUID = UNHEX('3f06af63a93c11e4979700505690773f');

or literal notation (as mentioned by Alexis Wilke):

SELECT * FROM sometable WHERE UUID = 0x3f06af63a93c11e4979700505690773f;

And NOT HEX()on the column:

SELECT * FROM sometable WHERE HEX(UUID) = '3f06af63a93c11e4979700505690773f';

The last solution, while it works, requires that MySQL HEXes all UUIDs before it can determine which rows match. It's very inefficient.

Edit: If you're using MySQL 8 you should have a look at the UUID functions as mentioned in SlyDave's answer. This answer is still correct, but it doesn't optimise the UUID indexes which can be done natively using those functions. If you're on < MySQL 8 you can implement Devon's polyfill, which provides identical functionality on previous versions of MySQL.

Solution 2

As of MySQL 8 you can use two new UUID functions:

  • BIN_TO_UUID

    SELECT BIN_TO_UUID(uuid, true) AS uuid FROM foo;
    -- 3f06af63-a93c-11e4-9797-00505690773f
    
  • UUID_TO_BIN

    INSERT INTO foo (uuid) VALUES (UUID_TO_BIN('3f06af63-a93c-11e4-9797-00505690773f', true));
    

This method also supports rearranging the time component of the uuid to enhance indexing performance (by ordering it chronologically), simply set the second argument to true - this only works for UUID1.

If you are using the true on UUID_TO_BIN flag for indexing performance (recommended), you must also set it on BIN_TO_UUID otherwise it won't convert back properly.

See the documentation for further details.

Solution 3

Polyfill for BIN_TO_UUID and UUID_TO_BIN for MySQL 5 with the swap_flag parameter.

DELIMITER $$

CREATE FUNCTION BIN_TO_UUID(b BINARY(16), f BOOLEAN)
RETURNS CHAR(36)
DETERMINISTIC
BEGIN
   DECLARE hexStr CHAR(32);
   SET hexStr = HEX(b);
   RETURN LOWER(CONCAT(
        IF(f,SUBSTR(hexStr, 9, 8),SUBSTR(hexStr, 1, 8)), '-',
        IF(f,SUBSTR(hexStr, 5, 4),SUBSTR(hexStr, 9, 4)), '-',
        IF(f,SUBSTR(hexStr, 1, 4),SUBSTR(hexStr, 13, 4)), '-',
        SUBSTR(hexStr, 17, 4), '-',
        SUBSTR(hexStr, 21)
    ));
END$$


CREATE FUNCTION UUID_TO_BIN(uuid CHAR(36), f BOOLEAN)
RETURNS BINARY(16)
DETERMINISTIC
BEGIN
  RETURN UNHEX(CONCAT(
  IF(f,SUBSTRING(uuid, 15, 4),SUBSTRING(uuid, 1, 8)),
  SUBSTRING(uuid, 10, 4),
  IF(f,SUBSTRING(uuid, 1, 8),SUBSTRING(uuid, 15, 4)),
  SUBSTRING(uuid, 20, 4),
  SUBSTRING(uuid, 25))
  );
END$$

DELIMITER ;

--
-- Tests to demonstrate that it works correctly. These are the values taken from
-- https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin
--
-- If you run these SELECTs using the above functions, the 
-- output of the two columns should be exactly identical in all four cases.
SET @uuid = '6ccd780c-baba-1026-9564-5b8c656024db';
SELECT HEX(UUID_TO_BIN(@uuid, 0)), '6CCD780CBABA102695645B8C656024DB';
SELECT HEX(UUID_TO_BIN(@uuid, 1)), '1026BABA6CCD780C95645B8C656024DB';
SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,0),0), '6ccd780c-baba-1026-9564-5b8c656024db';
SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,1),1), '6ccd780c-baba-1026-9564-5b8c656024db';

Included are the SELECT samples from https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin that demonstrate that the above code returns the exact same results as the 8.0 function. These functions are considered DETERMINISTIC as they always produce the same output for a given input. See https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html

Solution 4

I am using MariaDB so BIN_TO_UUID functions family do not exist. I managed to get the corresponding values anyway.

bin -> hex

Here, uuid is the binary(16) value of an uuid; you'll use the value below to SELECT a readable version of it.

LOWER(CONCAT(
    SUBSTR(HEX(uuid), 1, 8), '-',
    SUBSTR(HEX(uuid), 9, 4), '-',
    SUBSTR(HEX(uuid), 13, 4), '-',
    SUBSTR(HEX(uuid), 17, 4), '-',
    SUBSTR(HEX(uuid), 21)
))

hex -> bin

Here, cc6e6d97-5501-11e7-b2cb-ceedca613421 is a readable version of an UUID, and you'll use the value below in a WHERE clause to seek for it.

UNHEX(REPLACE('cc6e6d97-5501-11e7-b2cb-ceedca613421', '-', ''))

Cheers

Solution 5

The other answers are correct. The UUID() function returns a 36 character string and that needs to be converted using the shown functions (UNHEX() or, on newer platforms, UUID_TO_BIN()).

However, if you use your own software to create your UUIDs, then you can use the Hexadecimal Literal notation instead.

So I would use the following with the MySQL UUID() function:

INSERT INTO sometable (id) VALUES (UNHEX(REPLACE(UUID(), '-', '')));  -- all versions
INSERT INTO sometable (id) VALUES (UUID_TO_BIN(UUID());               -- since v8.0

But use this in case I generate my own UUIDs;

INSERT INTO sometable (id) VALUES 0x3f06af63a93c11e4979700505690773f;

Similarly, you can use Hexadecimal Literals in your WHERE clauses:

SELECT * FROM sometable WHERE id = 0x3f06af63a93c11e4979700505690773f;

This will be faster if you do not have to convert your data to a UUID string each time.

Note: the 'x' in '0xaBc is case sensitive. The hexadecimal digits are not, however.

Share:
103,730
nickdnk
Author by

nickdnk

Updated on July 22, 2022

Comments

  • nickdnk
    nickdnk over 1 year

    I don't understand why

    SELECT UUID();
    

    Returns something like:

    3f06af63-a93c-11e4-9797-00505690773f
    

    But if I insert it into a binary(16) field (the UUID() function) with for instance a BEFORE INSERT trigger and run a select, it returns something like:

    0782ef48-a439-11
    

    Note that these two UUIDs are not the same data.

    I realize binary and an UUID string doesn't look identical, but shouldn't the selected data at least be just as long? Otherwise how can it possibly be equally likely to be unique?

    Is it better to store it as char(36)? I just need it to be unique to prevent duplicate inserts. It is never selected or used for joins.

    EDIT:

    before trigger would be like:

    BEGIN
    
    if NEW.UUID IS NULL THEN
    
    NEW.UUID = UUID();
    
    END IF
    
    END