UUID primary key for JPA Entity: safe approach to use unique values on multiple instances


Solution 1

I'm not a Hibernate specialist, but in general this approach should be fine: the probability of a collision is very low. Low is not zero, though, so you can't be 100% sure that you're "guaranteed to avoid collisions". If your system can deal with a collision (for example, by generating a new UUID; the performance penalty is negligible because this will happen extremely rarely), then there is no problem.
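The regenerate-on-collision idea can be sketched roughly as follows. This is a minimal illustration, not Hibernate code: the class `CollisionSafeIds`, the method `insertWithRetry`, and the in-memory set standing in for the table's primary-key constraint are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical helper: the Set stands in for a table's primary-key constraint.
class CollisionSafeIds {
    private final Set<UUID> usedIds = new HashSet<>();

    // Draws ids from the supplier until one is accepted, up to maxAttempts.
    UUID insertWithRetry(Supplier<UUID> idSource, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            UUID candidate = idSource.get();
            // Set.add returns false if the id is already present (a "collision").
            if (usedIds.add(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No unique id after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        CollisionSafeIds store = new CollisionSafeIds();
        UUID id = store.insertWithRetry(UUID::randomUUID, 3);
        System.out.println("Stored row with id " + id);
    }
}
```

Against a real database the collision would typically surface as a unique-constraint violation on insert, and the retry would wrap that insert instead of a set lookup.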

The real question is whether anything else can be done here. Hibernate can generate the UUID in a way that also uses the machine's IP address as an input to the generation:

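// Imports assume Hibernate 5.x with javax.persistence
import java.util.UUID;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import org.hibernate.annotations.GenericGenerator;
import org.hibernate.annotations.Parameter;

@Entity
public class SampleEntity {

    @Id
    @GeneratedValue(generator = "UUID")
    @GenericGenerator(
        name = "UUID",
        strategy = "org.hibernate.id.UUIDGenerator",
        parameters = {
            @Parameter(
                name = "uuid_gen_strategy_class",
                value = "org.hibernate.id.uuid.CustomVersionOneStrategy"
            )
        }
    )
    @Column(name = "id", updatable = false, nullable = false)
    private UUID id;

    // …
}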

For more information read here for example

Of course, the best approach would be to let the DB handle ID generation. You haven't specified which database you use. PostgreSQL, for example, can generate UUID keys with the help of an extension.

Read here for example.

In general, using a UUID is not always a good idea: UUIDs are hard to deal with in day-to-day work, and they introduce an overhead that might be significant if the table has many rows. So you might consider an auto-increment sequence or something similar for the primary key; the DB can handle it and you won't need to bother.

Solution 2

UUID uuid = UUID.randomUUID();

My doubt is this: Is this approach safe? Can I be sure ids will always be unique?

Yes, extremely safe.

A Version 4 UUID has 122 bits of randomly generated data. That is a vast range of numbers. Assuming your UUID is generated with a cryptographically strong random number generator (which is what Java's UUID.randomUUID() is documented to use), you have no practical concerns with using a randomly generated UUID.
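To put a number on "vast", here is a rough sketch that applies the standard birthday-bound approximation, p ≈ 1 − exp(−n(n−1) / (2 · 2^122)), to n randomly generated Version 4 UUIDs. The class and method names are made up for the example, and the result assumes the 122 random bits are uniformly distributed.

```java
import java.util.UUID;

// Hypothetical helper for estimating Version 4 UUID collision odds.
class UuidCollisionOdds {
    // Random bits in a Version 4 UUID (128 minus 6 fixed version/variant bits).
    static final int RANDOM_BITS = 122;

    // Birthday-bound approximation of the chance that n ids contain a duplicate.
    static double collisionProbability(double n) {
        double pairs = n * (n - 1) / 2.0;
        double space = Math.pow(2, RANDOM_BITS);
        // expm1 keeps precision when the exponent is tiny.
        return -Math.expm1(-pairs / space);
    }

    public static void main(String[] args) {
        System.out.println("version     = " + UUID.randomUUID().version());
        System.out.println("p(1e9 ids)  = " + collisionProbability(1e9));
        System.out.println("p(1e12 ids) = " + collisionProbability(1e12));
    }
}
```

Under this approximation, a billion ids give a collision probability on the order of 10^-19, and it takes roughly 2.7 × 10^18 ids before the probability reaches one half.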

For details, see the Collisions section on Wikipedia.

If you want to worry, apply your worry to things that are much more likely to happen. Top of my mind: erroneously flipped bits in non-ECC memory. (See valid rant by Linus Torvalds on the issue.)

Personally, I consider the point-in-space-and-time versions such as Version 1 to be even less of a concern for collisions. But others debate this. Either way, Version 1 or Version 4, I would sleep well.

Despite saying the above, you should still ensure that your code is written to be robust in the face of collisions. Not because of collisions from randomly generated duplicates, but because of collisions from all-too-human causes, such as a bug in your code that double-posts the record to the database, or a DBA who mistakenly loads backup data twice, and so on.
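One way to picture that kind of robustness is an insert that reports a duplicate key instead of silently overwriting the existing row. The sketch below is an assumption-laden illustration: `IdempotentStore` and its methods are hypothetical, and the in-memory map only stands in for a table with a unique primary key.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a table with a unique primary key.
class IdempotentStore {
    private final Map<UUID, String> rows = new ConcurrentHashMap<>();

    // Returns true if the row was inserted, false if the id already existed.
    // On a duplicate, the existing row is left untouched.
    boolean insertIfAbsent(UUID id, String payload) {
        return rows.putIfAbsent(id, payload) == null;
    }

    String find(UUID id) {
        return rows.get(id);
    }

    public static void main(String[] args) {
        IdempotentStore store = new IdempotentStore();
        UUID id = UUID.randomUUID();
        System.out.println("first post:  " + store.insertIfAbsent(id, "order #1"));
        // Simulates a double-post of the same record.
        System.out.println("second post: " + store.insertIfAbsent(id, "order #1"));
    }
}
```

With a real database, the same role is played by the primary-key constraint plus code that handles the resulting constraint violation deliberately.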

Author by

Safari

Updated on June 14, 2022

Comments

  • Safari
    Safari almost 2 years

    I'm using SpringBoot, JPA and Hibernate.

    I have a doubt.

    For my entities I need a UUID as the primary key, and I would like to store this id as a readable string rather than as binary.

    I'm using this code:

    @Id
    @Column(name = "id")
    @Type(type = "uuid-char")
    private UUID uuid = UUID.randomUUID();
    

    My doubt is this: Is this approach safe? Can I be sure ids will always be unique?

    I understand that with this code the UUID is generated on the application side. So what happens if I have multiple instances of my service, all using the same DB?

    Is it possible for different instances to generate the same UUID?

    • chrylis -cautiouslyoptimistic-
      chrylis -cautiouslyoptimistic- about 3 years
      Note that assigning this as a field default means that your application will be consuming a huge amount of randomness for no reason (as most object instances are not about to be persisted); using @GeneratedValue is nearly always preferable.