How to create a worldwide unique GUID/UUID system for Mongo with Python?

Solution 1

If you want a unique id, and don't want to use ObjectId, you probably want to use uuid4:

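>>> import pymongo
>>> import uuid
>>> # MongoClient replaces the removed pymongo.Connection; PyMongo 4+ also needs an explicit UUID representation
>>> c = pymongo.MongoClient(uuidRepresentation='standard')
>>> uu = uuid.uuid4()
>>> uu
UUID('14a2aad7-fa01-40a4-8a80-04242b946ee4')
>>> c.test.uuidtest.insert_one({'_id': uu}).inserted_id
UUID('14a2aad7-fa01-40a4-8a80-04242b946ee4')
>>> c.test.uuidtest.find_one()
{'_id': UUID('14a2aad7-fa01-40a4-8a80-04242b946ee4')}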

Solution 2

import uuid
uuid.uuid1()

Source: http://docs.python.org/library/uuid.html
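For comparison, here is a minimal standard-library sketch of how uuid1 and uuid4 differ, and of how a creation time can be recovered from a uuid1 value (roughly what ObjectId's generation_time offers). The helper name uuid1_generation_time and the constant GREGORIAN_EPOCH are made up for this illustration; they are not part of the uuid module or of PyMongo.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 timestamps count 100-nanosecond intervals since this date (RFC 4122).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_generation_time(u):
    """Return the UTC creation time embedded in a version 1 UUID (hypothetical helper)."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs embed a timestamp")
    # u.time is in 100 ns units; integer-divide by 10 to get microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


u1 = uuid.uuid1()  # timestamp + clock sequence + node id (usually the MAC address)
u4 = uuid.uuid4()  # random bits only, nothing about the host or the time

print(u1.version, u4.version)
print(uuid1_generation_time(u1))
```

A uuid4 value carries no timestamp, so if you need a creation time alongside it, one option is to store it in a separate field.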

Author: zakdances

Updated on June 04, 2022

Comments

  • zakdances
    zakdances almost 2 years

    In the Mongo docs, it states the following:

    The _id field can be of any type; however, it must be unique. Thus you can use UUIDs in the _id field instead of BSON ObjectIds (BSON ObjectIds are slightly smaller; they need not be worldwide unique, just unique for a single db cluster). When using UUIDs, your application must generate the UUID itself. Ideally the UUID is then stored in the BSON type for efficiency – however you can also insert it as a hex string if you know space and speed will not be an issue for the use case.

    So that being the case, can someone walk me through how I can create a bullet-proof, worldwide unique GUID in BSON format for all my Mongo documents? I want to make sure that at no point will I have duplicate GUIDs, even across clusters. Does anyone have any experience with or ideas for best practices when it comes to Mongo and GUIDs? Would it be easier to use Mongo's native ID system, but check for duplicates before inserting and generate a new ObjectId if need be?

  • Bernie Hackett
    Bernie Hackett over 11 years
    PyMongo will automatically serialize UUIDs for you. You don't have to do anything special with them.
  • zakdances
    zakdances over 11 years
    How does PyMongo know it's a UUID and not just a regular string?
  • zakdances
    zakdances over 11 years
    This is really cool, but won't doing this just replace the ObjectID with a string (the UUID)? How does pymongo know that the UUID is a UUID and not just a regular string? Is there a way to preserve the other properties specific to ObjectID such as generation_time?
  • zakdances
    zakdances over 11 years
    Also, isn't uuid1 preferable to uuid4 in this context?
  • Bernie Hackett
    Bernie Hackett over 11 years
    PyMongo stores the UUID as BSON Binary with a specific subtype (3 or 4, see the docs in bson.binary for the details). MongoDB knows how to handle queries against these subtypes (and a few more). When PyMongo decodes the document it sees that the subtype for this binary blob is 3 or 4 and creates a UUID instance.
  • Bernie Hackett
    Bernie Hackett over 11 years
    I'm not totally sure if uuid1 is more desirable. I know that uuid1 uses the machine's network address, which may bother some people from a privacy perspective.
  • Mischa Arefiev
    Mischa Arefiev about 10 years
    Because you feed it a UUID instance.
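To make the size trade-off from the quoted docs concrete, here is a small standard-library sketch comparing the raw 16-byte form of a UUID with its string forms. It does not touch MongoDB; it only shows the payload sizes involved, on the assumption that the binary UUID field described in the comments above carries the 16 raw bytes.

```python
import uuid

uu = uuid.uuid4()

as_bytes = uu.bytes  # 16 raw bytes
as_hex = uu.hex      # 32 hex characters, no dashes
as_text = str(uu)    # 36 characters, with dashes

print(len(as_bytes), len(as_hex), len(as_text))  # prints: 16 32 36

# Every form converts back to an equal UUID object.
assert uuid.UUID(bytes=as_bytes) == uu
assert uuid.UUID(hex=as_hex) == uu
assert uuid.UUID(as_text) == uu
```

Storing the hex or dashed string works too, but it takes at least twice the space per value, and the driver will hand it back as a plain string rather than a UUID instance.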