Is it bad to change _id type in MongoDB to integer?

25,763

Solution 1

No, it isn't bad at all. In fact, the built-in ObjectId is fairly large within the index (12 bytes), so if you believe you have something better, you are welcome to use whatever value you like for the _id field instead of the default.

But, and this is a big but, there are some considerations when deciding to move away from the default ObjectId, especially when using auto-incrementing _ids as shown here: https://docs.mongodb.com/v3.0/tutorial/create-an-auto-incrementing-field

Multithreading isn't such a big problem, because findAndModify is atomic and can take care of that, but it leads straight to your first problem: findAndModify is neither the fastest nor the lightest function, and significant performance drops have been noticed when it is used regularly.

You also have to consider the overhead of doing this yourself anyway, even without findAndModify. For every insert you will need an extra query. Imagine having a unique id whose uniqueness you have to check with a query every time you want to insert. Eventually your insert rate will drop to a crawl and your lock time will build up.
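To make that extra round trip concrete, here is a minimal sketch of the counter pattern in Python. It assumes `counters` and `collection` are pymongo-style collection objects; the function names and the `seq` field are my own choices for illustration, not anything MongoDB prescribes.

```python
def next_sequence(counters, name):
    """Atomically reserve the next integer for the counter called `name`."""
    # This is the extra query paid on every insert: one atomic
    # increment-and-fetch against a separate counters collection.
    previous = counters.find_one_and_update(
        {"_id": name},
        {"$inc": {"seq": 1}},
        upsert=True,  # create the counter document on first use
    )
    # By default the document as it was *before* the update comes back,
    # which is None the very first time the counter is used.
    return (previous["seq"] if previous else 0) + 1


def insert_with_integer_id(collection, counters, document):
    """Insert a copy of `document` with the next integer as its _id."""
    doc = dict(document, _id=next_sequence(counters, collection.name))
    collection.insert_one(doc)
    return doc["_id"]
```

With pymongo you would call it as something like `insert_with_integer_id(db.users, db.counters, {"name": "x"})`, where `db.users` and `db.counters` are hypothetical collection names. Every insert now costs two operations instead of one, and all writers contend on the same counter document.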

Of course the ObjectId is really good at being unique without having to check or formulate its own uniqueness by touching the database prior to insertion, hence it doesn't have this overhead.
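For comparison, here is a rough sketch of why an ObjectId needs no round trip. It follows the documented 12-byte layout (4-byte timestamp, 5-byte per-process random value, 3-byte counter), but it is only an illustration in plain Python, not the driver's actual implementation, and the names are made up.

```python
import itertools
import os
import struct
import time

# Chosen once per process, as in the ObjectId layout.
_PROCESS_RANDOM = os.urandom(5)
# 3-byte counter that starts at a random value and increments per id.
_COUNTER = itertools.count(int.from_bytes(os.urandom(3), "big"))


def new_object_id_bytes():
    """Build a 12-byte ObjectId-like value without contacting any server."""
    timestamp = struct.pack(">I", int(time.time()) & 0xFFFFFFFF)
    counter = (next(_COUNTER) & 0xFFFFFF).to_bytes(3, "big")
    return timestamp + _PROCESS_RANDOM + counter


if __name__ == "__main__":
    print(new_object_id_bytes().hex())
```

Everything needed is available on the client, so uniqueness never has to be checked against the database. The price is size: 12 bytes per id, versus 8 for a 64-bit integer or 4 for a 32-bit one.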

If you still feel an integer _id suits your scenario, then go for it, but bear in mind the overhead described above.

Solution 2

You can do it, but you are responsible for making sure that the integers are unique.

Unlike most SQL databases, MongoDB doesn't support auto-increment fields. When you have a distributed or multithreaded application in which multiple processes and/or threads create new database entries, you have to make sure that they all use the same counter. Otherwise two threads could try to store a document with the same _id in the database.

When that happens, one of them will fail. That means you have to wait for the database to return a success or error (by calling getLastError or by setting the write concern to acknowledged), which takes longer than just sending data in a fire-and-forget manner.

Solution 3

I had a use case for this: replacing _id with a 64 bit integer that represented a simhash of a document index for searching.

Since I intended to "get or create", providing the initial simhash and creating a new record if one didn't exist was perfect. Also, for anyone Googling, MongoDB support explained to me that simhashes are absolutely perfect for sharding and scaling, and even better than the more generic ObjectId, because they divide the data across shards evenly and intrinsically. You also get the key stored at no extra cost in space (a uint64 is much smaller than an ObjectId and would need to be stored anyway).

Also, for you Googlers, replacing a MongoDB _id with something other than an ObjectId is absolutely simple: just create the document with its _id already defined; use an integer if you like. That's it: Mongo will simply use it. If you try to create a document with an _id that already exists, you'll get an error (E11000 duplicate key). So if, like me, you're using simhashing, this is ideal in all respects.
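A minimal sketch of that "get or create" flow in Python, assuming `collection` is a pymongo-style collection. One caveat that is my addition rather than part of the answer: as far as I know BSON's 64-bit integer is signed, so an unsigned simhash of 2^63 or more has to be mapped into the signed range first. The helper names here are hypothetical.

```python
def to_signed_int64(value):
    """Map an unsigned 64-bit integer onto the signed 64-bit range."""
    if not 0 <= value < 2**64:
        raise ValueError("expected an unsigned 64-bit integer")
    # Two's-complement reinterpretation: the top half becomes negative.
    return value - 2**64 if value >= 2**63 else value


def get_or_create(collection, simhash, fields):
    """Ensure a document keyed by `simhash` exists.

    Returns (stored _id, True if this call created the document).
    """
    key = to_signed_int64(simhash)
    result = collection.update_one(
        {"_id": key},
        {"$setOnInsert": fields},  # only applied when the upsert inserts
        upsert=True,
    )
    return key, result.upserted_id is not None
```

The upsert avoids a separate existence check: the _id index does the uniqueness work, and `$setOnInsert` leaves an existing document untouched. A plain `insert_one` with the same `_id` would instead fail with the E11000 error mentioned above.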

Author by

just so

Updated on August 23, 2020

Comments

  • just so
    just so over 3 years

    MongoDB uses ObjectId type for _id.

    Will it be bad if I make _id an incrementing integer?

    (With this gem, if you're interested)