We are migrating a database from MySQL to MongoDB for performance reasons and considering what to use for IDs of the MongoDB documents. We are debating between using ObjectIDs,
We must be careful to distinguish the cost of MongoDB inserting a document from the cost of generating its _id in the first place, and to weigh both against the size of the payload. Below is a small matrix that crosses the method of generating the _id
against the size of an optional extra payload, in bytes. The tests use JavaScript only and were run against localhost on a MacBook Pro: 100,000 inserts using insertMany
in batches of 100, without transactions, to try to remove network latency, chattiness, and other factors. Two runs with batch = 1 were also done just to highlight the dramatic difference.
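The batching described above can be sketched roughly as follows. This is not the original test script: `timedBatchInsert`, `makeFakeCollection`, and the `payload` field name are hypothetical, and the sketch assumes a collection object whose `insertMany` is synchronous, as in the legacy shell (with the Node.js driver the call would need to be awaited). An in-memory stand-in replaces the real collection so the block runs without a server.

```javascript
// Insert `total` documents in batches of `batchSize` and return elapsed ms.
// makeId(i) produces the _id for document number i.
function timedBatchInsert(collection, total, batchSize, makeId, payloadBytes) {
  // Build the optional extra payload once so its cost is not measured per document.
  const payload = payloadBytes > 0 ? "x".repeat(payloadBytes) : null;
  const start = Date.now();
  for (let base = 0; base < total; base += batchSize) {
    const batch = [];
    const end = Math.min(base + batchSize, total);
    for (let i = base; i < end; i++) {
      const doc = { _id: makeId(i) };
      if (payload !== null) doc.payload = payload;
      batch.push(doc);
    }
    collection.insertMany(batch); // one call per batch
  }
  return Date.now() - start;
}

// Hypothetical in-memory stand-in for a collection; it only counts calls.
function makeFakeCollection() {
  return {
    docs: [],
    calls: 0,
    insertMany(batch) {
      this.calls += 1;
      this.docs.push(...batch);
    },
  };
}

const fake = makeFakeCollection();
const elapsedMs = timedBatchInsert(fake, 1000, 100, (i) => i, 512);
console.log(fake.calls + " calls, " + fake.docs.length + " docs, " + elapsedMs + " ms");
```

With batch = 1 the same loop makes one `insertMany` call per document, which is where the per-call overhead in the second table comes from.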
Method
A : Simple int: _id:0, _id:1, ...
B : ObjectId: _id:ObjectId("5e0e6a804888946fa61a1976"), ...
C : Simple string: _id:"A0", _id:"A1", ...
D : UUID-length string: _id:"9575edcc-cb70-4d63-97ed-ee5d624de87b0", ... (but not actually generated by UUID())
E : Real generated UUID: _id:UUID("35992974-21ea-4f61-b715-2dfaed663b73"), ... (stored as a UUID() object)
F : Real generated UUID: _id:"6b16f733-ff24-4172-83f9-e4f96ace6775", ... (stored as a string, e.g. UUID().toString().substr(6,36))
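As a rough illustration of the methods listed above, here is a sketch of generators for the variants that need nothing beyond plain JavaScript. Several things here are assumptions rather than the original test code: the names `idGenerators` and `uuidTextFromShellString` are hypothetical, Node's `crypto.randomUUID()` stands in for the shell's UUID() helper, and method D is guessed to be a fixed UUID-shaped prefix with the counter appended, which is what the sample value suggests. Methods B and E are omitted because they need the ObjectId and UUID types from the shell or a driver.

```javascript
const { randomUUID } = require("crypto");

// Assumption: D reuses one fixed UUID-shaped string and appends the counter.
const FIXED_UUID_PREFIX = "9575edcc-cb70-4d63-97ed-ee5d624de87b";

// Hypothetical generators keyed by the method letters used in the list.
const idGenerators = {
  A: (i) => i,                     // simple int: 0, 1, ...
  C: (i) => "A" + i,               // simple string: "A0", "A1", ...
  D: (i) => FIXED_UUID_PREFIX + i, // UUID-length string, nothing random generated
  F: () => randomUUID(),           // real random UUID kept as a 36-character string
};

// Mirrors UUID().toString().substr(6,36): assuming the string form is
// UUID("...."), skip the 6 characters of UUID(" and keep the next 36.
function uuidTextFromShellString(shellString) {
  return shellString.substr(6, 36);
}

console.log(idGenerators.A(1), idGenerators.C(1), idGenerators.D(0), idGenerators.F());
console.log(uuidTextFromShellString('UUID("6b16f733-ff24-4172-83f9-e4f96ace6775")'));
```

The `substr(6,36)` offsets only make sense if the string form really is `UUID("...")`; an environment whose `toString()` already returns the bare hex form would not need the slicing at all.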
Time in milliseconds to perform 100,000 inserts on a fresh (empty) collection. Extra payload sizes are in bytes.

Extra              M E T H O D   (Batch = 100)
Payload       A      B      C      D      E      F   % slower, A to F
--------  -----  -----  -----  -----  -----  -----  ----------------
None       2379   2386   2418   2492   3472   4267   80%
512        2934   2928   3048   3128   4151   4870   66%
1024       3249   3309   3375   3390   4847   5237   61%
2048       3953   3832   3987   4342   5448   5888   49%
4096       6299   6343   6199   6449   7634   8640   37%
8192       9716   9292   9397  10816  11212  11321   16%

Extra              M E T H O D   (Batch = 1)
Payload       A      B      C      D      E      F   % slower, A to F
--------  -----  -----  -----  -----  -----  -----  ----------------
None      48006  48419  49136  48757  50649  51280   6.8%
1024      50986  50894  49383  49373  51200  51821   1.2%
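The last column of each table compares method F with method A. A minimal sketch of that arithmetic, assuming the column is the extra time of F expressed as a percentage of A's time (the function name `percentSlower` is hypothetical), using the 512-byte row of the batch = 100 table:

```javascript
// Extra time of the candidate relative to the baseline, as a percentage.
function percentSlower(baselineMs, candidateMs) {
  return ((candidateMs - baselineMs) / baselineMs) * 100;
}

// 512-byte payload, batch = 100: A took 2934 ms and F took 4870 ms.
const pct512 = percentSlower(2934, 4870);
console.log(Math.round(pct512) + "%"); // prints "66%"
```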
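This was a quick test, but it seems clear that basic strings and ints as _id
are roughly the same speed, whereas actually generating a UUID adds time, especially if you take the string version of the UUID()
object, e.g. UUID().toString().substr(6,36). The relative penalty shrinks as the payload grows (from 80% with no payload to 16% at 8192 bytes) and is nearly lost in the per-call overhead when batch = 1.
It is also worth noting that constructing an ObjectId
appears to be just as quick as using a simple int or string.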
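To separate generation cost from insert cost, the generators can be timed on their own, with no database involved. The sketch below is an assumption-laden stand-in, not the original test: `timeGeneration` is a hypothetical helper and Node's `crypto.randomUUID()` replaces the shell's UUID(), so absolute numbers will differ from the tables above.

```javascript
const { randomUUID } = require("crypto");

// Call makeId `count` times and report elapsed milliseconds.
// `sink` accumulates the text length of each id so the work is actually used.
function timeGeneration(makeId, count) {
  const start = process.hrtime.bigint();
  let sink = 0;
  for (let i = 0; i < count; i++) {
    sink += String(makeId(i)).length;
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return { ms: Number(elapsedNs) / 1e6, sink };
}

const count = 100000;
const intRun = timeGeneration((i) => i, count);             // like method A
const uuidRun = timeGeneration(() => randomUUID(), count);  // like method F
console.log("int ids:  " + intRun.ms.toFixed(1) + " ms");
console.log("uuid ids: " + uuidRun.ms.toFixed(1) + " ms");
```

Comparing these figures with the insert timings gives a sense of how much of the A-to-F gap is generation rather than storage of the longer key.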