Using UUIDs instead of ObjectIDs in MongoDB

误落风尘 2021-01-30 03:32

We are migrating a database from MySQL to MongoDB for performance reasons and considering what to use for IDs of the MongoDB documents. We are debating between using ObjectIDs,

5 Answers
  •  北海茫月
    2021-01-30 04:34

    We must be careful to distinguish the cost of MongoDB inserting a document from the cost of generating the _id in the first place, and to weigh both against the size of the payload. Below is a small matrix that crosses the method of generating the _id against the size, in bytes, of an optional extra payload. The tests use JavaScript only and were run against localhost on a MacBook Pro: 100,000 inserts via insertMany in batches of 100, without transactions, to try to remove network latency, chattiness, and other factors. Two runs with batch = 1 were also done, just to highlight the dramatic difference.

    
    Method                                                                                         
    A  :  Simple int:          _id:0, _id:1, ...                                                   
    B  :  ObjectId             _id:ObjectId("5e0e6a804888946fa61a1976"), ...                       
    C  :  Simple string:       _id:"A0", _id:"A1", ...                                             
    
    D  :  UUID length string   _id:"9575edcc-cb70-4d63-97ed-ee5d624de87b0", ...                    
          (but not actually
          generated by UUID())
    
    E  :  Real generated UUID  _id: UUID("35992974-21ea-4f61-b715-2dfaed663b73"), ...              
          (stored UUID() object)                                                                   
    
    F  :  Real generated UUID  _id: "6b16f733-ff24-4172-83f9-e4f96ace6775"                         
          (stored as string, e.g.
          UUID().toString().substr(6,36))
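
The answer does not show how each _id was produced, so the following is only a sketch of Node.js stand-ins for methods A to F, assuming Node 14.17 or later for crypto.randomUUID(). The names generators, objectIdLikeHex, and timeGeneration are hypothetical. B imitates the 12-byte ObjectId layout rather than calling the real ObjectId(), E keeps the UUID as 16 raw bytes in place of the shell's UUID() object, and F rebuilds the shell's UUID("...") string form to show why substr(6,36) yields the bare UUID. Timing generation on its own is one way to separate the cost of making the _id from the cost of inserting it.

```javascript
// Hypothetical Node.js stand-ins for the six _id methods (A-F) listed above.
const crypto = require('crypto');

// B: a stand-in with ObjectId's 12-byte layout (timestamp + random + counter),
// hex-encoded. Real code would call ObjectId() from the shell or the driver.
let counter = crypto.randomBytes(3).readUIntBE(0, 3);
const processRandom = crypto.randomBytes(5);
function objectIdLikeHex() {
  const buf = Buffer.alloc(12);
  buf.writeUInt32BE(Math.floor(Date.now() / 1000), 0); // 4-byte timestamp (seconds)
  processRandom.copy(buf, 4);                          // 5-byte per-process random value
  counter = (counter + 1) % 0x1000000;                 // 3-byte rolling counter
  buf.writeUIntBE(counter, 9, 3);
  return buf.toString('hex');
}

const generators = {
  A: (i) => i,                      // simple int
  B: () => objectIdLikeHex(),       // ObjectId-like value (assumption, see above)
  C: (i) => 'A' + i,                // simple string
  // D: UUID-shaped string of fixed length that is not a generated UUID
  D: (i) => '9575edcc-cb70-4d63-97ed-' + String(i).padStart(12, '0'),
  // E: generated UUID kept in binary form (16 bytes)
  E: () => Buffer.from(crypto.randomUUID().replace(/-/g, ''), 'hex'),
  // F: generated UUID as a string. The shell prints UUID("<36 chars>"), so
  // dropping the 6-character prefix and keeping 36 characters leaves the UUID;
  // slice(6, 42) selects the same range as substr(6, 36).
  F: () => ('UUID("' + crypto.randomUUID() + '")').slice(6, 42),
};

// Time only the generation of `count` ids, with no database involved.
function timeGeneration(method, count) {
  const gen = generators[method];
  const start = process.hrtime.bigint();
  let last;
  for (let i = 0; i < count; i++) last = gen(i);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { method, ms, sample: last };
}

for (const method of Object.keys(generators)) {
  const { ms, sample } = timeGeneration(method, 100000);
  console.log(method, ms.toFixed(1) + ' ms', sample);
}
```

The absolute numbers from such a run will not match the table, since no inserts happen; it only indicates which methods are expensive to generate.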
    
    Time in milliseconds to perform 100,000 inserts on fresh (empty) collection.
    
    Extra                  M E T H O D   (Batch = 100)
    Payload
    (bytes)       A      B      C      D      E      F   % slower, A to F
    --------  -----  -----  -----  -----  -----  -----   ----------------
    None       2379   2386   2418   2492   3472   4267   80%
    512        2934   2928   3048   3128   4151   4870   66%
    1024       3249   3309   3375   3390   4847   5237   61%
    2048       3953   3832   3987   4342   5448   5888   49%
    4096       6299   6343   6199   6449   7634   8640   37%
    8192       9716   9292   9397  10816  11212  11321   16%
    
    Extra              M E T H O D   (Batch = 1)
    Payload
    (bytes)   A      B      C      D      E      F       % slower, A to F
    --------  -----  -----  -----  -----  -----  -----   ----------------
    None      48006  48419  49136  48757  50649  51280   6.8%                       
    1024      50986  50894  49383  49373  51200  51821   1.6%
    
    
    

    This was a quick test, but it seems clear that basic strings and ints as _id are roughly the same speed, whereas actually generating a UUID adds time, especially if you take the string version of the UUID() object, e.g. UUID().toString().substr(6,36). It is also worth noting that constructing an ObjectId appears to be just as quick as using a plain int or string.
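
The test script itself is not included in the answer. As a rough sketch of the setup described above (a fixed number of inserts sent through insertMany in batches, with an optional extra payload), something like the following could be used. The names timeInserts, makeFakeCollection, and demo are hypothetical, and the in-memory fake only stands in for a real collection so the sketch runs without a database; it measures nothing meaningful by itself.

```javascript
// Hypothetical harness: insert `total` documents in batches via insertMany.
// `collection` is any object with an async insertMany(docs) method, such as a
// MongoDB driver collection. Returns the elapsed time in milliseconds.
async function timeInserts(collection, makeId, total, batchSize, payloadBytes) {
  const payload = 'x'.repeat(payloadBytes); // optional extra payload
  const start = Date.now();
  for (let base = 0; base < total; base += batchSize) {
    const docs = [];
    const end = Math.min(base + batchSize, total);
    for (let i = base; i < end; i++) {
      const doc = { _id: makeId(i) };
      if (payloadBytes > 0) doc.payload = payload;
      docs.push(doc);
    }
    await collection.insertMany(docs); // one call per batch
  }
  return Date.now() - start;
}

// In-memory fake (an assumption for this dry run): it only counts calls and
// documents, and stores nothing.
function makeFakeCollection() {
  return {
    calls: 0,
    count: 0,
    async insertMany(docs) {
      this.calls += 1;
      this.count += docs.length;
    },
  };
}

async function demo() {
  const fake = makeFakeCollection();
  // Method A (simple int _id), 100,000 documents, batches of 100, 512-byte payload.
  const ms = await timeInserts(fake, (i) => i, 100000, 100, 512);
  console.log(`${fake.count} docs in ${fake.calls} insertMany calls, ${ms} ms`);
  return fake;
}

const demoResult = demo();
```

With a real collection in place of the fake, passing a batch size of 1 gives the one-document-per-call case from the second table, where the per-call overhead appears to dominate the cost of generating the _id.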
