MongoDB and composite primary keys

温柔的废话 2020-12-07 21:55

I'm trying to determine the best way to deal with a composite primary key in MongoDB. The main key for interacting with the data in this system is made up of 2 UUIDs. Th

4 answers
  • 2020-12-07 22:34

    I would've gone with option 2. You can still make an index that handles both the UUID fields, and performance should be the same as a compound primary key, except it'll be much easier to work with.

    Also, in my experience, I've never regretted giving something a unique ID, even if it wasn't strictly required. Perhaps that's an unpopular opinion though.
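
    A minimal sketch of the index this answer has in mind, usable from the mongo shell or the Node.js driver. The field names `uuid1` and `uuid2` and the helper name are assumptions, since the visible part of the question does not show the real field names:

    ```javascript
    // Hypothetical helper; `collection` is a mongo shell or Node.js driver collection.
    // uuid1/uuid2 are placeholder field names, not taken from the question.
    function ensureUuidPairIndex(collection) {
      // Compound index over both UUID fields; `unique: true` rejects duplicate pairs.
      return collection.createIndex({ uuid1: 1, uuid2: 1 }, { unique: true });
    }
    ```

    In the shell this could be called as `ensureUuidPairIndex(db.things)`, where `things` is a placeholder collection name; `_id` stays the automatic ObjectId.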

  • 2020-12-07 22:51

    I have an option 4 for you:

    Use the automatic _id field and add 2 single-field indexes, one for each UUID, instead of a single compound index.

    1. The _id index would be sequential (although that's less important in MongoDB), easily shardable, and you can let MongoDB manage it.
    2. The 2 UUID indexes let you make any kind of query you need (with the first one, with the second, or with both in any order), and they take up less space than 1 compound index.
    3. If a query uses both indexes (or other ones as well), MongoDB can intersect them (new in v2.6), much as if you were using a compound index.
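
    A rough sketch of this option 4, again with assumed names (`uuid1`, `uuid2`, and the helper functions are placeholders that do not come from the question):

    ```javascript
    // Hypothetical helper; `collection` is a mongo shell or Node.js driver collection.
    // uuid1/uuid2 are placeholder field names, not taken from the question.
    function ensureSingleFieldIndexes(collection) {
      // One index per UUID field; _id keeps its automatic ObjectId and default index.
      return [
        collection.createIndex({ uuid1: 1 }),
        collection.createIndex({ uuid2: 1 }),
      ];
    }

    // Filter shapes for querying by the first UUID, the second, or both.
    function byFirst(u1) { return { uuid1: u1 }; }
    function bySecond(u2) { return { uuid2: u2 }; }
    function byBoth(u1, u2) { return { uuid1: u1, uuid2: u2 }; }
    ```

    Neither index in this sketch is declared unique, so on its own it does not stop the same pair of UUIDs from being inserted twice.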
  • 2020-12-07 22:55

    I'd go with option 2, and here is why:

    1. Keeping two separate fields, instead of one value concatenated from both UUIDs as suggested in option 1, leaves you the flexibility to create other index combinations to support future queries, or to adjust if it turns out that the cardinality of one key is higher than the other.
    2. Having non-sequential keys can help you avoid hotspots when inserting in a sharded environment, so it's not such a bad option. Sharding is, in my opinion, the best way to scale inserts and updates on a collection, since write locking is at the database level (versions 2.2 through 2.6) or at the collection level (3.0 with MMAPv1).
  • 2020-12-07 22:59

    You should go with option 1.

    The main reason is that you say you are worried about performance - using the _id index, which is always there and already unique, saves you from having to maintain a second unique index.

    You wrote: "For option 1, I'm worried about the insert performance due to having non-sequential keys. I know this can kill traditional RDBMS systems and I've seen indications that this could be true in MongoDB as well."

    Your other options do not avoid this problem, they just shift it from the _id index to the secondary unique index - but now you have two indexes, one that's right-balanced and another that's random access.

    There is only one reason to question option 1 and that is if you plan to access the documents by just one or just the other UUID value. As long as you are always providing both values and (this part is very important) you always order them the same way in all your queries, then the _id index will be efficiently serving its full purpose.

    As an elaboration on why you have to make sure you always order the two UUID values the same way: when comparing subdocuments, { a:1, b:2 } is not equal to { b:2, a:1 }, so you could have a collection where two documents had those two values for _id. If you store _id with field a first, then you must always keep that order in all of your documents and queries.
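
    One way to keep the order fixed is to build every _id through a single helper. The sketch below is only an illustration: `makeCompositeId` is a hypothetical name, and the `JSON.stringify` comparison is a plain-JavaScript analogy for an order-sensitive comparison, not how MongoDB itself compares subdocuments:

    ```javascript
    // Hypothetical helper: always builds the _id subdocument with `a` first, then `b`.
    function makeCompositeId(a, b) {
      return { a: a, b: b };
    }

    // JSON.stringify keeps insertion order for these keys, so it is order-sensitive too.
    const sameOrder =
      JSON.stringify(makeCompositeId(1, 2)) === JSON.stringify({ a: 1, b: 2 }); // true
    const swappedOrder =
      JSON.stringify(makeCompositeId(1, 2)) === JSON.stringify({ b: 2, a: 1 }); // false
    ```

    Using the same helper for inserts and for lookups, for example `db.collection.find({_id: makeCompositeId(1, 2)})`, keeps the field order identical on both sides.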

    The other caution is that the index on _id:1 will be usable for this query:

    db.collection.find({_id:{a:1,b:2}}) 
    

    but it will not be usable for this query:

    db.collection.find({"_id.a":1, "_id.b":2})
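
    To check this against your own data, you could compare the plans that explain() reports for the two query forms. A small sketch for the mongo shell, where `explainBothForms` is a hypothetical helper name:

    ```javascript
    // Hypothetical helper; `collection` is a mongo shell collection such as db.collection.
    function explainBothForms(collection, a, b) {
      return {
        // Whole-subdocument equality on _id, fields in the stored order (a, then b).
        wholeSubdocument: collection.find({ _id: { a: a, b: b } }).explain(),
        // Dot notation on the individual fields inside _id.
        dotNotation: collection.find({ "_id.a": a, "_id.b": b }).explain(),
      };
    }
    ```

    The exact shape of the explain output depends on the server version, so the sketch only returns both results for side-by-side inspection.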
    