Question
I have a collection of documents in MongoDB that looks like:
{"_id": 1, "array": [{"id": 1, "content": "..."}, {"id": 2, "content": "..."}]}
{"_id": 2, "array": [{"id": 1, "content": "..."}, {"id": 2, "content": "..."}, {"a_id": 3, "content": "..."}]}
and I want to ensure that there is no duplicate array.id within each document. So the provided example is OK, but the following is not:
{"_id": 1, "array": [{"id": 1, "content": "..."}, {"id": 1, "content": "..."}]}
My question is how to do this (preferably in PyMongo).
EDIT
What I tried was the following code, which I thought would create a unique key on (_id, array.id), but when you run it, no uniqueness is enforced:
from pymongo import MongoClient, ASCENDING

client = MongoClient(host="localhost", port=27017)
database = client["test_db"]
collection = database["test_collection"]
collection.drop()
collection.create_index(keys=[("_id", ASCENDING),
                              ("array.id", ASCENDING)],
                        unique=True,
                        name="new_key")
document = {"array": [{"id": 1}, {"id": 2}]}
collection.insert_one(document)
collection.find_one_and_update({"_id": document["_id"]},
                               {"$push": {"array": {"id": 1}}})
updated_document = collection.find_one({"_id": document["_id"]})
print(updated_document)
which outputs the following (note that there are two objects with id = 1 in the array). I would expect to get an exception.
{'_id': ObjectId('5eb51270d6d70fbba739e3b2'), 'array': [{'id': 1}, {'id': 2}, {'id': 1}]}
Answer 1:
So if I understand correctly, there is no way to set an index (or some other condition) that would enforce uniqueness within a document, right? (Other than checking this explicitly when creating the document or when inserting into it.)
Yes. Please see the following two scenarios about using the unique index on an array field with embedded documents.
Unique Multikey Index (index on an embedded document field within an array):
For unique indexes, the unique constraint applies across separate documents in the collection rather than within a single document.
Because the unique constraint applies to separate documents, for a unique multikey index, a document may have array elements that result in repeating index key values as long as the index key values for that document do not duplicate those of another document.
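The rule quoted above can be illustrated with a small pure-Python model. This is only a sketch of the documented behavior, not MongoDB's actual implementation; the helper names index_keys and insert_all are hypothetical. Each document contributes a set of index keys, so repeated values inside one document collapse into a single key, and a conflict arises only when a key is already owned by a different document.

```python
# Toy model of a unique multikey index (an illustration, not MongoDB internals).

def index_keys(doc, compound_with_id):
    """Return the set of index keys that one document contributes.

    A set is used on purpose: repeated "array.id" values inside the
    same document collapse into a single key.
    """
    ids = [elem.get("id") for elem in doc["array"]]
    if compound_with_id:
        # Models the compound index { _id: 1, "array.id": 1 }
        return {(doc["_id"], i) for i in ids}
    # Models the single-field index { "array.id": 1 }
    return {(i,) for i in ids}


def insert_all(docs, compound_with_id):
    """Insert documents in order; return the _ids rejected as duplicates."""
    seen = set()
    rejected = []
    for doc in docs:
        keys = index_keys(doc, compound_with_id)
        if keys & seen:
            # Some key already belongs to an earlier document.
            rejected.append(doc["_id"])
        else:
            seen |= keys
    return rejected
```

In this model a document with a repeated array.id never conflicts with itself, which matches the behavior described in the documentation excerpt.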
First Scenario:
db.arrays.createIndex( { _id: 1, "array.id": 1}, { unique: true } )
db.arrays.insertOne( { "_id": 1, "array": [ { "id": 1, "content": "11"}, { "id": 2, "content": "22"} ] } )
db.arrays.insertOne( { "_id": 2, "array": [ { "id": 1, "content": "1100"}, { "id": 5, "content": "55"} ] } )
db.arrays.insertOne( {"_id": 3, "array": [ {"id": 3, "content": "33"}, {"id": 3, "content": "3300"} ] } )
All three documents are inserted without any errors.
As per the note on Unique Multikey Index above, the document with _id: 3 has two embedded documents within the array with the same "array.id" value: 3.
Also, uniqueness is enforced on the combination of the two keys of the compound index { _id: 1, "array.id": 1 }, so the duplicate "array.id" value across documents (1, in the documents with _id values 1 and 2) is allowed as well.
Second Scenario:
db.arrays2.createIndex( { "array.id": 1 }, { unique: true } )
db.arrays2.insertOne( { "_id": 3, "array": [ { "id": 3, "content": "33" }, { "id": 3, "content": "330"} ] } )
db.arrays2.insertOne( { "_id": 4, "array": [ { "id": 3, "content": "331" }, { "id": 30, "content": "3300" } ] } )
The first document, with _id: 3, is inserted successfully. The second one fails with an error: "errmsg" : "E11000 duplicate key error collection: test.arrays2 index: array.id_1 dup key: { array.id: 3.0 } ". This behavior is expected, as per the note on Unique Multikey Index.
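Since an index cannot enforce uniqueness inside a single document, the explicit check mentioned in the question has to live in application code. Below is a minimal sketch of one way to do that, assuming the field names from the question; the helpers has_duplicate_ids and guarded_push are hypothetical names, not part of PyMongo. The first validates a new array before insertion. The second builds a filter that only matches when no array element already carries the given id, so the $push becomes a no-op for duplicates.

```python
def has_duplicate_ids(elements):
    """Return True if two elements of the array share the same "id" value."""
    ids = [elem["id"] for elem in elements if "id" in elem]
    return len(ids) != len(set(ids))


def guarded_push(doc_id, element):
    """Build (filter, update) for a push that is skipped on a duplicate id.

    {"array.id": {"$ne": x}} matches only documents in which no element
    of "array" has id == x, so the update applies to nothing otherwise.
    """
    query = {"_id": doc_id, "array.id": {"$ne": element["id"]}}
    update = {"$push": {"array": element}}
    return query, update


# Possible usage with the collection from the question (needs a running server):
#   if has_duplicate_ids(document["array"]):
#       raise ValueError("duplicate array.id in new document")
#   query, update = guarded_push(document["_id"], {"id": 1})
#   result = collection.update_one(query, update)
#   if result.matched_count == 0:
#       ...  # the id is already present, or the document does not exist
```

Note that this approach does not raise an exception on a duplicate; the caller has to inspect matched_count and decide how to react.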
Source: https://stackoverflow.com/questions/61655391/how-to-set-unique-constraint-for-field-in-document-nested-in-array