Question
I have a VS2017 solution in C#, and I'm using IDocumentClient.UpsertDocumentAsync to upsert some documents into my Cosmos DB (DocumentDB) collection. But I noticed it's actually creating new documents with the same id even though a document with that id already exists in the collection.
Now, after upserting a new document with the same id, the query result looks something like this:
select * from c where c.id = "aaaa-bbbb-cccc"
[
  {
    "id": "aaaa-bbbb-cccc",
    "firstname": "john",
    "lastname": "doe"
  },
  {
    "id": "aaaa-bbbb-cccc",
    "firstname": "john",
    "lastname": "doe",
    "age": "35"
  }
]
I'm quite confused with this behavior; maybe I don't properly understand the definition of "upsert". I would appreciate it if anyone could clarify this for me.
Answer 1:
In Cosmos DB, the "id" is not a unique value. Instead it is the combination of partition key (in your case "age") and "id" that is unique. Cosmos DB allows the partition key to be absent (since it is lenient with schema) and treats missing partition key values as a special value (= "undefined"). Partition key values are also immutable - to change these, you need to delete/insert.
So in this case, you have two documents, one with [age="35", "aaaa-bbbb-cccc"] and another with [age=undefined, "aaaa-bbbb-cccc"] created by the upsert call. If you changed the upsert call to a replace, you would have received a NotFound error.
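The behavior described in this answer can be reproduced with a small in-memory model. This is only a sketch in Python, not the Cosmos DB SDK: the `ToyCollection` class, its `upsert`, `replace`, and `query_by_id` methods, and the `UNDEFINED` sentinel are hypothetical names invented for illustration. The partition key path `age` is taken from the answer.

```python
UNDEFINED = object()  # hypothetical stand-in for a missing partition key value


class ToyCollection:
    """Toy model: documents are keyed by (partition key value, id)."""

    def __init__(self, partition_key):
        self.partition_key = partition_key
        self.docs = {}

    def _key(self, doc):
        # A document without the partition key property gets the UNDEFINED value.
        return (doc.get(self.partition_key, UNDEFINED), doc["id"])

    def upsert(self, doc):
        # Create if the (partition key, id) pair is new, otherwise overwrite.
        self.docs[self._key(doc)] = dict(doc)

    def replace(self, doc):
        # Replace only succeeds when the (partition key, id) pair already exists.
        key = self._key(doc)
        if key not in self.docs:
            raise KeyError("NotFound")
        self.docs[key] = dict(doc)

    def query_by_id(self, doc_id):
        # Cross-partition lookup by id alone, like the query in the question.
        return [d for (_, i), d in self.docs.items() if i == doc_id]


coll = ToyCollection(partition_key="age")
coll.upsert({"id": "aaaa-bbbb-cccc", "firstname": "john", "lastname": "doe"})
coll.upsert({"id": "aaaa-bbbb-cccc", "firstname": "john", "lastname": "doe", "age": "35"})
print(len(coll.query_by_id("aaaa-bbbb-cccc")))  # 2: same id, different partition key values
```

In this model, calling `coll.replace(...)` with an `age` value that has not been stored yet raises `KeyError("NotFound")`, which mirrors the NotFound error mentioned above. Upserting again with `"age": "35"` overwrites the second document instead of adding a third.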
Answer 2:
I'm quite confused with this behavior; maybe I don't properly understand the definition of "upsert".
Upsert operations in Azure Cosmos DB create a document if it doesn't already exist and overwrite it otherwise. Rather than you having to decide "Should I use a Create or a Replace operation?", the database makes this decision for you. This saves you the additional request unit charges, and because the operation is atomic, it removes the possibility of a race condition.
However, Cosmos DB only ensures the uniqueness of id per partition key.
Sample:
[
  {
    "id": "1",
    "name": "jay1"
  },
  {
    "id": "1",
    "name": "jay2"
  },
  {
    "id": "1"
  }
]
The partition key is name. The partition key is used for sharding: it acts as a logical partition for your data and gives Cosmos DB a natural boundary for distributing data across partitions. The three documents above belong to different logical partitions ("jay1", "jay2", and none, since the third document has no name value), so their id attributes are unique within their own logical partitions. Now, if you use the Upsert method to add a document with the same id to one of those partitions, you will overwrite the previous document in that partition.
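As a rough illustration of the sample, the three documents can be grouped by logical partition in plain Python. This is a sketch rather than SDK code: the `partitions` mapping is a hypothetical stand-in for the collection, Python's `None` stands for the missing partition key value, and the `city` property is made up for the example.

```python
from collections import defaultdict

# Sample documents from the answer; the partition key path is "name".
docs = [
    {"id": "1", "name": "jay1"},
    {"id": "1", "name": "jay2"},
    {"id": "1"},
]

# Map each logical partition to its documents, indexed by id.
partitions = defaultdict(dict)
for doc in docs:
    # dict.get returns None for the document that has no "name" property.
    partitions[doc.get("name")][doc["id"]] = doc

# Upserting id "1" into the "jay1" partition overwrites the document stored there.
partitions["jay1"]["1"] = {"id": "1", "name": "jay1", "city": "example"}

print(len(partitions))                           # 3 logical partitions
print(sum(len(p) for p in partitions.values()))  # 3 documents, still one per partition
```

The same id appears three times overall, but only once inside each logical partition, which is the uniqueness rule described above.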
For more detail, you could refer to the docs "Unique keys in Azure Cosmos DB" and "Partition and scale in Azure Cosmos DB".
Source: https://stackoverflow.com/questions/48776123/idocumentclient-upsertdocumentasync-doesnt-update-it-inserts-duplicated-ids