Question
I have a schema which is similar to:
{
  "id": "uuid",
  "deviceId": "uuid",
  "message": {
    "content": "string",
    "ts": 1
  },
  "data": {
    "temperature": 21
  }
}
I'd like to get the latest "data" (using message.ts as the timestamp) for each "deviceId".
So far, I've managed to get the data back in timestamp order using the query:
SELECT c.deviceId, c.message.ts, c.data FROM c ORDER BY c.message.ts DESC
but I can't figure out how to remove the duplicate device records.
Is this possible to do within the CosmosDB SQL Engine?
Answer 1:
As of this writing, it is not possible to achieve this with a single SQL query.
The following two-step approach may be an alternative.
First, run this query to get the latest timestamp per device:
SELECT c.deviceId, MAX(c.message.ts) AS latest FROM c GROUP BY c.deviceId
Then, for each device, fetch the matching document with:
SELECT * FROM c WHERE c.deviceId = 'xxx' AND c.message.ts = xxxx
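The two-step approach above can be sketched in Python roughly as follows. This is only an illustration under assumptions: `run_query` is a hypothetical callable (not part of any SDK) that takes a SQL string and a list of `{"name", "value"}` parameters and returns the matching rows, and `latest_per_device` is a name made up for this example. Parameters are used instead of pasting the values into the SQL text.

```python
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical signature: run_query(sql, parameters) -> iterable of result dicts.
# In practice this would wrap whatever Cosmos DB client is in use.
RunQuery = Callable[[str, List[Dict[str, Any]]], Iterable[Dict[str, Any]]]

# Step 1: latest timestamp per device.
LATEST_TS_SQL = (
    "SELECT c.deviceId, MAX(c.message.ts) AS latest "
    "FROM c GROUP BY c.deviceId"
)

# Step 2: the document that carries that timestamp.
LATEST_DOC_SQL = (
    "SELECT * FROM c "
    "WHERE c.deviceId = @deviceId AND c.message.ts = @ts"
)


def latest_per_device(run_query: RunQuery) -> Dict[str, Dict[str, Any]]:
    """Return {deviceId: latest document} using one query per device."""
    latest: Dict[str, Dict[str, Any]] = {}
    for row in run_query(LATEST_TS_SQL, []):
        params = [
            {"name": "@deviceId", "value": row["deviceId"]},
            {"name": "@ts", "value": row["latest"]},
        ]
        # If several documents share the same timestamp, keep the first one.
        for doc in run_query(LATEST_DOC_SQL, params):
            latest[row["deviceId"]] = doc
            break
    return latest
```

Note that this issues one extra query per device, so the cost grows with the number of devices.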
Answer 2:
Thanks to Mark Brown's comment, I found the following, which seems to be the correct solution to this problem. It is not as elegant as a one-off SQL query, but it is really what was needed.
https://docs.microsoft.com/en-us/samples/azure-samples/cosmosdb-materialized-views/real-time-view-cosomos-azure-functions/
In essence, you create a serverless function which is triggered by the Cosmos change feed and updates a materialized view, which is essentially just a document holding (in this case) the most up-to-date data per deviceId.
Specifically for this case, it'll most likely update the corresponding device document with its most recent data.
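The core update step of such a materialized view might look like the Python sketch below. It is a minimal illustration, not the code from the linked sample: `apply_changes` is a hypothetical name, a plain dict stands in for the view container, and `changes` stands in for the batch of documents the change feed trigger would deliver. The timestamp check is an assumption added here so that an older message arriving late does not overwrite newer data.

```python
from typing import Any, Dict, Iterable


def apply_changes(
    view: Dict[str, Dict[str, Any]],
    changes: Iterable[Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Fold a batch of changed documents into a per-device view.

    `view` maps deviceId -> view document; in a real deployment each
    assignment below would be an upsert into the view container.
    """
    for doc in changes:
        device_id = doc["deviceId"]
        ts = doc["message"]["ts"]
        current = view.get(device_id)
        # Only replace the stored entry when the incoming message is
        # at least as recent as the one already in the view.
        if current is None or ts >= current["ts"]:
            view[device_id] = {
                "id": device_id,  # one view document per device
                "deviceId": device_id,
                "ts": ts,
                "data": doc["data"],
            }
    return view
```

Reading the latest data for every device then becomes a simple read of the view documents, with no grouping needed at query time.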
Source: https://stackoverflow.com/questions/64712120/how-do-i-get-the-latest-record-for-each-item-in-cosmosdb-using-sql