Query for latest version of a document by date in mongoDB

廉价感情. Submitted on 2020-01-02 06:59:10

Question


I am trying to find a MongoDB query that looks at a collection containing multiple records of the same document and returns only the latest version of each document as the result set.

I cannot explain it in English any better than above but maybe this little SQL below might explain it further. I want each document by transaction_reference but only the latest dated version (object_creation_date).

select 
    t.transaction_reference, 
    t.transaction_date, 
    t.object_creation_date,
    t.transaction_sale_value
from MyTable t
inner join (
    select 
        transaction_reference, 
        max(object_creation_date) as MaxDate
    from MyTable
    group by transaction_reference
) tm 
    on t.transaction_reference = tm.transaction_reference 
    and t.object_creation_date = tm.MaxDate

There are multiple versions of the same document because I want to store each iteration of a transaction. The first time I receive a document, its transaction_status may be UNPAID; later I receive the same transaction again, and this time the transaction_status is PAID.

Some analyses will SUM all unique transactions, whereas others may measure the time between a document with status UNPAID and the next one with status PAID.
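
To make those two analyses concrete, here is a rough sketch in plain JavaScript over an in-memory array. It is only an illustration under a few assumptions: the rows are trimmed copies of the two sample versions of SI-1, transaction_gross_value is assumed to be the value worth summing, and the helper name daysUnpaidToPaid is made up for this example.

```javascript
// Trimmed copies of the two sample versions of SI-1 from this question.
const versions = [
  { transaction_reference: "SI-1", transaction_status: "UNPAID",
    object_creation_date: "2016-07-05T00:00:00.201Z", transaction_gross_value: 1800 },
  { transaction_reference: "SI-1", transaction_status: "PAID",
    object_creation_date: "2016-07-16T00:00:00.201Z", transaction_gross_value: 1800 }
];

// Oldest first, so a later version always overwrites an earlier one below.
const byDate = [...versions].sort(
  (a, b) => Date.parse(a.object_creation_date) - Date.parse(b.object_creation_date)
);

// Analysis 1: sum each transaction once, using its latest version.
// Assumption: transaction_gross_value is the field to total.
const latest = new Map();
for (const v of byDate) latest.set(v.transaction_reference, v);
let totalGross = 0;
for (const v of latest.values()) totalGross += v.transaction_gross_value;

// Analysis 2 (hypothetical helper): days from the first UNPAID version
// to the first PAID version created after it, or null if either is missing.
function daysUnpaidToPaid(reference) {
  const history = byDate.filter(v => v.transaction_reference === reference);
  const unpaid = history.find(v => v.transaction_status === "UNPAID");
  if (!unpaid) return null;
  const unpaidAt = Date.parse(unpaid.object_creation_date);
  const paid = history.find(
    v => v.transaction_status === "PAID" && Date.parse(v.object_creation_date) > unpaidAt
  );
  if (!paid) return null;
  const msPerDay = 24 * 60 * 60 * 1000;
  return (Date.parse(paid.object_creation_date) - unpaidAt) / msPerDay;
}

console.log(totalGross);               // 1800
console.log(daysUnpaidToPaid("SI-1")); // 11
```

The total counts SI-1 once even though two versions are stored, which is why the latest-version query matters for the SUM case.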

As per request, here are two documents:

{
"_id": {
    "$oid": "579aa337f36d2808839a05e8"
},
"object_class": "Goods & Services Transaction",
"object_category": "Revenue",
"object_type": "Transaction",
"object_origin": "Sage One",
"object_origin_category": "Bookkeeping",
"object_creation_date": "2016-07-05T00:00:00.201Z",
"party_uuid": "dfa1e80a-5521-11e6-beb8-9e71128cae77",
"connection_uuid": "b945bd7c-7988-4d2a-92f5-8b50ab218e00",
"transaction_reference": "SI-1",
"transaction_status": "UNPAID",
"transaction_date": "2016-06-16T00:00:00.201Z",
"transaction_due_date": "2016-07-15T00:00:00.201Z",
"transaction_currency": "GBP",
"goods_and_services": [
    {
        "item_identifier": "PROD01",
        "item_name": "Product One",
        "item_quantity": 1,
        "item_gross_unit_sale_value": 1800,
        "item_revenue_category": "Sales Revenue",
        "item_net_unit_cost_value": null,
        "item_net_unit_sale_value": 1500,
        "item_unit_tax_value": 300,
        "item_net_total_sale_value": 1500,
        "item_gross_total_sale_value": 1800,
        "item_tax_value": 300
    }
],
"transaction_gross_value": 1800,
"transaction_gross_curr_value": 1800,
"transaction_net_value": 1500,
"transaction_cost_value": null,
"transaction_payments_value": null,
"transaction_payment_extras_value": null,
"transaction_tax_value": 300,
"party": {
    "customer": {
        "customer_identifier": "11",
        "customer_name": "KP"
    }
}
}

And here is the second version, where the transaction is now paid:

{
"_id": {
    "$oid": "579aa387f36d2808839a05ee"
},
"object_class": "Goods & Services Transaction",
"object_category": "Revenue",
"object_type": "Transaction",
"object_origin": "Sage One",
"object_origin_category": "Bookkeeping",
"object_creation_date": "2016-07-16T00:00:00.201Z",
"party_uuid": "dfa1e80a-5521-11e6-beb8-9e71128cae77",
"connection_uuid": "b945bd7c-7988-4d2a-92f5-8b50ab218e00",
"transaction_reference": "SI-1",
"transaction_status": "PAID",
"transaction_date": "2016-06-16T00:00:00.201Z",
"transaction_due_date": "2016-07-15T00:00:00.201Z",
"transaction_currency": "GBP",
"goods_and_services": [
    {
        "item_identifier": "PROD01",
        "item_name": "Product One",
        "item_quantity": 1,
        "item_gross_unit_sale_value": 1800,
        "item_revenue_category": "Sales Revenue",
        "item_net_unit_cost_value": null,
        "item_net_unit_sale_value": 1500,
        "item_unit_tax_value": 300,
        "item_net_total_sale_value": 1500,
        "item_gross_total_sale_value": 1800,
        "item_tax_value": 300
    }
],
"transaction_gross_value": 1800,
"transaction_gross_curr_value": 1800,
"transaction_net_value": 1500,
"transaction_cost_value": null,
"transaction_payments_value": null,
"transaction_payment_extras_value": null,
"transaction_tax_value": 300,
"party": {
    "customer": {
        "customer_identifier": "11",
        "customer_name": "KP"
    }
}
}

Thanks for your support, Matt


Answer 1:


If I understand the question correctly, you could use something like this:

db.getCollection('yourTransactionsCollection').aggregate([
    {
        // Newest version first within each transaction_reference
        $sort: {
            "transaction_reference": 1,
            "object_creation_date": -1
        }
    },
    {
        // $first takes each value from the newest version in the group
        $group: {
            _id: "$transaction_reference",
            "transaction_date": { $first: "$transaction_date" },
            "object_creation_date": { $first: "$object_creation_date" },
            "transaction_sale_value": { $first: "$transaction_sale_value" }
        }
    }
])

which outputs a result like the following

{
    "_id" : "SI-1",
    "transaction_date" : "2016-06-16T00:00:00.201Z",
    "object_creation_date" : "2016-07-16T00:00:00.201Z",
    "transaction_sale_value" : null
}

Note that you could sort on object_creation_date alone, but I included both transaction_reference and object_creation_date because I think a compound index on both fields makes more sense than one on just the creation date. Adjust the $sort to match your indexes so that it can use one.

In addition, your sample documents have no transaction_sale_value field, hence the null for it in the result. Maybe you missed that, or it is just not in your sample documents, but I think you get the idea and can adjust it to your needs.
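
If the whole latest document is wanted rather than a few picked fields, one possible variation is to keep $$ROOT in the $group stage and promote it again with $replaceRoot (available from MongoDB 3.4). This is a sketch, not something tested against your data. The latestPerReference function is a hypothetical in-memory stand-in that mirrors the same logic so it can be checked without a server, and it assumes the dates stay stored as ISO 8601 strings in one consistent format, as in the sample documents.

```javascript
// Pipeline sketch: keep the entire newest document per transaction_reference.
// $$ROOT refers to the whole input document.
const pipeline = [
  { $sort: { transaction_reference: 1, object_creation_date: -1 } },
  { $group: { _id: "$transaction_reference", latest: { $first: "$$ROOT" } } },
  { $replaceRoot: { newRoot: "$latest" } }
];
// In the mongo shell (collection name as used in the answer above):
// db.getCollection('yourTransactionsCollection').aggregate(pipeline)

// Hypothetical in-memory equivalent, only for checking the logic locally.
function latestPerReference(docs) {
  const latest = new Map();
  for (const doc of docs) {
    const seen = latest.get(doc.transaction_reference);
    // ISO 8601 strings in the same format and time zone compare
    // correctly as plain strings, so no Date conversion is needed here.
    if (!seen || doc.object_creation_date > seen.object_creation_date) {
      latest.set(doc.transaction_reference, doc);
    }
  }
  return [...latest.values()];
}

// Trimmed versions of the two sample documents from the question.
const sample = [
  { transaction_reference: "SI-1", transaction_status: "UNPAID",
    object_creation_date: "2016-07-05T00:00:00.201Z" },
  { transaction_reference: "SI-1", transaction_status: "PAID",
    object_creation_date: "2016-07-16T00:00:00.201Z" }
];

console.log(latestPerReference(sample)); // one document, the PAID version
```

With this shape the result keeps every field of the newest version, including transaction_status, so later analysis does not depend on which fields were listed in the $group stage.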



Source: https://stackoverflow.com/questions/38650002/query-for-latest-version-of-a-document-by-date-in-mongodb
