MongoDB — Find duplicate documents by multiple keys

Posted by 浪子不回头ぞ on 2020-08-25 04:48:01

Question


I have a collection with documents that look like the following:

{
        "_id" : ObjectId("55b377cb66b393427367c3e2"),
        "comment" : "This is a comment",
        "url_key" : "55b377cb66b393427367c3df", //This is an ObjectId from another record in a different collection
}

I need to find records in this collection that contain duplicate values for both the comment AND the url_key.

I can easily find duplicate records for a single key (e.g. comment) using aggregate, but I can't figure out how to group by/aggregate on multiple keys.

Here's my current aggregation pipeline:

db.comments.aggregate([
  { $group: { _id: { comment: "$comment" }, uniqueIds: { $addToSet: "$_id" }, count: { $sum: 1 } } },
  { $match: { count: { $gte: 2 } } },
  { $sort: { count: -1 } },
  { $limit: 10 }
]);

Answer 1:


Is it as simple as grouping by multiple keys or did I misunderstand your question?

...
{ $group: { _id: { comment: "$comment", url_key: "$url_key" }, count: { $sum: 1 } } },
{ $match: { count: { $gte: 2 } } },
...
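
For reference, here is a sketch of what the full pipeline might look like once the grouping key is made compound. It reuses the uniqueIds, $sort and $limit stages from the question's pipeline. The variable name duplicatePipeline is only an illustrative choice, and the compound key is built from comment and url_key because those are the two fields the question wants compared (a document's own _id is unique, so it cannot be part of a key that is meant to collide).

```javascript
// Sketch only: the pipeline is built as a plain array so it can be inspected
// before being passed to aggregate(). "duplicatePipeline" is a hypothetical name.
const duplicatePipeline = [
  {
    $group: {
      // Compound key: documents land in the same group only when
      // BOTH comment and url_key match.
      _id: { comment: "$comment", url_key: "$url_key" },
      uniqueIds: { $addToSet: "$_id" }, // _ids of the documents in each group
      count: { $sum: 1 }                // number of documents in each group
    }
  },
  { $match: { count: { $gte: 2 } } }, // keep only groups that contain duplicates
  { $sort: { count: -1 } },           // most-duplicated groups first
  { $limit: 10 }                      // same cap as the question's pipeline
];

// In the mongo shell, against the collection from the question:
// db.comments.aggregate(duplicatePipeline);
```

Each result document should then carry the shared comment/url_key pair in _id, the list of matching document ids in uniqueIds, and the group size in count.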


Source: https://stackoverflow.com/questions/39490816/mongodb-find-duplicate-documents-by-multiple-keys
