For what would be this query in SQL (to find duplicates):
SELECT userId, name FROM col GROUP BY userId, name HAVING COUNT(*)>1
I performed this simple query in MongoDB:
res = db.col.group({key:{userId:true,name:true},
reduce: function(obj,prev) {prev.count++;},
initial: {count:0}})
I've added a simple Javascript loop to go over the result set, and performed a filter to find all the fields with a count > 1 there, like so:
for (i in res) {if (res[i].count>1) printjson(res[i])};
Is there a better way to do this other than using javascript code in the client? If this is the best/simplest way, say that it is, and this question will help someone :)
New answer using Mongo aggregation framework
After this question was asked and answered, 10gen released Mongodb version 2.2 with an aggregation framework. The new best way to do this query is:
db.col.aggregate( [
{ $group: { _id: { userId: "$userId", name: "$name" },
count: { $sum: 1 } } },
{ $match: { count: { $gt: 1 } } },
{ $project: { _id: 0,
userId: "$_id.userId",
name: "$_id.name",
count: 1}}
] )
10gen has a handy SQL to Mongo Aggregation conversion chart worth bookmarking.
来源:https://stackoverflow.com/questions/5550253/what-is-the-correct-way-to-do-a-having-in-a-mongodb-group-by