I need help with MongoDB on this problem: I have a collection stats with the fields (UserId, EventId, Count, Date), and the collection contains data.
It is easier, and much faster, to do the job with aggregate()!
We will use a $project to create a counter field for each event, filling in the count from the document if the event matches, and zero otherwise. Then we will $group by user ID, summing up all the event counters.
For the sake of explanation, let me first show how this looks when hard-coded for the two different events (1 and 2) in your example:
db.xx.aggregate([
    { $project: { userid: 1,
        cnt_e1: { $cond: [ { $eq: [ "$event", 1 ] }, "$count", 0 ] },
        cnt_e2: { $cond: [ { $eq: [ "$event", 2 ] }, "$count", 0 ] }
    } },
    { $group: { _id: "$userid", cnt_e1: { $sum: "$cnt_e1" }, cnt_e2: { $sum: "$cnt_e2" } } },
    { $sort: { _id: 1 } }
])
For the given collection:
> db.xx.find({},{_id:0})
{ "userid" : 1, "event" : 1, "count" : 10 }
{ "userid" : 1, "event" : 1, "count" : 15 }
{ "userid" : 1, "event" : 2, "count" : 12 }
{ "userid" : 2, "event" : 1, "count" : 5 }
{ "userid" : 3, "event" : 2, "count" : 10 }
the result is:
{
    "result" : [
        {
            "_id" : 1,
            "cnt_e1" : 25,
            "cnt_e2" : 12
        },
        {
            "_id" : 2,
            "cnt_e1" : 5,
            "cnt_e2" : 0
        },
        {
            "_id" : 3,
            "cnt_e1" : 0,
            "cnt_e2" : 10
        }
    ],
    "ok" : 1
}
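To make the two stages concrete, here is a minimal plain-JavaScript sketch that emulates the $project and $group steps on the sample documents above. It does not talk to MongoDB at all; the helper names projectCounters and groupByUser are made up for this illustration.

```javascript
// Sample documents from the collection shown above.
const docs = [
  { userid: 1, event: 1, count: 10 },
  { userid: 1, event: 1, count: 15 },
  { userid: 1, event: 2, count: 12 },
  { userid: 2, event: 1, count: 5 },
  { userid: 3, event: 2, count: 10 },
];

// Emulates $project: one counter field per event, holding the document's
// count if the event matches and 0 otherwise.
function projectCounters(doc, events) {
  const out = { userid: doc.userid };
  events.forEach(function (e) {
    out["cnt_e" + e] = doc.event === e ? doc.count : 0;
  });
  return out;
}

// Emulates $group by userid with a $sum per counter, then $sort on _id.
function groupByUser(projected, events) {
  const byUser = new Map();
  projected.forEach(function (p) {
    if (!byUser.has(p.userid)) {
      const init = { _id: p.userid };
      events.forEach(function (e) { init["cnt_e" + e] = 0; });
      byUser.set(p.userid, init);
    }
    const acc = byUser.get(p.userid);
    events.forEach(function (e) {
      acc["cnt_e" + e] += p["cnt_e" + e];
    });
  });
  return Array.from(byUser.values()).sort(function (a, b) {
    return a._id - b._id;
  });
}

const sampleEvents = [1, 2];
const emulated = groupByUser(
  docs.map(function (d) { return projectCounters(d, sampleEvents); }),
  sampleEvents
);
console.log(JSON.stringify(emulated));
```

Every user ends up with a field for every event, including a zero where no matching document exists, which is the point of projecting the counters before grouping.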
To get this done for variable events, we'll have to generate the projection and the grouping. We'll get an array of all possible events using the distinct() command (you might want to define an index on "event"). Then we create the two statements as JSON objects by looping over the array:
// Skeletons of the two pipeline stages
var project = {};
project.$project = {};
project.$project.userid = 1;
var group = {};
group.$group = {};
group.$group._id = '$userid';

// One counter field per distinct event value
var events = db.xx.distinct( "event" );
events.forEach( function( e ) {
    var field = "cnt_e" + e;
    // $project: the document's count if the event matches, 0 otherwise
    eval("project.$project." + field + " = {}");
    eval("project.$project." + field + ".$cond = []");
    eval("project.$project." + field + ".$cond[0] = {}");
    eval("project.$project." + field + ".$cond[0].$eq = []");
    eval("project.$project." + field + ".$cond[0].$eq[0] = '$event'");
    eval("project.$project." + field + ".$cond[0].$eq[1] = " + e );
    eval("project.$project." + field + ".$cond[1] = '$count'");
    eval("project.$project." + field + ".$cond[2] = 0");
    // $group: sum up the counter per user
    eval("group.$group." + field + " = {}");
    eval("group.$group." + field + ".$sum = '$" + field + "'");
});

// Uncomment to inspect the generated stages
//printjson(project);
//printjson(group);

db.xx.aggregate([
    project,
    group,
    { $sort: { _id: 1 } }
])
And the result is the same as above.
Note: the above works for numerical events. If they were strings, you'd have to adapt the generator.
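One possible adaptation, sketched here under assumptions, is to build the stages with bracket notation instead of eval(), so that the event value is inserted as-is and strings work too. The function name buildPivotStages and the rule for turning an event into a field name are my own choices for illustration, not part of the solution above.

```javascript
// Hypothetical helper: builds the $project, $group and $sort stages for a
// list of event values (numbers or strings) without using eval().
function buildPivotStages(events) {
  const project = { $project: { userid: 1 } };
  const group = { $group: { _id: "$userid" } };

  events.forEach(function (e) {
    // Assumption: characters that are awkward in field names (such as "."
    // or "$") are replaced with "_". Distinct events could collide after this.
    const field = "cnt_e" + String(e).replace(/[^A-Za-z0-9_]/g, "_");

    // Count from the document if the event matches, zero otherwise.
    project.$project[field] = {
      $cond: [{ $eq: ["$event", e] }, "$count", 0],
    };
    // Sum the per-document counters for each user.
    group.$group[field] = { $sum: "$" + field };
  });

  return [project, group, { $sort: { _id: 1 } }];
}

// In the mongo shell this could be used as:
//   db.xx.aggregate(buildPivotStages(db.xx.distinct("event")));
const pipeline = buildPivotStages([1, "login"]);
console.log(JSON.stringify(pipeline, null, 2));
```

One caveat with this sketch: a string event that starts with "$" would be read as a field path inside $eq, so such values would need extra handling (for example the $literal operator on servers that support it).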
At first sight, this might look more complicated than @Philipp's mapReduce. But that approach will not return all events for each user, only the ones that do have a count. For a complete vertical-to-horizontal mapping, you would have to generate the map and the reduce functions as well.
For more information on aggregate(), see http://docs.mongodb.org/manual/aggregation/