So I'm new to MongoDB and MapReduce in general, and I came across this "quirk" (or at least, in my mind, a quirk).
Say I have objects in my collection like so:
"MongoDB will not call the reduce function for a key that has only a single value. The values argument is an array whose elements are the value objects that are “mapped” to the key."
http://docs.mongodb.org/manual/reference/command/mapReduce/#mapreduce-reduce-cmd
I realize this is an older question, but I came to it and felt like I still didn't understand why this behavior exists and how to build map/reduce functionality so it's a non-issue.
MongoDB doesn't call the reduce function when a key has only a single value because it isn't necessary (I hope this will make more sense in a moment). A reduce function must meet the following requirements:
- The reduce function must return an object whose type must be identical to the type of the value emitted by the map function.
- The order of the elements in the valuesArray should not affect the output of the reduce function.
- The reduce function must be idempotent.
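These properties can be checked outside the database. Here is a minimal sketch in plain JavaScript, assuming a hypothetical reduce function that sums a count field; the field name, key, and sample values are illustrative and not taken from the question.

```javascript
// Hypothetical reduce: sums the `count` field of each emitted value.
// It returns the same shape the map function is assumed to emit: { count: <number> }.
function reduce(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i].count;
  }
  return { count: total };
}

var key = "a@example.com"; // illustrative key
var values = [{ count: 1 }, { count: 2 }, { count: 3 }];

// Same type: the result has the same shape as each emitted value.
var once = reduce(key, values);

// Order independence: reversing the input must not change the result.
var reversed = reduce(key, values.slice().reverse());

// Idempotence: reducing an already-reduced value must not change it.
var twice = reduce(key, [once]);

// Re-reduce: a partial result fed back in with further values must agree too.
var partial = reduce(key, [reduce(key, values.slice(0, 2)), values[2]]);

console.log(JSON.stringify([once, reversed, twice, partial]));
```

If all four results match, the reducer can safely be skipped, re-run, or applied to partial results without changing the output.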
The first requirement is very important, and it seems to be widely overlooked: I've seen a number of people do their mapping in the reduce function and then handle the single-key case in the finalize function. That is the wrong way to address the issue.
Think about it like this: if there's only a single instance of a key, a simple optimization is to skip the reducer entirely (there's nothing to reduce). Single-key values are still included in the output, but the purpose of the reducer is to build an aggregate result from the documents that share a key. If the mapper and reducer output the same type, you should not be able to tell from the structure of an output object whether it went through the reducer. You shouldn't need a finalize function to correct the structure of objects that skipped the reducer.
In short, do your mapping in your map function and reduce multi-key values into a single, aggregate result in your reduce functions.
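As a sketch of that advice, the map function below emits values already in their final shape and the reduce function returns that same shape, so a key that skips the reducer looks no different from one that went through it. The runMapReduce driver is a hypothetical in-memory stand-in for MongoDB that only imitates the documented single-value skip; the latest and count fields and the sample documents are assumptions made for illustration.

```javascript
var emitted; // key -> array of values emitted by map

// Stand-in for the emit() that MongoDB provides to map functions.
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}

// Map: emit the value already in its final shape.
function map() {
  emit(this.email, { latest: this.time, count: 1 });
}

// Reduce: return the same shape that map emits.
// Keeps the largest time seen and sums the counts.
function reduce(key, values) {
  var result = { latest: values[0].latest, count: 0 };
  values.forEach(function (v) {
    result.latest = Math.max(result.latest, v.latest);
    result.count += v.count;
  });
  return result;
}

// Hypothetical driver (not a MongoDB API): groups emitted values by key
// and skips reduce when a key has exactly one value.
function runMapReduce(docs, mapFn, reduceFn) {
  emitted = {};
  docs.forEach(function (doc) { mapFn.call(doc); });
  var out = {};
  Object.keys(emitted).forEach(function (key) {
    var vals = emitted[key];
    out[key] = vals.length === 1 ? vals[0] : reduceFn(key, vals);
  });
  return out;
}

var docs = [
  { email: "a@example.com", time: 10 },
  { email: "a@example.com", time: 30 },
  { email: "b@example.com", time: 20 }
];
var results = runMapReduce(docs, map, reduce);
console.log(JSON.stringify(results));
```

Here b@example.com never reaches the reducer, yet its result has the same latest and count fields as the reduced result for a@example.com, so no finalize step is needed to repair it.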
Solution:
Emit a flag field (single in the code below) from map, overwrite it in reduce, then check it in finalize and take whatever action is required:
$map = new MongoCode("function() {
    var value = {
        time: this.time,
        email_id: this.email_id,
        single: 0
    };
    emit(this.email, value);
}");
$reduce = new MongoCode("function(k, vals) {
    // perform any required aggregation here
    return {
        time: vals[0].time,
        email_id: vals[0].email_id,
        single: 1
    };
}");
$finalize = new MongoCode("function(key, reducedVal) {
    // single is still 0 only when reduce was never called for this key
    if (reducedVal.single == 0) {
        reducedVal.time = 11111;
    }
    return reducedVal;
};");
Is this natural behavior of mapreduce in general?
Yes.
The reduce function combines documents with the same key into one document. If the map function emits a single document for a particular key (as is the case with key 3), the reduce function will not be called.