I am currently running some data analysis tests, and the first, really simple one has given me very strange results.
The idea is the following: from an internet access log (a col
Apparently, using the $group operator in the Aggregation Framework works well! :-)
The following JavaScript code gets the 10 most visited domains, with their visit counts, in 17m17s!
    db.NonFTP_Access_log.aggregate(
        { $group: {
            _id: "$domain",
            visits: { $sum: 1 }
        }},
        { $sort: { visits: -1 } },
        { $limit: 10 }
    ).result.forEach(printjson);
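To make it clearer what the three pipeline stages compute, here is a minimal plain JavaScript sketch of the same logic on an in-memory array. It is only an illustration and not how MongoDB executes the pipeline; the `accessLog` sample data and the `topDomains` helper are hypothetical names made up for this example.

```javascript
// Hypothetical sample documents; only the "domain" field matters here.
const accessLog = [
  { domain: "example.com" },
  { domain: "example.org" },
  { domain: "example.com" },
  { domain: "example.net" },
  { domain: "example.com" },
  { domain: "example.org" }
];

// Hypothetical helper mirroring $group, $sort and $limit in plain JavaScript.
function topDomains(docs, limit) {
  // $group: one counter per distinct domain, incremented once per document.
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.domain, (counts.get(doc.domain) || 0) + 1);
  }
  return Array.from(counts, ([domain, visits]) => ({ _id: domain, visits }))
    // $sort: descending by visit count.
    .sort((a, b) => b.visits - a.visits)
    // $limit: keep only the first `limit` entries.
    .slice(0, limit);
}

topDomains(accessLog, 10).forEach((d) => console.log(JSON.stringify(d)));
```

Each document is touched once and only one counter per distinct domain is kept, which is the same shape of work the `$group` stage with `{ $sum: 1 }` describes.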
Anyway, I still don't understand why the MapReduce alternative is so slow. I have opened the following question in the MongoDB JIRA.