How flexible is the aggregate function for output formatting in MongoDB?
Data format:
{
\"_id\" : ObjectId(\"506ffffd1900a47d802702a904\"),
MongoDB 2.6 made this a lot easier by introducing $map, which allows a simplier form of array transposition:
db.metrics.aggregate([
{ "$match": {
"array_serial": array,
"port_name": { "$in": ports},
"datetime": { "$gte": from, "$lte": to }
}},
{ "$group": {
"_id": "$port_name",
"data": {
"$push": {
"$map": {
"input": [0,1],
"as": "index",
"in": {
"$cond": [
{ "$eq": [ "$$index", 0 ] },
"$datetime",
"$metric"
]
}
}
}
},
"count": { "$sum": 1 }
}}
])
Where much like the approach with $unwind
, you supply an array as "input" to the map operation consisting of two values and then essentially replace those values with the field values you want via the $cond
operation.
This actually removes all the pipeline juggling required to transform the document as was required in previous releases and just leaves the actual aggregation to the job at hand, which is basically accumulating per "port_name" value, and the transformation to array is no longer a problem area.
Building arrays in the aggregation framework without $push and $addToSet is something that seems to be lacking. I've tried to get this to work before, and failed. It would be awesome if you could just do:
data : {$push: [$datetime, $metric]}
in the $group
, but that doesn't work.
Also, building "literal" objects like this doesn't work:
data : {$push: {literal:[$datetime, $metric]}}
or even data : {$push: {literal:$datetime}}
I hope they eventually come up with some better ways of massaging this sort of data.
Combining two fields into an array of values with the Aggregation Framework is possible, but definitely isn't as straightforward as it could be (at least as at MongoDB 2.2.0).
Here is an example:
db.metrics.aggregate(
// Find matching documents first (can take advantage of index)
{ $match : {
'array_serial' : array,
'port_name' : { $in : ports},
'datetime' : { $gte : from, $lte : to}
}},
// Project desired fields and add an extra $index for # of array elements
{ $project: {
port_name: 1,
datetime: 1,
metric: 1,
index: { $const:[0,1] }
}},
// Split into document stream based on $index
{ $unwind: '$index' },
// Re-group data using conditional to create array [$datetime, $metric]
{ $group: {
_id: { id: '$_id', port_name: '$port_name' },
data: {
$push: { $cond:[ {$eq:['$index', 0]}, '$datetime', '$metric'] }
},
}},
// Sort results
{ $sort: { _id:1 } },
// Final group by port_name with data array and count
{ $group: {
_id: '$_id.port_name',
data: { $push: '$data' },
count: { $sum: 1 }
}}
)