MongoDB - Aggregation Framework (Total Count)

后端未结

关注

 7  1861

When running a normal \"find\" query on MongoDB I can get the total result count (regardless of limit) by running \"count\" on the returned cursor. So, even if I limit to re

相关标签:

7条回答

独厮守ぢ

2020-12-31 08:19

There is a solution using push and slice: https://stackoverflow.com/a/39784851/4752635 (@emaniacs mentions it here as well).

But I prefer using 2 queries. Solution with pushing $$ROOT and using $slice runs into document memory limitation of 16MB for large collections. Also, for large collections two queries together seem to run faster than the one with $$ROOT pushing. You can run them in parallel as well, so you are limited only by the slower of the two queries (probably the one which sorts).

First for filtering and then grouping by ID to get number of filtered elements. Do not filter here, it is unnecessary.
Second query which filters, sorts and paginates.

I have settled with this solution using 2 queries and aggregation framework (note - I use node.js in this example):

var aggregation = [
  {
    // If you can match fields at the begining, match as many as early as possible.
    $match: {...}
  },
  {
    // Projection.
    $project: {...}
  },
  {
    // Some things you can match only after projection or grouping, so do it now.
    $match: {...}
  }
];


// Copy filtering elements from the pipeline - this is the same for both counting number of fileter elements and for pagination queries.
var aggregationPaginated = aggregation.slice(0);

// Count filtered elements.
aggregation.push(
  {
    $group: {
      _id: null,
      count: { $sum: 1 }
    }
  }
);

// Sort in pagination query.
aggregationPaginated.push(
  {
    $sort: sorting
  }
);

// Paginate.
aggregationPaginated.push(
  {
    $limit: skip + length
  },
  {
    $skip: skip
  }
);

// I use mongoose.

// Get total count.
model.count(function(errCount, totalCount) {
  // Count filtered.
  model.aggregate(aggregation)
  .allowDiskUse(true)
  .exec(
  function(errFind, documents) {
    if (errFind) {
      // Errors.
      res.status(503);
      return res.json({
        'success': false,
        'response': 'err_counting'
      });
    }
    else {
      // Number of filtered elements.
      var numFiltered = documents[0].count;

      // Filter, sort and pagiante.
      model.request.aggregate(aggregationPaginated)
      .allowDiskUse(true)
      .exec(
        function(errFindP, documentsP) {
          if (errFindP) {
            // Errors.
            res.status(503);
            return res.json({
              'success': false,
              'response': 'err_pagination'
            });
          }
          else {
            return res.json({
              'success': true,
              'recordsTotal': totalCount,
              'recordsFiltered': numFiltered,
              'response': documentsP
            });
          }
      });
    }
  });
});

0 讨论(0)

隐瞒了意图╮

2020-12-31 08:21

I get total count with aggregate().toArray().length

0 讨论(0)
发布评论:

提交评论
- 加载中...

感动是毒

2020-12-31 08:24

I got the same problem, and solved with $project, $slice and $$ROOT.

db.article.aggregate(
{ $group : {
    _id : '$author',
    posts : { $sum : 1 },
    articles: {$push: '$$ROOT'},
}},
{ $sort : { posts: -1 } },
{ $project: {total: '$posts', articles: {$slice: ['$articles', from, to]}},
).toArray(function(err, result){
    var articles = result[0].articles;
    var total = result[0].total;
});

You need to declare from and to variable.

https://docs.mongodb.com/manual/reference/operator/aggregation/slice/

0 讨论(0)

轻奢々

2020-12-31 08:36

$facets aggregation operation can be used for Mongo versions >= 3.4. This allows to fork at a particular stage of a pipeline in multiple sub-pipelines allowing in this case to build one sub pipeline to count the number of documents and another one for sorting, skipping, limiting.

This allows to avoid making same stages multiple times in multiple requests.

0 讨论(0)
发布评论:

提交评论
- 加载中...
面向向阳花

2020-12-31 08:36

in my case, we use $out stage to dump result set from aggeration into a temp/cache table, then count it. and, since we need to sort and paginate results, we add index on the temp table and save table name in session, remove the table on session closing/cache timeout.

0 讨论(0)
发布评论:

提交评论
- 加载中...
时光取名叫无心

2020-12-31 08:42

Assaf, there's going to be some enhancements to the aggregation framework in the near future that may allow you to do your calculations in one pass easily, but right now, it is best to perform your calculations by running two queries in parallel: one to aggregate the #posts for your top authors, and another aggregation to calculate the total posts for all authors. Also, note that if all you need to do is a count on documents, using the count function is a very efficient way of performing the calculation. MongoDB caches counts within btree indexes allowing for very quick counts on queries.

If these aggregations turn out to be slow there are a couple of strategies. First off, keep in mind that you want start the query with a $match if applicable to reduce the result set. $matches can also be speed up by indexes. Secondly, you can perform these calculations as pre-aggregations. Instead of possible running these aggregations every time a user accesses some part of your app, have the aggregations run periodically in the background and store the aggregations in a collection that contains pre-aggregated values. This way, your pages can simply query the pre-calculated values from this collection.

0 讨论(0)
发布评论:

提交评论
- 加载中...

1 2 下一页