I have a MongoDB aggregate pipeline that contains a number of steps (match on indexed fields, add fields, sort, collapse, sort again, page, project results.) If I comment ou
2019 ANSWER
This answer is for MongoDB 4.2
After reading the question and the discussion, I believe the issue is resolved, but optimization is still a common problem for everyone using MongoDB.
I faced the same problem, so here are my tips for query optimization.
Correct me if I'm wrong :)
1. Add index on collection
Indexes play a vital role in running queries quickly. They are data structures that store a small portion of the collection's data in a form that is easy to traverse, so MongoDB can find matching documents without scanning the whole collection.
You can create different types of indexes according to your needs (single field, compound, multikey, text, and so on). Learn more about indexes in the official MongoDB documentation.
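A minimal sketch of creating an index from the mongo shell. The orders collection and the status and uid fields are borrowed from the allowDiskUse example later in this answer, and createOrderIndexes is just an illustrative name; pick the fields your own $match and $sort stages actually use.

```javascript
// Hypothetical helper; pass the `db` handle that the mongo shell provides.
function createOrderIndexes(db) {
  // Compound index: status first, then uid.
  // 1 means ascending order for that key.
  return db.orders.createIndex({ status: 1, uid: 1 });
}

// In the mongo shell you would call: createOrderIndexes(db)
```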
2. Pipeline optimization
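Always create an index on the foreignField of a $lookup. Also, since $lookup produces an array, we generally $unwind it in the next stage. Keep that $unwind immediately after the $lookup, on the same "as" field: the optimizer then coalesces the two stages, so the large intermediate array is never built. You don't write this merged form yourself (unwinding is not a valid $lookup option); it is what the optimized stage looks like in the explain output:
{
  $lookup: {
    from: "Collection",
    as: "resultingArrays",
    localField: "x",
    foreignField: "y",
    unwinding: { preserveNullAndEmptyArrays: false }
  }
}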
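In the pipeline you write yourself, this comes down to placing a plain $unwind directly after the $lookup, on the same field the $lookup writes to. A sketch, assuming a hypothetical customers collection that orders reference through a customerId field:

```javascript
// Assumed schema: orders.customerId holds the _id of a customers document.
var lookupPipeline = [
  { $match: { status: "A" } },
  {
    $lookup: {
      from: "customers",
      localField: "customerId",
      foreignField: "_id", // _id is always indexed
      as: "customer"
    }
  },
  // Immediately after the $lookup and on its `as` field,
  // so the optimizer can merge the two stages.
  { $unwind: "$customer" }
];

// In the mongo shell: db.orders.aggregate(lookupPipeline)
```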
Use allowDiskUse for large aggregations. Each pipeline stage is limited to 100 MB of RAM; with allowDiskUse: true, a stage that needs more can write temporary files to the _tmp subdirectory of the dbPath directory instead of failing. For example:
db.orders.aggregate(
  [
    { $match: { status: "A" } },
    { $group: { _id: "$uid", total: { $sum: 1 } } },
    { $sort: { total: -1 } }
  ],
  {
    allowDiskUse: true
  }
)
3. Rebuild the indexes
If you create and drop indexes often, rebuild your indexes. In my experience this makes MongoDB throw away the query plans it has cached for the collection, so a stale plan stops winning over the plan you actually want. Believe me, that issue sucks :(
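Roughly, in the mongo shell this could look like the sketch below. refreshIndexes is a made-up name; reIndex() and getPlanCache().clear() are regular collection helpers. Keep in mind that reIndex() rebuilds every index on the collection and blocks it while doing so, so on a busy system clearing the plan cache alone may be the gentler first try.

```javascript
// Hypothetical helper; `coll` is a collection handle such as db.orders.
function refreshIndexes(coll, rebuild) {
  if (rebuild) {
    // Drops and rebuilds all indexes on the collection (blocking).
    coll.reIndex();
  }
  // Removes every cached query plan for this collection.
  coll.getPlanCache().clear();
}

// In the mongo shell: refreshIndexes(db.orders, false)
```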
4. Remove unwanted indexes
Every index has to be updated on each create, update, and delete operation, so too many indexes slow down writes. Removing the ones you don't use helps a lot.
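To find candidates for removal, the $indexStats aggregation stage reports how often each index has been used since its counters were last reset (for example by a restart). The sketch below filters that output; unusedIndexNames and toCount are illustrative names, and the counters are kept per server, so check every replica set member before dropping anything.

```javascript
// The shell returns accesses.ops as a NumberLong; plain numbers also work here.
function toCount(n) {
  return (n && typeof n.toNumber === "function") ? n.toNumber() : Number(n);
}

// Hypothetical helper: takes the documents returned by
//   db.orders.aggregate([{ $indexStats: {} }]).toArray()
// and returns the names of indexes that were never used.
function unusedIndexNames(stats) {
  return stats
    .filter(function (s) {
      // Never report the mandatory _id index.
      return s.name !== "_id_" && toCount(s.accesses.ops) === 0;
    })
    .map(function (s) { return s.name; });
}

// Drop one only after double-checking: db.orders.dropIndex("some_index_name")
```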
5. Limiting Documents
In a real-world scenario, fetching everything in the database rarely helps: you can't display it all, and the user can't read it all. So, instead of fetching the complete data set, fetch it in chunks (pages), which helps both you and the client viewing that data.
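With an aggregation, paging usually means a $skip and $limit pair after the $sort. A small sketch, where pageStages and its 1-based page argument are my own naming:

```javascript
// Hypothetical helper: builds the paging stages for a 1-based page number.
function pageStages(page, pageSize) {
  return [
    { $skip: (page - 1) * pageSize },
    { $limit: pageSize }
  ];
}

// Sort on a stable key before paging so pages don't overlap.
var pagedPipeline = [
  { $match: { status: "A" } },
  { $sort: { _id: 1 } }
].concat(pageStages(3, 20));

// In the mongo shell: db.orders.aggregate(pagedPipeline)
```

$skip still has to walk past the skipped documents, so very deep pages get slower.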
And lastly, looking at the execution plan MongoDB selects helps in figuring out the main issue. explain() will show you that plan.
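For an aggregation, one way to see the plan in the mongo shell is the explain() helper on the collection. A sketch (explainPipeline is an illustrative name):

```javascript
// Hypothetical helper; `coll` is a collection handle such as db.orders.
function explainPipeline(coll, pipeline) {
  // "executionStats" executes the chosen plan and adds execution statistics.
  return coll.explain("executionStats").aggregate(pipeline);
}

// In the mongo shell:
//   explainPipeline(db.orders, [{ $match: { status: "A" } }])
// then look for IXSCAN (index used) versus COLLSCAN (full collection scan).
```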
Hope this summary helps. Feel free to suggest new points if I missed any, and I will add them too.