What are the trade-offs of dropping a MongoDB collection vs. removing all of its documents (assuming the collection will be re-created immediately)?
The difference between removing and dropping is mostly an implementation detail.
Removing all documents requires a one-by-one update of the internal state for every document that exists in the collection.
Dropping a collection only requires freeing some large data structures inside the database's data files.
Dropping a collection is therefore vastly faster than removing documents one by one until the collection is empty.
Metadata such as indexes still exists after the documents are removed, but not after the collection is dropped.
Source: MongoDB University course
Once documents are stored in a collection, we can remove all of them in two ways. Which one to choose depends entirely on your requirements.
1. Using drop(): Invoking drop() on a collection removes all of its documents, deletes all of its indexes, and finally deletes the collection itself.
2. Using remove(): remove() has two forms. In the first, we pass criteria, and only the documents matching those criteria are removed. In the second, we pass no criteria (prior to version 2.6) or an empty document (version 2.6 and later), and all documents in the collection are removed. The second form is the one of interest here, since the intention is to clear every document from the collection.
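The two forms of remove() can be sketched in Python as follows. This is a minimal sketch, not code from the answer: it assumes a PyMongo-style collection object, where delete_many() is the current counterpart of the shell's remove(), and the helper names remove_matching and remove_all are hypothetical.

```python
def remove_matching(collection, criteria):
    """Remove only the documents that match `criteria`; return how many were deleted.

    `collection` is assumed to be a PyMongo-style collection (an assumption,
    not something stated in the answer above).
    """
    return collection.delete_many(criteria).deleted_count


def remove_all(collection):
    """Remove every document by passing an empty filter document.

    The collection itself and its indexes are left in place.
    """
    return collection.delete_many({}).deleted_count
```

With a real server, the collection would come from something like MongoClient()["test"]["users"] (database and collection names are placeholders). Neither helper touches the indexes, which is the main difference from drop().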
Remark: To remove all documents from a collection, it may be more efficient to use the drop() method to drop the entire collection, including the indexes, and then re-create the collection and rebuild the indexes.
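The drop-then-rebuild approach from the remark might look roughly like this in Python. It is a sketch under assumptions: db is taken to be a PyMongo-style database object, drop_and_recreate is a hypothetical helper name, and special index types (text indexes, for example) may need extra handling that is not shown here.

```python
def drop_and_recreate(db, name):
    """Drop db[name], then rebuild its secondary indexes on the new, empty collection.

    Assumes PyMongo-style objects: index_information() maps each index name to
    a dict holding "key" (a list of (field, direction) pairs) plus its options.
    """
    collection = db[name]
    saved = collection.index_information()  # capture index definitions before they vanish
    collection.drop()  # removes the documents, the indexes, and the collection itself

    fresh = db[name]
    for index_name, info in saved.items():
        if index_name == "_id_":
            continue  # the _id index is created automatically by MongoDB
        # Everything except the key spec and bookkeeping fields is passed back as an option.
        options = {k: v for k, v in info.items() if k not in ("key", "v", "ns")}
        fresh.create_index(info["key"], name=index_name, **options)
    return fresh
```

Saving the index definitions first matters because drop() discards them along with the data. If the indexes are already defined in application code, re-running that code after the drop would serve the same purpose.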
A benefit of simply dropping a collection is that it is much faster than removing all of a collection's documents. If your collection will be "re-created immediately" anyway (assuming that includes index re-creation), then this is probably the most attractive option.
The authors of the book MongoDB: The Definitive Guide (Kristina Chodorow and Michael Dirolf) ran an experiment in which a Python script timed a drop against a remove of 1,000,000 records. The results came in at 0.01 seconds for the drop and 46.08 seconds for the remove. While the exact times will differ based on hardware and other factors, the experiment nonetheless illustrates that the drop is significantly faster.
Reference: Chodorow, K., Dirolf, M. (2010). “MongoDB: The Definitive Guide.” O'Reilly Media, Inc., Sebastopol, CA, p. 25.
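A comparable timing experiment could be sketched as below. This is not the script from the book, which is not reproduced here. It assumes a PyMongo-style collection object and uses delete_many({}) in place of the older remove() call; the helper names timed, fill, and compare are hypothetical.

```python
import time


def timed(action):
    """Run a zero-argument callable and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def fill(collection, n_docs, batch_size=10_000):
    """Insert n_docs small documents in batches to keep memory use modest."""
    for start in range(0, n_docs, batch_size):
        stop = min(start + batch_size, n_docs)
        collection.insert_many([{"n": i} for i in range(start, stop)])


def compare(collection, n_docs=1_000_000):
    """Time removing every document versus dropping the collection."""
    fill(collection, n_docs)
    remove_seconds = timed(lambda: collection.delete_many({}))

    fill(collection, n_docs)  # refill so both strategies clear the same amount of data
    drop_seconds = timed(collection.drop)

    return {"remove": remove_seconds, "drop": drop_seconds}
```

Run it against a scratch collection only, since both steps destroy data. The absolute numbers will depend on hardware, storage engine, and the indexes present, so the ratio between the two timings is the interesting part.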
If you remove all the documents from a collection, you'll be doing a lot more work (freeing each document's storage, clearing the index entries that point to it, and so on). If you instead just drop the collection, MongoDB only has to reclaim the extents that the collection and its indexes use.
One other difference is that dropping the collection will also remove the collection's indexes.