When should I call ensureIndex? Before inserting a single record, after inserting a single record, or before calling find()?
Regards,
Johnny
You only need to do this once. Example:
db.table.insert({foo: 'bar'});
var foo = db.table.findOne({foo: 'bar'}); // => no index on foo yet: full collection scan
db.table.ensureIndex({foo: 1}); // build the index once
foo = db.table.findOne({foo: 'bar'}); // => answered via the index on foo
db.table.insert({foo: 'foo'}); // the index is updated automatically on insert
foo = db.table.findOne({foo: 'foo'}); // => answered via the index on foo
It doesn't matter, but you only have to do this once. If you want to batch insert a large amount of data into an empty collection, it is best to create the index after the inserts; otherwise it doesn't really matter.
If you have a collection that has millions of records and you are building multiple compound indexes with auto-indexing turned off, then you MUST make sure that ensureIndexes() is invoked well before your first find query, ideally synchronously, i.e. the query runs only after the ensureIndexes method has returned.
The mode (foreground vs. background) in which indexes are built adds extra complexity. Foreground mode locks the entire db while it is building the indexes, whereas background mode allows you to keep querying the db. However, building indexes in background mode takes longer.
So you must make sure that the indexes have been created successfully. You can use db.currentOp() to check the progress of ensureIndexes() while it is still creating indexes.
It seems my comment has been a little misunderstood, so I'll clarify. It doesn't really matter when you call it so long as it's called at some point before you call find() for the first time. In other words, it doesn't really matter when you create the index, as long as it's there before you expect to use it.
A common pattern that I've seen a lot is coding the ensureIndex at the same time (and in the same place) as the find() call. ensureIndex will check if the index exists and create it if it doesn't. There is undoubtedly some overhead (albeit very small) in calling ensureIndex before every call to find(), so it's preferable not to do this.
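To make the "check if it exists, create it if it doesn't" behaviour concrete, here is a minimal in-memory sketch in JavaScript. It is not the MongoDB driver: ToyCollection and its insert, ensureIndex and findOne methods are hypothetical stand-ins, and the sketch only models why repeated calls are harmless but still do a little work each time.

```javascript
// Toy model only: ToyCollection is a hypothetical stand-in, not a MongoDB API.
class ToyCollection {
  constructor() {
    this.docs = [];
    this.indexes = new Map(); // field name -> Map(value -> array of docs)
    this.indexBuilds = 0;     // how many times an index was actually built
  }

  insert(doc) {
    this.docs.push(doc);
    // Every existing index is kept up to date on insert.
    for (const [field, index] of this.indexes) {
      this._addToIndex(index, field, doc);
    }
  }

  _addToIndex(index, field, doc) {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }

  // Idempotent: builds the index only if it does not exist yet.
  ensureIndex(field) {
    if (this.indexes.has(field)) return false; // already there, nothing to build
    const index = new Map();
    for (const doc of this.docs) this._addToIndex(index, field, doc);
    this.indexes.set(field, index);
    this.indexBuilds += 1;
    return true;
  }

  // Uses the index when one exists, otherwise scans every document.
  findOne(field, value) {
    const index = this.indexes.get(field);
    if (index) {
      const hits = index.get(value);
      return hits ? hits[0] : null;
    }
    return this.docs.find((d) => d[field] === value) || null;
  }
}

const table = new ToyCollection();
table.insert({ foo: 'bar' });
const created = table.ensureIndex('foo');      // first call builds the index
const createdAgain = table.ensureIndex('foo'); // second call only does the existence check
table.insert({ foo: 'foo' });                  // index is maintained on insert
const found = table.findOne('foo', 'foo');     // looked up through the index
```

In this toy model the second ensureIndex call costs one existence check and nothing more, which mirrors the small per-call overhead mentioned above.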
I do call ensureIndex in code to simplify deployments and to avoid having to manage the db and codebase separately. The tradeoff of ease of deployment balances out the redundancy of subsequent calls to ensureIndex (for me).
I'd recommend calling ensureIndex once, when your application starts.
I typically put my ensureIndex() calls within an init block for the part of my application that manages communication with MongoDB. Also, I wrap those ensureIndex() calls in a check for the existence of a collection I know must exist for the application to function; this way, the ensureIndex() calls run only once, the first time the application is run against a specific MongoDB instance.
I've read elsewhere an opinion against putting ensureIndex() calls in application code, as other developers can mistakenly change them and alter the DB (the indexes), but wrapping it in a check for a collection's existence helps to guard against this.
Java MongoDB driver example:
DB db = mongo.getDB("databaseName");
Set<String> existingCollectionNames = db.getCollectionNames();
// init collections; ensureIndexes only if creating collection
// (let application set up the db if it's not already)
DBCollection coll = db.getCollection("collectionName");
if (!existingCollectionNames.contains("collectionName")) {
    // ensure indexes...
    coll.ensureIndex(BasicDBObjectBuilder.start().add("date", 1).get());
    // ...
}