To what extent are 'lost data' criticisms still valid of MongoDB? I'm referring to the following:
1. MongoDB issues writes in unsafe ways b
Note on Context:
This question was asked in 2012, but still sees traffic and votes to this day. The original answer was written specifically to refute a particular post that was popular at the time. Things have changed (and continue to change) massively since then: MongoDB has certainly become far more durable and reliable than it was in 2012, when even basic journaling was relatively new. I get downvotes and comments on this answer because people feel I don't address the current (for a given value of current) general answer to the titular question (rather than the detail): "are lost data criticisms still valid?". I have attempted to clarify in the updates below, but there is basically no perfect answer to this question; it depends on your perspective, what your expectations are or were, what version and configuration you are using, whether you feel upset about the default settings, etc.
Original Answer:
That particular post was debunked, point by point, by MongoDB's CTO and co-founder, Eliot Horowitz, here:
http://news.ycombinator.com/item?id=3202959
There is also a good summary here:
http://www.betabeat.com/2011/11/10/the-trolls-come-out-for-10gen/
The short version is, it looks like this was basically someone trolling for attention (successfully), with no solid evidence or corroboration. There have been genuine incidents in the past, which were dealt with as the product evolved (see the introduction of journaling in 1.8, for example) or as more specific bugs were found and fixed.
Disclaimer: I do work for MongoDB (formerly 10gen), and love the fact that philnate got here and refuted this independently first - that probably says more about the product than anything else :)
Update: August 19th 2013
I've seen quite a bit of activity on this answer recently, which I assume is related to the announcement of the bug in SERVER-10478 - it is most certainly an edge case, but I would still recommend that anyone using sharding with large documents upgrade ASAP to v2.2.6 or v2.4.6, which include the fix for this issue.
Update: March 24th 2017
I no longer work for MongoDB, but stand behind this answer nonetheless. Given that this answer continues to get up (and down) votes and receives a lot of views, I would like to point people at this post, which shows the progress MongoDB has made since this question was posed. The database now passes the Jepsen tests and has integrated them into its build process; plenty of far more mature systems do not pass. Anyone still beating the data loss drum in 2017 really hasn't been paying attention.
Update: May 24th 2020
Jepsen has re-analyzed MongoDB 4.2.6, given that MongoDB now offers "full ACID transactions". While the analysis gets quite technical in parts, I highly recommend reading it if data loss in MongoDB is a concern for you (I would recommend checking out any database you use that Jepsen tests; you might be surprised at their weak spots). The report summarizes the weaknesses in the default read and write concerns, discusses how reliable non-transactional reads and writes are with appropriate read and write concerns, addresses flaws in the documentation, and then goes into significant detail about the issues encountered when testing the new ACID transactions (and their associated read/write concerns).
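To make the "appropriate read and write concerns" point concrete, here is a minimal sketch using the PyMongo driver. The connection string, the replica set name "rs0", and the "mydb"/"events" names are illustrative assumptions on my part, not anything from the report:

    # Sketch only: assumes the PyMongo driver and a replica set named "rs0";
    # the connection string and database/collection names are placeholders.
    from pymongo import MongoClient
    from pymongo.read_concern import ReadConcern
    from pymongo.write_concern import WriteConcern

    client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")

    # Ask a majority of replica set members to acknowledge each write
    # (w="majority") and wait for it to reach the on-disk journal (j=True),
    # rather than relying on the defaults.
    events = client.mydb.get_collection(
        "events",
        write_concern=WriteConcern(w="majority", j=True),
        read_concern=ReadConcern("majority"),
    )

    events.insert_one({"type": "signup", "user": "alice"})

    # Reads through this handle only return majority-committed data.
    print(events.find_one({"user": "alice"}))

The trade-off is latency: every acknowledged write now waits on replication to a majority and a journal flush, which is precisely the durability-versus-speed decision the defaults would otherwise make for you.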
So, can you still lose data with MongoDB? Yes, especially with the default settings, but that is true of most databases. Things are vastly better than they were when this question was first answered, the capabilities for greater reliability and durability are there, and they seem to work (transactions aside). My advice is to learn the limitations of the configuration you operate, and then determine whether the data loss risk is acceptable for your product/business/use case.
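Since the 2020 update above centers on the new transactions, here is a similarly hedged sketch of the multi-document transaction API, again in PyMongo with placeholder names ("bank", "accounts"), pinning explicit concerns rather than trusting the defaults:

    # Sketch only: again assumes PyMongo against a replica set; the
    # "bank"/"accounts" names and balances are illustrative placeholders.
    from pymongo import MongoClient
    from pymongo.read_concern import ReadConcern
    from pymongo.write_concern import WriteConcern

    client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
    accounts = client.bank.accounts

    # Multi-document transactions run inside a session. Pinning read concern
    # "snapshot" and write concern "majority" avoids the weaker defaults the
    # Jepsen report warns about.
    with client.start_session() as session:
        with session.start_transaction(
            read_concern=ReadConcern("snapshot"),
            write_concern=WriteConcern(w="majority"),
        ):
            accounts.update_one(
                {"_id": "alice"}, {"$inc": {"balance": -100}}, session=session
            )
            accounts.update_one(
                {"_id": "bob"}, {"$inc": {"balance": 100}}, session=session
            )
    # Exiting the start_transaction block cleanly commits; an exception
    # aborts the whole transaction.

Even with these settings, the Jepsen report found problems in the transaction machinery itself, so treat this as a starting point for your own testing rather than a guarantee.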