To what extent are 'lost data' criticisms still valid of MongoDB?

没有蜡笔的小新 2021-01-30 00:17

To what extent are 'lost data' criticisms still valid of MongoDB? I'm referring to the following:

1. MongoDB issues writes in unsafe ways b

4 answers
  •  别那么骄傲
    2021-01-30 00:28

    I have never heard of such severe problems in recent versions. Keep in mind that MongoDB does not have decades of development behind it the way relational systems do. It may also be true that MongoDB offers fewer features for preventing data loss. But even with a relational system you can never be sure that you will never lose data. It depends heavily on your system configuration (with replication and manual backups you should be quite safe).

    As a general guideline for avoiding beta bugs or bugs from early versions, don't run fresh releases in production (there's a reason Debian is so popular for servers). If MongoDB suffered from such severe problems all the time, the list of users would be smaller: https://www.mongodb.com/community/deployments Additionally, I don't really trust this pastebin message. Why was it published anonymously? Is this person's company ashamed to admit that they used MongoDB, or do they fear 10gen? Where are the links to those bug reports (or did 10gen delete them from JIRA)?

    So let's briefly go through those points:

    1. Yes, MongoDB normally operates in fire-and-forget mode. But you can change this behavior with several options: https://docs.mongodb.com/manual/reference/command/getLastError/#dbcmd.getLastError. Just because MongoDB defaults to it doesn't mean you can't change it to suit your needs. You will have to live with lower performance if you don't fire and forget within your app, since you're adding a round trip.

      Update: Since version 2.6, the insert, update, save, and remove commands acknowledge the write by default.

    2. I have never heard of such problems, except those caused by the operator's own mistakes... but that can happen with relational systems as well. I guess this point only refers to master-slave replication. Replica sets are much newer and more stable. Some links showing that other DBMSs have lost data due to malfunctions as well: http://blog.lastinfirstout.net/2010/04/bit-by-bug-data-loss-running-oracle-on.html http://dbaspot.com/oracle-server/430465-parallel-cause-data-lost-another-oracle-bug.html http://bugs.mysql.com/bug.php?id=18014 (These links are not meant to favor any particular system or to imply anything beyond showing that other systems also have bugs that can cause data loss.)

    3. Yes, there is currently locking at the instance level. I don't think this is a global lock in a sharded environment; I think it applies at the instance level for each shard separately, since there is no need to lock other instances when no consistency checks are required. The upcoming version 2.2 will lock at the database level; tickets for collection-level and possibly extent- or document-level locking exist as well (https://jira.mongodb.org/browse/SERVER-4328). But locking at finer levels may affect MongoDB's performance, as lock management is expensive.

    4. Moving chunks shouldn't cause many problems, as rebalancing should take a few chunks from each node and move them to the new one. It should never cause chunks to ping-pong between nodes, nor does rebalancing start just because of a difference of one or two chunks. What can be problematic is a badly chosen shard key: you may end up with all new entries inserted on one node rather than spread across all of them. You would then see rebalancing more often, which can cause problems, but that would be due to your poorly chosen shard key rather than to MongoDB.

    5. Can't comment on this one

    6. I'm not 100% sure, but I think replica sets were introduced in 1.6. As mentioned earlier, never use the latest version in production unless you can live with data loss. As with every new feature, there is the possibility of bugs, and even extensive testing may not reveal all problems. Again, always run some manual backups, for God's sake, unless you can live with data loss.

    7. Can't comment on this. But in reality software may contain severe bugs. Many games suffer from such problems as well, and there are other areas where so-called banana software (software that ripens at the customer's site) is or was well known. I can't comment on anything concrete, as this was before my MongoDB time.

    8. Replication can cause such problems. Depending on the replication strategy, this may be a problem, and another system may be a better fit. But without a really write-intensive workload you may not encounter such problems. It may indeed be problematic to have 3 replicas polling changes from one master. You could cure the problem by adding more shards.
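
    Adding shards only helps if the shard key actually spreads the writes, which is the concern from point 4. The following is a minimal sketch of that effect, not MongoDB's real chunk logic: the chunk boundaries, the shard count, and the helper names (`shard_by_range`, `shard_by_hash`, `distribution`) are all made up for illustration, and hashing is simplified to "hash modulo number of shards".

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical toy cluster: 4 shards, with range chunks split at these key values.
CHUNK_BOUNDARIES = [250, 500, 750]
NUM_SHARDS = len(CHUNK_BOUNDARIES) + 1


def shard_by_range(key: int) -> int:
    """Range placement: every key above the last boundary lands on the last shard."""
    return bisect.bisect_right(CHUNK_BOUNDARIES, key)


def shard_by_hash(key: int) -> int:
    """Hash placement, simplified here to 'MD5 of the key modulo shard count'."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def distribution(keys, place) -> Counter:
    """Count how many of the given keys each shard would receive."""
    return Counter(place(k) for k in keys)


# New documents with an ever-increasing key (think of a counter or a timestamp).
new_keys = range(1000, 2000)
range_counts = distribution(new_keys, shard_by_range)
hash_counts = distribution(new_keys, shard_by_hash)

if __name__ == "__main__":
    print("range-based:", dict(range_counts))
    print("hash-based: ", dict(hash_counts))
```

    With the monotonically increasing key, every new insert goes to the last shard, so that shard's chunks would have to be split and moved over and over. With the hashed key, the same inserts are spread over all four shards.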

    As a general conclusion: yes, those problems may have existed, but MongoDB has done a lot in this direction, and I doubt that other DBMSs never had one problem or another themselves. Take traditional relational DBMSs: if they scaled well to web scale, there would be no need for systems like MongoDB, HBase, and the like. You can't get one system that fits all needs. So you have to live with the downsides of one, or try to build a combined system from several to get what you need.
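
    One concrete change in that direction is the acknowledged-write default from point 1. To show why it matters for "lost data", here is a small sketch that uses a toy in-memory server instead of a real MongoDB connection. `ToyServer`, `insert_fire_and_forget`, `insert_acknowledged`, and `run_demo` are hypothetical names invented for this example. In MongoDB terms the two modes correspond roughly to write concern `w: 0` versus `w: 1`.

```python
class ToyServer:
    """Hypothetical in-memory stand-in for a database server (not MongoDB)."""

    def __init__(self, reject_ids):
        self.stored = {}
        self.reject_ids = set(reject_ids)  # writes with these ids fail

    def write(self, doc):
        """Apply a write and return a reply that says whether it succeeded."""
        if doc["_id"] in self.reject_ids:
            return {"ok": 0, "err": "simulated write failure"}
        self.stored[doc["_id"]] = doc
        return {"ok": 1, "err": None}


def insert_fire_and_forget(server, doc):
    """Send the write and drop the reply: the caller never sees a failure."""
    server.write(doc)


def insert_acknowledged(server, doc):
    """Send the write, check the reply, and raise if the server reported an error."""
    reply = server.write(doc)
    if reply["ok"] != 1:
        raise RuntimeError(reply["err"])


def run_demo():
    docs = [{"_id": i} for i in (1, 2, 3)]

    # Fire and forget: the write for _id 2 is lost without any signal to the app.
    server = ToyServer(reject_ids={2})
    for doc in docs:
        insert_fire_and_forget(server, doc)
    silent_losses = [d["_id"] for d in docs if d["_id"] not in server.stored]

    # Acknowledged: the same failure surfaces as an exception the app can handle.
    server = ToyServer(reject_ids={2})
    reported_failures = []
    for doc in docs:
        try:
            insert_acknowledged(server, doc)
        except RuntimeError:
            reported_failures.append(doc["_id"])

    return silent_losses, reported_failures


if __name__ == "__main__":
    print(run_demo())
```

    In both runs the same write fails. The difference is only whether the application finds out, and the price of finding out is the extra round trip mentioned in point 1.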

    Disclaimer: I'm not affiliated with MongoDB or 10gen; I just work with MongoDB and am giving my opinion about it.
