Bulk insert performance in MongoDB for large collections

北海茫月 2020-12-15 21:48

I'm using the BulkWriteOperation (Java driver) to store data in large chunks. At first it seems to be working fine, but when the collection grows in size, the inserts can t

3 answers
  • 2020-12-15 22:07
    • Disk utilization & CPU: Check disk utilization and CPU usage to see whether either one is maxing out. It is most likely the disk that is causing this issue for you.

    • Mongo log: If a bulk write of 1000 documents is taking 10 seconds, check the mongo log to see whether a few inserts within that batch are taking most of the time. If there are, you can narrow your analysis down to those operations.

    Another thing that's not clear is the mix of operations running on your Mongo instance. Are inserts the only operation, or are other find queries running too? If other queries are running as well, you should look at scaling up whichever resource is maxing out.
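
Before digging through the mongo log, it can help to record how long each batch takes on the client side, so you know which batches to look for. The following is a minimal sketch, not part of the original answer: `BatchTimer` and its `writer` callback are hypothetical names, and the callback is only a stand-in for whatever bulk write call your code actually makes with the Java driver.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: times each batch handed to a writer and records the slow ones. */
final class BatchTimer<T> {

    /** One measurement: which batch it was, how many items it held, how long the write took. */
    static final class Sample {
        final int batchIndex;
        final int size;
        final long elapsedMillis;

        Sample(int batchIndex, int size, long elapsedMillis) {
            this.batchIndex = batchIndex;
            this.size = size;
            this.elapsedMillis = elapsedMillis;
        }
    }

    private final Consumer<List<T>> writer;   // assumption: wraps your real bulk write call
    private final long slowThresholdMillis;
    private final List<Sample> samples = new ArrayList<>();

    BatchTimer(Consumer<List<T>> writer, long slowThresholdMillis) {
        this.writer = writer;
        this.slowThresholdMillis = slowThresholdMillis;
    }

    /** Splits items into batches of at most batchSize, writes each one, and records its latency. */
    void writeAll(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            List<T> batch = items.subList(start, end);
            long t0 = System.nanoTime();
            writer.accept(batch);
            long elapsedMillis = (System.nanoTime() - t0) / 1_000_000;
            samples.add(new Sample(samples.size(), batch.size(), elapsedMillis));
        }
    }

    /** All measurements, in the order the batches were written. */
    List<Sample> samples() {
        return Collections.unmodifiableList(samples);
    }

    /** Only the batches whose write took at least the configured threshold. */
    List<Sample> slowBatches() {
        List<Sample> slow = new ArrayList<>();
        for (Sample s : samples) {
            if (s.elapsedMillis >= slowThresholdMillis) {
                slow.add(s);
            }
        }
        return slow;
    }
}
```

If the slow batches cluster at the end of the run, that matches the "gets slower as the collection grows" pattern; if they are scattered, competing queries or periodic disk flushes are the more likely suspects.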

  • 2020-12-15 22:18

    You believe that the indexing does not require any document reorganisation, and the way you described the index suggests that a right-handed index is OK. So indexing seems to be ruled out as an issue. You could of course, as suggested above, definitively rule it out by dropping the index and re-running your bulk writes.

    Aside from indexing, I would …

    • Consider whether your disk can keep up with the volume of data you are persisting. More details on this in the Mongo docs
    • Use profiling to understand what’s happening with your writes
  • 2020-12-15 22:24
    1. Do you have any indexes on your collection? If yes, each insert also has to spend time updating the index tree.
    2. Is the data time-series? If yes, favor updates over inserts. Please read this blog post, which suggests that in-place updates are more efficient than inserts (https://www.mongodb.com/blog/post/schema-design-for-time-series-data-in-mongodb)
    3. Are you able to set up sharded collections? If yes, that would reduce the time (I tested it on 3 sharded servers with 15 million IP geo entry records).
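
To make point 2 more concrete: the update-based approach groups many samples into one document per time bucket and sets a field inside it, rather than inserting one document per sample. The sketch below only computes where a sample would go. `MinuteBucket` is a hypothetical helper, and the one-document-per-minute layout with a `values.<second>` field is an assumption for illustration, so adjust it to your own schema. The resulting bucket start and field path would be the filter and the `$set` target of an upsert.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper: maps a sample timestamp to a per-minute bucket and a field inside it. */
final class MinuteBucket {
    final Instant bucketStart;  // identifies the bucket document (start of the minute, UTC)
    final String fieldPath;     // field to set inside that document, e.g. "values.37"

    private MinuteBucket(Instant bucketStart, String fieldPath) {
        this.bucketStart = bucketStart;
        this.fieldPath = fieldPath;
    }

    /** Truncates the timestamp to the minute and uses the second offset as the field name. */
    static MinuteBucket of(Instant timestamp) {
        Instant start = timestamp.truncatedTo(ChronoUnit.MINUTES);
        long second = ChronoUnit.SECONDS.between(start, timestamp);
        return new MinuteBucket(start, "values." + second);
    }
}
```

With this layout, 60 per-second samples touch a single document instead of creating 60 new ones, which is the trade-off the blog post argues for.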