How efficient is it to use an in-memory database to store millions of temporary values?

Submitted by 左心房为你撑大大i on 2019-12-06 18:42:28

The problem is sufficiently simple that you really need to just give it a go and see how the (performance) results work out.

You already have an implementation that just uses simple in-memory structures. Personally, given that even the cheapest computer from Dell comes with 1GB+ of RAM, I think you might as well stick with that. That aside, it should be fairly simple to whack in a database or two. I'd consider Sleepycat Berkeley DB (which is now owned by Oracle), because you don't need to use SQL and it should be quite efficient. (It does support Java.)

If the results are promising, I'd then consider further investigation, but this really should only take a few days' work at most, including the benchmarking.
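To get a baseline for that benchmarking, something like the following rough sketch could time the plain in-memory approach before any database is tried. The class name `InMemoryBaseline`, the deal-id-to-list layout, and the sizes are assumptions made for illustration, not details from the question, and a one-shot timing like this is only indicative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical baseline: deal id -> list of temporary values, all on the heap.
final class InMemoryBaseline {

    // Fills the map with synthetic data and returns the number of values stored.
    static long fill(Map<Long, List<Double>> store, int deals, int valuesPerDeal) {
        long count = 0;
        for (long dealId = 0; dealId < deals; dealId++) {
            List<Double> values = new ArrayList<>(valuesPerDeal);
            for (int i = 0; i < valuesPerDeal; i++) {
                values.add((double) (dealId * valuesPerDeal + i));
            }
            store.put(dealId, values);
            count += values.size();
        }
        return count;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long usedBefore = rt.totalMemory() - rt.freeMemory();
        long start = System.nanoTime();

        Map<Long, List<Double>> store = new HashMap<>();
        long stored = fill(store, 10_000, 100); // one million values (assumed size)

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        long usedAfter = rt.totalMemory() - rt.freeMemory();
        System.out.printf("stored %d values in %d ms, roughly %d MB of heap%n",
                stored, elapsedMs, (usedAfter - usedBefore) / (1024 * 1024));
    }
}
```

Running the same measurement against each candidate store, with realistic sizes, gives numbers that can actually be compared.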

A simple HashMap backed by Terracotta would do better and would allow you to store collections bigger than the JVM's virtual memory.

Embedded databases, especially the SQL-based ones, will add complexity and overhead to your code, so they aren't worth it. If you really need persistent storage with random access, try one of the NoSQL databases, such as CouchDB, Cassandra, or Neo4j.

I don't know whether it will be faster, so you'd have to try it. What I do want to recommend is to do batch inserts of an entire list when you don't immediately need that list anymore. Don't save value by value :)
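As a minimal sketch of what batching a whole list could look like with plain JDBC: the table `deal_values`, its columns, and the class name `BatchInsert` are hypothetical, and the batch size is something you would have to tune for your own database.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchInsert {

    // Hypothetical table: deal_values(deal_id BIGINT, value DOUBLE).
    static final String INSERT_SQL =
            "INSERT INTO deal_values (deal_id, value) VALUES (?, ?)";

    // Queues every value of one list and sends them in chunks of batchSize.
    // Returns how many times a batch was sent to the database.
    static int insertList(PreparedStatement stmt, long dealId,
                          List<Double> values, int batchSize) throws SQLException {
        int roundTrips = 0;
        int pending = 0;
        for (double value : values) {
            stmt.setLong(1, dealId);
            stmt.setDouble(2, value);
            stmt.addBatch();              // queue the row instead of executing it
            if (++pending == batchSize) {
                stmt.executeBatch();      // one call for the whole chunk
                roundTrips++;
                pending = 0;
            }
        }
        if (pending > 0) {                // send whatever is left over
            stmt.executeBatch();
            roundTrips++;
        }
        return roundTrips;
    }
}
```

With a real connection you would typically call `setAutoCommit(false)` first, create the statement with `prepareStatement(BatchInsert.INSERT_SQL)`, and `commit()` once after the list has been written, so the whole list costs a single transaction.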

If your final algorithm can be expressed in SQL, it might also be worth your while to do that, and not load all the Lists back in. In any case, don't put anything like an index or constraint on the values, and preferably don't allow NULL either (if possible). Maintaining indices and constraints costs time, and allowing NULL can also cost time or create overhead. The deal_ids can be (and are) indexed, of course, as they're primary keys.

This isn't very much but at least better than a single down-voted answer :)

There really is no reason at all to add an external component to make your program run slower. Compress the data block and write it to a file if you need to handle more than the available internal memory. A workstation can now take 192GB of RAM, so you can't afford to waste much time on it.
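One way the compress-and-spill idea could look, using only the JDK: a block of values is gzip-compressed on its way to a file and read back when needed. The class name `CompressedBlock` and the length-prefixed layout are assumptions for this sketch, and how well it compresses depends entirely on your data.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class CompressedBlock {

    // Writes the block as: int length, then each value as an 8-byte double,
    // with the whole stream gzip-compressed.
    static void write(Path file, double[] values) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
                new GZIPOutputStream(Files.newOutputStream(file))))) {
            out.writeInt(values.length);
            for (double value : values) {
                out.writeDouble(value);
            }
        }
    }

    // Reads back a block written by write().
    static double[] read(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(
                new GZIPInputStream(Files.newInputStream(file))))) {
            int length = in.readInt();
            double[] values = new double[length];
            for (int i = 0; i < length; i++) {
                values[i] = in.readDouble();
            }
            return values;
        }
    }
}
```

Once a block has been written, the in-memory copy can be dropped and reloaded with `read` only when the algorithm needs it again.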
