I have an algorithm which currently allocates a very large array of doubles, which it updates and searches frequently. The size of the array is N^2/2, where N is the number of
If you are starting to run out of available memory, then you will probably also soon start to run out of available array indexes: a Java array is bounded in size to Integer.MAX_VALUE
elements, which when using doubles as the array elements is "only" about 16GB in size.
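To make those limits concrete, here is a small back-of-envelope sketch (the N^2/2 sizing comes from your question; the class and method names are mine):

```java
public class ArraySizeLimits {
    // Rough memory footprint of an N^2/2 triangular array of doubles.
    static long bytesFor(long n) {
        return n * n / 2 * 8; // 8 bytes per double
    }

    public static void main(String[] args) {
        // A Java array holds at most Integer.MAX_VALUE elements,
        // i.e. (2^31 - 1) * 8 bytes of doubles.
        long maxArrayBytes = (long) Integer.MAX_VALUE * 8;
        System.out.println(maxArrayBytes); // 17179869176 (~16GB)

        // Largest N whose N^2/2 triangle still fits in one array:
        long maxN = (long) Math.sqrt(2.0 * Integer.MAX_VALUE);
        System.out.println(maxN); // 65535
    }
}
```

So once N passes roughly 65,535 you hit the index limit before you hit 32GB of RAM.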
Getting a machine with 32GB of memory is expensive, but probably not as expensive as your time to modify the algorithm, and all of the associated testing.
However, if the client is running up against the limits of memory and their datasets are still growing, then it makes sense to bite the bullet now and change the algorithm to use less memory at any given time, since they will likely soon outgrow an array anyway.
The other option that you have, assuming that the array is somewhat sparsely filled, is to use one of the various sparse array data structures, although these tend to only be beneficial if your array is less than 20% full.
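A minimal sketch of the sparse idea, using a plain HashMap keyed by a long index (the class is illustrative, not a specific library):

```java
import java.util.HashMap;
import java.util.Map;

// A sparse "array" of doubles: only non-default cells are stored, so memory
// use scales with the number of populated cells rather than with N^2/2.
class SparseDoubleArray {
    private final Map<Long, Double> cells = new HashMap<>();

    void set(long index, double value) {
        if (value == 0.0) cells.remove(index); // 0.0 is the implicit default
        else cells.put(index, value);
    }

    double get(long index) {
        return cells.getOrDefault(index, 0.0);
    }

    long populatedCount() {
        return cells.size();
    }
}
```

Note that each populated entry costs far more than 8 bytes (boxed Long and Double plus the map entry itself), which is exactly why sparse structures only pay off when the array is well under that ~20% fill mark.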
Edit: Since it seems that you have already investigated the alternatives, then the MappedByteBuffer may well be the way to go. Obviously this is going to have a performance impact, however if you do mostly sequential reads and writes from the array, then this should not be too bad. If you are doing random reads and writes, then this is going to get very slow very fast. Or very slow very slowly... depending on how you look at these things ;-)
If you're running on PCs, page sizes for mapped files are likely to be 4 kilobytes.
So once you start swapping the data out to disk, the real question becomes: "how random is my random access to the RAM-that-is-now-a-file"?
And (...can I, and if so...) how can I order the doubles to maximise the cases where doubles within a 4K page are accessed together, rather than a few at a time from each page before the next 4K disk fetch?
If you use standard IO, you probably still want to read and write in chunks, but those chunks could be smaller. Sectors will be at least 512 bytes and disk clusters bigger, but what size of read is best, given that there is a kernel round-trip overhead for each IO?
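As a sketch of the chunked approach: read a whole 4K page of doubles (512 of them) in one positioned read, rather than one double per IO. The class name and page size are assumptions for illustration:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Reads doubles from a file one 4K page at a time, so each kernel round trip
// fetches 512 doubles instead of one.
public class ChunkedDoubleReader {
    static final int PAGE_BYTES = 4096;
    static final int DOUBLES_PER_PAGE = PAGE_BYTES / Double.BYTES; // 512

    static double[] readPage(FileChannel ch, long pageIndex) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(PAGE_BYTES);
        ch.read(buf, pageIndex * PAGE_BYTES); // one positioned read per page
        buf.flip();
        double[] page = new double[buf.remaining() / Double.BYTES];
        buf.asDoubleBuffer().get(page);       // decode the whole page at once
        return page;
    }
}
```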
I'm sorry but I'm afraid your best next steps depend to a great extent on the algorithm and the data you are using.
Be aware that some operating systems have better support for memory mapping than others.
I would be tempted to do this:
You might find you have more control over performance that way - the -Xmx can be tweaked as desired.
I've had generally good experiences with Java's MappedByteBuffers, and encourage you to have a deeper look at them. They may well allow you to stop dealing with the -Xmx changes altogether. Be aware that if you need more than 2-4GB of addressable space, then a 64-bit CPU, OS and JVM are required.
To get beyond the Integer.MAX_VALUE indices issue you could write a paging algorithm, as I have done in a related answer to Binary search in a sorted (memory-mapped ?) file in Java.
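The gist of the paging idea is to present a long-indexed double "array" backed by several MappedByteBuffers, each mapping at most a fixed-size chunk of the file, so the total can exceed what a single buffer can address. This is a sketch under those assumptions (class name and 1GB chunk size are mine, not from the linked answer):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// A long-indexed array of doubles backed by a file, split across several
// MappedByteBuffers so the total size can exceed Integer.MAX_VALUE bytes.
class MappedDoubleArray implements AutoCloseable {
    private static final long CHUNK_BYTES = 1L << 30; // 1GB per mapping
    private final FileChannel channel;
    private final MappedByteBuffer[] chunks;

    MappedDoubleArray(Path file, long elementCount) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        long totalBytes = elementCount * Double.BYTES;
        int n = (int) ((totalBytes + CHUNK_BYTES - 1) / CHUNK_BYTES);
        chunks = new MappedByteBuffer[n];
        for (int i = 0; i < n; i++) {
            long offset = i * CHUNK_BYTES;
            long length = Math.min(CHUNK_BYTES, totalBytes - offset);
            chunks[i] = channel.map(MapMode.READ_WRITE, offset, length);
        }
    }

    double get(long index) {
        long byteOffset = index * Double.BYTES;
        return chunks[(int) (byteOffset / CHUNK_BYTES)]
                .getDouble((int) (byteOffset % CHUNK_BYTES));
    }

    void set(long index, double value) {
        long byteOffset = index * Double.BYTES;
        chunks[(int) (byteOffset / CHUNK_BYTES)]
                .putDouble((int) (byteOffset % CHUNK_BYTES), value);
    }

    @Override public void close() throws IOException {
        channel.close();
    }
}
```

One caveat to this sketch: an element whose 8 bytes would straddle a chunk boundary is not handled here, so in practice you would pick a chunk size that is a multiple of 8 (as 1GB is) so every double falls entirely within one chunk.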
You can try storing the array as rows in a database table and use stored procs to do updates and searches on it.
Another Idea:
Use a B-Tree as your array and keep some leaves on disk. Make sure to make the nodes of the B-Tree the size of a page, or a multiple of the page size.
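A quick back-of-envelope for that node sizing, assuming 8-byte double keys and 8-byte child file offsets (both assumptions for illustration):

```java
// Size a B-Tree node to fill one 4K page: a node with k keys has k+1
// children, so it occupies 8k + 8(k+1) bytes; solving 8k + 8(k+1) <= 4096
// gives k <= (4096 - 8) / 16.
public class BTreeNodeSizing {
    public static void main(String[] args) {
        int pageBytes = 4096;
        int keysPerNode = (pageBytes - 8) / 16;
        System.out.println(keysPerNode); // 255 keys per page-sized node
    }
}
```

So each disk fetch pulls in a node of roughly 255 keys, which is what makes the on-disk B-Tree competitive with raw paging.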
You are moving into the realm of writing software that makes the best use of a cache (as in the CPU's memory cache). This is hard to do right, and the "right" way to do it depends on how your algorithm is designed.
So, what does your program actually do algorithmically?