This is yet another "please tell me how to force the Java garbage collector to run" question. In our application, I believe we have good reasons for doing this.
Your problem is that you're running two applications with entirely different requirements and memory profiles in the same JVM.
Run the data analysis separately, in a non-user-facing process, so that the user-facing server remains constantly responsive. I assume the periodic analysis generates a summary or result data of some kind; make that available to end users by shipping it across to the user-facing server so it can be served from there, or else let your front end fetch it separately from the analysis server.
I found that the Java GC deals very poorly with a large number of objects (20-100 million). Your situation would have been even worse if those objects had actually remained alive, because GC performance would have been terrible even with nothing to actually collect.

The solution is to reduce the number of objects, not the total memory you are using. I would guess that your analysis phase uses collections and many primitive wrappers (`Integer`, `Long`, etc.). If that's the case, one solution is to switch to a primitive-collections library. I created one such library to solve a similar problem I encountered, where I ran a simulation with 100 million live objects for a long time. That library is called Banana; see the wiki for details.
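To illustrate why the object count matters (this is a generic sketch using plain arrays, not Banana's API), compare a boxed collection with a primitive array holding the same data:

```java
import java.util.ArrayList;
import java.util.List;

public class BoxedVsPrimitive {
    public static void main(String[] args) {
        int n = 1_000_000;

        // Boxed: each element is a separate Integer object on the heap,
        // so the GC has n objects to trace in addition to the list itself.
        List<Integer> boxed = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            boxed.add(i);
        }

        // Primitive: one array object regardless of n; the GC sees a
        // single allocation instead of a million.
        int[] primitive = new int[n];
        for (int i = 0; i < n; i++) {
            primitive[i] = i;
        }

        long boxedSum = 0, primitiveSum = 0;
        for (int v : boxed) boxedSum += v;
        for (int v : primitive) primitiveSum += v;
        System.out.println(boxedSum == primitiveSum); // prints true
    }
}
```

Both hold the same values, but the primitive version gives the collector one object to trace instead of a million, which is the whole point of a primitive-collections library.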
Consider using non-managed memory, i.e., `ByteBuffer`s in place of the byte arrays.
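A minimal sketch of what that looks like (buffer size here is arbitrary): a direct `ByteBuffer`'s storage lives outside the Java heap, so the collector never has to trace or copy its contents, only the small wrapper object.

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    public static void main(String[] args) {
        // Direct buffers are allocated in native memory, off the Java heap;
        // the GC manages only the ByteBuffer wrapper, not the 64 MB payload.
        ByteBuffer buffer = ByteBuffer.allocateDirect(64 * 1024 * 1024);

        // Read and write much as you would with an ordinary byte array.
        buffer.putLong(0, 42L);
        System.out.println(buffer.getLong(0)); // prints 42
    }
}
```

Note that direct buffers are comparatively expensive to allocate and release, so this works best when the buffers themselves are long-lived and reused between analysis runs.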
I can only offer a hack, which will need some tuning and may or may not work; I'd try the more sane solutions first. When you want to force the GC, do it by allocating a lot of memory. Do this so that the memory can be immediately reclaimed, but so that the whole allocation can't be optimized away (something like `sum += new byte[123456].hashCode()` should do). You'll need to find a reliable method for determining when to stop. An object with a finalizer might tell you, or watching `Runtime.getRuntime().freeMemory()` could help.
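The hack above might be sketched like this (the stopping condition and iteration bound are guesses that would need tuning per heap configuration, and nothing guarantees a collection actually runs):

```java
public class GcNudge {
    // Sink so the JIT can't prove the allocations are dead and elide them.
    static long sum;

    public static void nudgeGc() {
        Runtime rt = Runtime.getRuntime();
        long before = rt.freeMemory();
        // Churn out immediately-unreachable garbage until free memory rises
        // above the starting point (suggesting a collection happened), or
        // give up after a bounded number of attempts.
        for (int i = 0; i < 10_000 && rt.freeMemory() <= before; i++) {
            sum += new byte[123456].hashCode();
        }
    }

    public static void main(String[] args) {
        nudgeGc();
        System.out.println("done");
    }
}
```

Treat this strictly as a last resort: the JVM may satisfy the allocations by expanding the heap rather than collecting, which is exactly why the stopping condition needs tuning.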
Rather than answer your question directly (I can't), I'd like to offer a possible alternative.
It sounds like you are allocating a large number of large byte arrays during your analysis run, and then allowing them to be garbage collected at the end of the run (or attempting to force them to be garbage collected just before the next run).
Instead, if possible, try managing your own pool of byte arrays, so that, in the best case, you allocate all of the needed arrays once when the application first starts; they then live for the lifetime of the application and never need to be garbage collected.
This idea can, of course, be extended to more complex data structures and object instances.
This is all quite a bit more work than just allocating memory when you need it and 'freeing' it when you don't, but it should cut down considerably on the work the garbage collector needs to do.
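A minimal sketch of such a pool (class and method names here are illustrative, not from the questioner's code):

```java
import java.util.ArrayDeque;

public class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;

    public BufferPool(int bufferCount, int bufferSize) {
        this.bufferSize = bufferSize;
        // Allocate everything up front; these arrays live for the life of
        // the application, so the GC never has to reclaim them.
        for (int i = 0; i < bufferCount; i++) {
            free.push(new byte[bufferSize]);
        }
    }

    public byte[] acquire() {
        byte[] buf = free.poll();
        // Fall back to a fresh allocation if the pool is exhausted.
        return buf != null ? buf : new byte[bufferSize];
    }

    public void release(byte[] buf) {
        free.push(buf); // caller must not touch buf after releasing it
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(4, 1024);
        byte[] a = pool.acquire();
        pool.release(a);
        byte[] b = pool.acquire();
        System.out.println(a == b); // prints true: the same array is reused
    }
}
```

The usual caveats of manual memory management apply: a use-after-release bug now corrupts data silently instead of being caught by the type system, so keep the acquire/release discipline as localized as you can.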