I have a java program that is a typical machine learning algorithm, updating the values for some parameters by some equations:
for (int iter=0; iter<1000; ite
Giving Java (or any garbage collecting language) too much memory has an adverse effect on performance. The live (referenced) objects become increasing sparse in memory resulting in more frequent fetches from main memory. Note that in the examples you've shown us the faster windows is doing more quick and full GC than Linux - but GC cycles (especially full gcs) are usually bad for performance.
If running the training set does not take an exceptionally long time, then try benchmarking at different memory allocations.
A more radical solution, but one which should have a big impact is to eliminate (or reduce as much as possible) object creation within the loop by recycling objects in pools.