How to create a Linux cluster for running physics simulations in java?

后端 未结 8 539
花落未央
花落未央 2021-02-03 15:07

I am developing a scientific application used to perform physical simulations. The algorithms used are O(n3), so for a large set of data it takes a very long time to process.

相关标签:
8条回答
  • 2021-02-03 15:14

    I would very highly recommend the Java Parallel Processing Framework especially since your computations are already independant. I did a good bit of work with this undergraduate and it works very well. The work of doing the implementation is already done for you so I think this is a good way to achieve the goal in "number 2."

    http://www.jppf.org/

    0 讨论(0)
  • 2021-02-03 15:15

    Simplest way to distribute computing on a Linux cluster is to use MPI. I'd suggest you download and look at MPICH2. It's free. their home page is here

    If your simulations are completely independent, you don't need most of the features of MPI. You might have to write a few lines of C to interface with MPI and kick off execution of your script or Java program.

    0 讨论(0)
  • 2021-02-03 15:15

    You should check out Hazelcast, simplest peer2peer (no centralized server) clustering solution for Java. Try Hazelcast Distributed ExecutorService for executing your code on the cluster.

    Regards,

    -talip

    0 讨论(0)
  • 2021-02-03 15:15

    You already suggested it, but disqualified it: Multi cores. You could go for multi core, if you had enough cores. One hot topic atm is GPGPU computing. Esp. NVIDIAs CUDA is a very priomising approach if you have many independent task which have to do the same computation. A GTX 280 delivers you 280 cores, which can compute up to 1120 - 15360 threads simultanously . A pair of them could solve your problem. If its really implementable depends on your algorithm (data flow vs. control flow), because all scalar processors operate in a SIMD fashion.

    Drawback: it would be C/C++, not java

    0 讨论(0)
  • 2021-02-03 15:19

    How optimized are your algorithms? Are you using native BLAS libraries? You can get about an order of magnitude performance gain by switching from naive libraries to optimized ones. Some, like ATLAS will also automatically spread the calculations over multiple CPUs on a system, so that covers bullet 1 automatically.

    AFAIK clusters usually aren't treated as a single entity. They are usually treated as separate nodes and programmed with stuff like MPI and SCALAPACK to distribute the elements of matrices onto multiple nodes. This doesn't really help you all that much if your data set fits in memory on one node anyways.

    0 讨论(0)
  • 2021-02-03 15:22

    Ten years ago, the company I worked for looked at a similar virtualization solution, and Sun, Digital and HP all supported it at the time, but only with state-of-the-art supercomputers with hardware hotswap and the like. Since then, I heard Linux supports the type of virtualization you're looking for for solution #3, but I've never used it myself.

    Java primitives and performance

    However, if you do matrix calculations you'd want to do them in native code, not in Java (assuming you're using Java primitives). Especially cache misses are very costly, and interleaving in your arrays will kill performance. Non-interleaved chunks of memory in your matrices and native code will get you most of the speedup without additional hardware.

    0 讨论(0)
提交回复
热议问题