I am developing a scientific application that performs physical simulations. The algorithms are O(n^3), so large data sets take a very long time to process.
Have you looked at Terracotta?
For work distribution you'll want to use the Master/Worker framework.
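To make the Master/Worker idea concrete, here is a minimal sketch in plain Java. It uses only java.util.concurrent on a single JVM, not Terracotta's own API, so treat it as an illustration of the pattern rather than of the framework itself. The class name MasterWorkerSketch, the processSlice kernel, and the choice to split the work by outer-loop index are all assumptions made for the example; your simulation's real O(n^3) kernel would take the place of processSlice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MasterWorkerSketch {

    // Hypothetical stand-in for one unit of work: the two inner loops
    // of an O(n^3) kernel for a fixed outer index i.
    static long processSlice(int i, int n) {
        long sum = 0;
        for (int j = 0; j < n; j++) {
            for (int k = 0; k < n; k++) {
                sum += i + j + k;
            }
        }
        return sum;
    }

    // Master: creates one work item per outer index, hands the items to a
    // pool of worker threads, then combines the partial results.
    static long run(int n, int workers) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int slice = i;
                Callable<Long> task = () -> processSlice(slice, n);
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // blocks until that worker has finished
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(run(200, workers));
    }
}
```

The slices here are independent of each other, which is what makes the pattern apply. With a clustering layer such as Terracotta, the same split would be spread across several JVMs instead of the threads of one, but the master's job (partition, dispatch, combine) stays the same.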