Background of the Problem: I'm trying to write a puzzle solution algorithm that takes advantage of multi-core processors and parallel processing. However, the ideal/easiest s
This type of problem reminds me of genetic algorithms. You already have a fitness function (the cost), and the layout of the problem seems suited to crossover and mutation. You could use one of the available GA engines and run multiple pools/generations in parallel. GAs tend to find good solutions quite fast, although finding the absolute best solution is not guaranteed. On the other hand, I believe the puzzle you describe does not necessarily have a single optimal solution anyway. GAs are often used for scheduling (for example, to create a roster of teachers, classrooms and classes). The solutions found are usually 'robust' in the sense that, when a constraint changes, a reasonable solution accommodating the change can often be found with a minimal number of modifications.
As for parallelizing the given recursive algorithm: I tried this recently (using Terracotta) for the n-Queens problem and did something similar to what you describe. The first-row queen is placed in each possible column to create n subproblems. There is a pool of worker threads. A job scheduler checks whether an idle worker thread is available in the pool and assigns it a subproblem. The worker thread works through the subproblem, outputs all solutions it finds, and returns to the idle state. Because there are typically far fewer worker threads than subproblems, it is not a big issue if subproblems don't take equal amounts of time to solve.
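For what it's worth, the scheme above can be sketched with a plain java.util.concurrent thread pool instead of Terracotta. This is only a minimal sketch under my own assumptions: the names ParallelQueens, solve and countSolutions are made up for illustration, it counts solutions rather than printing them, and the executor's internal queue stands in for the job scheduler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration, not the original Terracotta code.
class ParallelQueens {

    // Sequential backtracking: counts the ways to complete a partial placement.
    // The bitmasks mark columns and diagonals attacked in the current row.
    static long solve(int n, int cols, int diagLeft, int diagRight) {
        int full = (1 << n) - 1;
        if (cols == full) {
            return 1; // every column holds a queen, so every row does too
        }
        long count = 0;
        int free = full & ~(cols | diagLeft | diagRight);
        while (free != 0) {
            int bit = free & -free; // lowest free column
            free -= bit;
            count += solve(n, cols | bit,
                    ((diagLeft | bit) << 1) & full, (diagRight | bit) >> 1);
        }
        return count;
    }

    // One subproblem per first-row column; idle workers pick up the next one.
    static long countSolutions(final int n, int workers) {
        final int full = (1 << n) - 1;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int col = 0; col < n; col++) {
                final int bit = 1 << col;
                Callable<Long> job = () -> solve(n, bit, (bit << 1) & full, bit >> 1);
                results.add(pool.submit(job));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that subproblem is done
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countSolutions(8, 4));
    }
}
```

With 8 subproblems and 4 workers, a worker that finishes early simply takes the next queued subproblem, which is why uneven subproblem sizes matter little.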
I'm curious to hear other ideas.
You could use a Monte Carlo approach and run the trials in parallel. Add some randomness to the selection of which piece to take next, subject to the constraints.
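Since the actual puzzle isn't spelled out here, the following is just a rough sketch of that idea using n-Queens as a stand-in (my assumption, as are all the names such as RandomizedQueens and parallelSearch). Each worker repeatedly builds a placement by picking a random non-attacked column per row and restarts on a dead end; the first worker to succeed wins, via ExecutorService.invokeAny. It assumes a board size that has a solution (n = 1 or n >= 4), otherwise the search never ends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of parallel randomized restarts.
class RandomizedQueens {

    // One random attempt: column index per row, or null on a dead end.
    static int[] attempt(int n, Random rng) {
        int full = (1 << n) - 1;
        int[] placement = new int[n];
        int cols = 0, diagLeft = 0, diagRight = 0;
        for (int row = 0; row < n; row++) {
            int free = full & ~(cols | diagLeft | diagRight);
            if (free == 0) {
                return null; // no legal column left in this row
            }
            // Pick one of the free columns uniformly at random.
            int pick = rng.nextInt(Integer.bitCount(free));
            int bit = 0;
            for (int k = 0; k <= pick; k++) {
                bit = free & -free;
                free -= bit;
            }
            placement[row] = Integer.numberOfTrailingZeros(bit);
            cols |= bit;
            diagLeft = ((diagLeft | bit) << 1) & full;
            diagRight = (diagRight | bit) >> 1;
        }
        return placement;
    }

    // Restarts until an attempt succeeds or the thread is cancelled.
    static int[] search(int n, long seed) throws InterruptedException {
        Random rng = new Random(seed);
        while (!Thread.currentThread().isInterrupted()) {
            int[] placement = attempt(n, rng);
            if (placement != null) {
                return placement;
            }
        }
        throw new InterruptedException("cancelled");
    }

    // Runs one independently seeded search per worker; first result wins.
    static int[] parallelSearch(final int n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<int[]>> jobs = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                final long seed = 1000L + i; // arbitrary, distinct seeds
                jobs.add(() -> search(n, seed));
            }
            return pool.invokeAny(jobs);
        } catch (Exception e) {
            throw new IllegalStateException("search failed", e);
        } finally {
            pool.shutdownNow(); // interrupt the workers that lost the race
        }
    }

    // Checks that no two queens share a column or a diagonal.
    static boolean isValid(int[] p) {
        for (int a = 0; a < p.length; a++) {
            for (int b = a + 1; b < p.length; b++) {
                if (p[a] == p[b] || Math.abs(p[a] - p[b]) == b - a) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Unlike the exhaustive split, this finds one solution rather than all of them, and the workers need no coordination beyond "stop when someone has finished".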
JSR-166Y is intended to facilitate the implementation of parallel recursion in Java 7 by taking care of thread coordination. You may find their discussions, code, and papers (especially Doug Lea's paper A Java Fork/Join Framework) useful.
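The fork/join classes from that work are available in java.util.concurrent as of Java 7 (ForkJoinPool, RecursiveTask). As a hedged sketch of what parallel recursion looks like with them, here is n-Queens again: the top rows of the search tree are split into tasks and everything below runs sequentially. The class name ForkJoinQueens and the FORK_DEPTH cutoff of 2 are my own illustrative choices, not anything prescribed by the framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical illustration of recursive decomposition with fork/join.
class ForkJoinQueens extends RecursiveTask<Long> {

    // Assumed cutoff: rows above this depth become tasks, the rest is sequential.
    private static final int FORK_DEPTH = 2;

    private final int n, row, cols, diagLeft, diagRight;

    ForkJoinQueens(int n, int row, int cols, int diagLeft, int diagRight) {
        this.n = n;
        this.row = row;
        this.cols = cols;
        this.diagLeft = diagLeft;
        this.diagRight = diagRight;
    }

    // Plain sequential backtracking over bitmasks of attacked squares.
    static long solve(int n, int cols, int diagLeft, int diagRight) {
        int full = (1 << n) - 1;
        if (cols == full) {
            return 1;
        }
        long count = 0;
        int free = full & ~(cols | diagLeft | diagRight);
        while (free != 0) {
            int bit = free & -free;
            free -= bit;
            count += solve(n, cols | bit,
                    ((diagLeft | bit) << 1) & full, (diagRight | bit) >> 1);
        }
        return count;
    }

    @Override
    protected Long compute() {
        int full = (1 << n) - 1;
        if (row >= FORK_DEPTH || cols == full) {
            return solve(n, cols, diagLeft, diagRight);
        }
        // Create one child task per legal column in this row.
        List<ForkJoinQueens> children = new ArrayList<>();
        int free = full & ~(cols | diagLeft | diagRight);
        while (free != 0) {
            int bit = free & -free;
            free -= bit;
            children.add(new ForkJoinQueens(n, row + 1, cols | bit,
                    ((diagLeft | bit) << 1) & full, (diagRight | bit) >> 1));
        }
        invokeAll(children); // runs the children in the pool and waits for them
        long count = 0;
        for (ForkJoinQueens child : children) {
            count += child.join();
        }
        return count;
    }

    static long countSolutions(int n) {
        ForkJoinPool pool = new ForkJoinPool(); // defaults to one thread per core
        try {
            return pool.invoke(new ForkJoinQueens(n, 0, 0, 0, 0));
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool's work stealing does the load balancing here, so no explicit scheduler or idle check is needed; tuning the cutoff trades task overhead against how evenly the work spreads.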