- Yes, I've observed this behavior before, and usually after countless hours of tweaking JVM parameters it starts working.
- Garbage collection, especially in multithreaded situations, is nondeterministic. Pinning down a bug in nondeterministic code can be a challenge. But you could try DTrace if you are using Solaris, and there are a lot of JVM options for peering into HotSpot (GC logging with `-verbose:gc` is the usual starting point).
- Go on Scala IRC and see if Ismael Juma (ijuma) is hanging around. He's helped me before, but I think real in-depth help requires paying for it.
- I think most people doing this kind of stuff accept that they either need to be JVM tuning experts, have one on staff, or hire a consultant. There are people who specialize in JVM tuning.
To solve these problems, I think you need to be able to replicate them in a controlled environment where you can precisely duplicate runs with different tuning parameters and/or code changes. If you can't do that, hiring an expert probably isn't going to do you any good, and the cheapest way out of the problem is probably buying more RAM.
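
For what it's worth, here is a minimal sketch of what a repeatable run could look like. It is only an illustration, not your actual workload: the class name `GcRepro`, the `runWorkload` method, the seed, and the allocation sizes are all made up for the example. The only real API it leans on is the standard `java.lang.management.GarbageCollectorMXBean`. The idea is to drive a fixed-seed allocation pattern and print how many collections happened and how long they took, so the same run can be compared under different JVM flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical harness: replace runWorkload with the code you are actually tuning.
class GcRepro {

    /** Total collections across all collectors; undefined (-1) values are skipped. */
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds, skipping undefined values. */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) total += millis;
        }
        return total;
    }

    /**
     * Stand-in workload (an assumption for this sketch). The allocation pattern
     * depends only on the seed, so two runs with the same arguments allocate the
     * same objects in the same order and return the same checksum.
     */
    public static long runWorkload(long seed, int iterations) {
        Random rng = new Random(seed);
        List<byte[]> retained = new ArrayList<>();
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] chunk = new byte[1024 + rng.nextInt(64 * 1024)];
            checksum += chunk.length;
            // Keep roughly one chunk in eight alive so some objects outlive a young collection.
            if (rng.nextInt(8) == 0) retained.add(chunk);
            // Bound the retained set so the heap requirement stays small and predictable.
            if (retained.size() > 256) retained.subList(0, 128).clear();
        }
        return checksum + retained.size();
    }

    public static void main(String[] args) {
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 20_000;
        long gcCountBefore = totalGcCount();
        long gcTimeBefore = totalGcTimeMillis();
        long start = System.nanoTime();

        long checksum = runWorkload(42L, iterations);

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("checksum=%d elapsedMs=%d gcCount=%d gcTimeMs=%d%n",
                checksum, elapsedMs,
                totalGcCount() - gcCountBefore,
                totalGcTimeMillis() - gcTimeBefore);
    }
}
```

You would then run the same class under each configuration you want to compare, for example `java -Xmx256m -XX:+UseSerialGC GcRepro` against `java -Xmx256m -XX:+UseG1GC GcRepro`, and repeat each several times. The checksum should match across runs, which tells you the workload itself was identical; the GC counts and times will still wobble from run to run, so compare distributions rather than single numbers.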