Does anyone know of a tool that can help me figure out why we are seeing runaway CPU in a managed app?
What I am not looking for:
Use the Managed debugger. It has helped me before, and it only needs a few files. You can probably just see what is happening (perhaps exception handling stuck in a loop).
It does sound like you need a real profiler, but I thought I'd just throw this out there: PerfMon. It comes with Windows, and you can set up a PerfMon profile to send to the user, who can then capture a log and send it back to you.
Here are a couple of links I keep around for whenever I need a PerfMon refresher: an article in TechNet Magazine from 2008 and a post from the Advanced .NET Debugging blog.
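Once the user sends the log back, you still have to find the interesting moments in it. Here is a minimal sketch of that step in Python, assuming the log was saved as (or converted to) CSV with the timestamp in the first column and one column per counter path. The function name, the sample data, the host and process names, and the 90% threshold are all made up for illustration.

```python
import csv
import io


def high_cpu_intervals(csv_text, counter_suffix=r"\% Processor Time", threshold=90.0):
    """Return (timestamp, value) pairs where the matching counter is at or above threshold."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Column 0 is the timestamp; pick the first counter column with the wanted suffix.
    col = next((i for i, name in enumerate(header)
                if i > 0 and name.endswith(counter_suffix)), None)
    if col is None:
        raise ValueError("no column ends with " + counter_suffix)
    hits = []
    for row in reader:
        try:
            value = float(row[col])
        except (ValueError, IndexError):
            continue  # skip blank or missing samples
        if value >= threshold:
            hits.append((row[0], value))
    return hits


# Hypothetical sample log; HOST and MyApp are placeholders.
SAMPLE = r'''"(PDH-CSV 4.0) (Pacific Standard Time)(480)","\\HOST\Process(MyApp)\% Processor Time"
"03/01/2010 10:00:00.000"," "
"03/01/2010 10:00:05.000","12.5"
"03/01/2010 10:00:10.000","97.3"
'''

print(high_cpu_intervals(SAMPLE))
```

The timestamps it returns tell you when to look at the other counters in the same log, which is usually more useful than the CPU number by itself.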
The worse a problem is, the easier it is to find by this technique.
There is a tool you can get, called Stackshot, that might help in your case. Look here and here.
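To make the "worse is easier" point concrete, here is a rough sketch in Python. I'm assuming the technique means grabbing a handful of call-stack snapshots of the busy process and seeing which frames keep showing up. The frame names and samples below are invented, and this is not Stackshot's output format, just a list of strings per snapshot.

```python
from collections import Counter


def frame_frequencies(samples):
    """Fraction of stack samples in which each frame appears at least once."""
    counts = Counter()
    for stack in samples:
        counts.update(set(stack))  # count a frame once per sample, even if it recurses
    total = len(samples)
    return {frame: n / total for frame, n in counts.items()} if total else {}


# Hypothetical snapshots, outermost frame first.
SAMPLES = [
    ["Main", "ProcessQueue", "HandleItem", "LogError"],
    ["Main", "ProcessQueue", "HandleItem", "LogError"],
    ["Main", "ProcessQueue", "HandleItem", "ParseItem"],
    ["Main", "ProcessQueue", "HandleItem", "LogError"],
    ["Main", "Idle"],
]

freqs = frame_frequencies(SAMPLES)
for frame, share in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{frame}: {share:.0%}")
```

A frame that is on the stack in most of the snapshots is where the time is going. If the runaway code eats nearly all the CPU, nearly every snapshot will land in it, so even five samples are enough to spot it.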
I think you should look at memory and disk usage as well. If a machine runs out of physical memory and has to start paging to virtual memory (on the disk drive), you'll see a spike in CPU and disk activity. Under those conditions, what looks like a CPU bottleneck is actually a memory bottleneck.
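As a rough illustration of that check, here is a small Python sketch that looks at one sample of three readings together: CPU percentage, available memory, and paging rate (in PerfMon these would come from counters such as `\Memory\Available MBytes` and `\Memory\Pages/sec`). The function name and every threshold are assumptions picked for the example, not recommended values.

```python
def likely_bottleneck(cpu_percent, available_mb, pages_per_sec,
                      cpu_high=85.0, low_memory_mb=100.0, paging_high=500.0):
    """Guess whether high CPU in one sample is really a symptom of memory pressure.

    All thresholds are illustrative defaults and should be tuned per machine.
    """
    if cpu_percent < cpu_high:
        return "cpu not saturated"
    # High CPU together with little free memory and heavy paging points at memory.
    if available_mb < low_memory_mb and pages_per_sec > paging_high:
        return "memory pressure (paging)"
    return "cpu-bound"


print(likely_bottleneck(cpu_percent=95.0, available_mb=40.0, pages_per_sec=2000.0))
print(likely_bottleneck(cpu_percent=95.0, available_mb=3000.0, pages_per_sec=5.0))
```

The point is only that the CPU reading should not be judged alone. If memory is nearly gone and the paging rate is high at the same moments the CPU spikes, fix the memory problem first.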