For Windows, check out the free Xperf that ships with the Windows SDK. It uses sampled profile, has some useful UI, & does not require instrumentation. Quite useful for tracking down performance problems. You can answer questions like:
Who is using the most CPU? Drill down to function name using call stacks.
Who is allocating the most memory?
Outstanding memory allocations (leaks)
Who is doing the most registry queries?
Disk writes? etc.