I know Microsoft .NET uses the CLR as a JIT compiler while Java has HotSpot. What are the differences between them?
They are very different beasts. As people have pointed out, the CLR compiles a piece of MSIL to machine code before it executes it. In addition to typical optimizations such as dead-code elimination and inlining of private methods, this allows it to take advantage of the particular CPU architecture of the target machine (though I'm not sure whether it actually does). It also incurs a compilation hit for each class (though the compiler is fairly fast and many platform libraries are just a thin layer over the Win32 API).
The HotSpot VM takes a different approach. It assumes that most code is executed rarely, so it is not worth spending time compiling it. All bytecode starts out in interpreted mode. The VM keeps statistics at call sites and tries to identify methods that are called more than a predefined number of times. It then compiles only those methods with a fast JIT compiler (C1) and swaps in the compiled version while the method is running (that's the special sauce of HS). After the C1-compiled method has been invoked some more times, the same method is recompiled with a slower but more sophisticated compiler (C2), and the code is swapped again on the fly.
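To make the counting-and-promotion idea concrete, here is a toy model in Java. It is only a sketch of the bookkeeping, not how HotSpot is implemented: the class names (`TieredSketch`, `HotMethod`), the `Tier` enum and both thresholds are made up for illustration, and nothing is actually compiled.

```java
import java.util.function.IntUnaryOperator;

public class TieredSketch {
    // Hypothetical thresholds, chosen only to keep the demo short.
    // The real VM uses much larger, tunable values.
    static final int C1_THRESHOLD = 3;
    static final int C2_THRESHOLD = 6;

    enum Tier { INTERPRETED, C1, C2 }

    // Toy stand-in for a method plus its per-method invocation counter.
    static final class HotMethod {
        private final IntUnaryOperator body;
        private int invocations = 0;
        private Tier tier = Tier.INTERPRETED;

        HotMethod(IntUnaryOperator body) {
            this.body = body;
        }

        int invoke(int arg) {
            invocations++;
            // Promote the method once it has been called often enough.
            if (tier == Tier.INTERPRETED && invocations > C1_THRESHOLD) {
                tier = Tier.C1;
            } else if (tier == Tier.C1 && invocations > C2_THRESHOLD) {
                tier = Tier.C2;
            }
            // The result is the same in every tier; only the speed would differ.
            return body.applyAsInt(arg);
        }

        Tier tier() {
            return tier;
        }
    }

    public static void main(String[] args) {
        HotMethod square = new HotMethod(x -> x * x);
        for (int call = 1; call <= 8; call++) {
            square.invoke(call);
            System.out.println("call " + call + " -> " + square.tier());
        }
    }
}
```

If you want to watch the real thing rather than a model, running a program with the HotSpot flag `-XX:+PrintCompilation` logs methods as they get compiled.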
Since HotSpot can swap methods while they are running, the VM compilers can perform some speculative optimizations that are unsafe in statically compiled code. A canonical example is static dispatch / inlining of monomorphic calls (a polymorphic method with only one implementation). This is done if the VM sees that the call always resolves to the same target. What used to be a complex invocation is reduced to a guard of a few CPU instructions, which modern CPUs predict and pipeline well. When the guard condition stops being true, the VM can take a different code path or even drop back to interpreted mode. Based on statistics and program workload, the generated machine code can be different at different times. Many of these optimizations rely on information gathered during program execution and are not possible if you compile once when you load the class.
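Written out by hand in Java, the guarded call looks roughly like the sketch below. This is only an illustration of the idea: the VM does this at the machine-code level, and its slow path is deoptimization rather than an ordinary virtual call. The names `GuardSketch`, `Shape`, `Circle` and `Square` are invented for the example.

```java
public class GuardSketch {
    interface Shape {
        double area();
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override public double area() { return Math.PI * radius * radius; }
    }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    // What the source code says: an ordinary virtual call.
    static double virtualCall(Shape s) {
        return s.area();
    }

    // Hand-written picture of the speculative version, assuming the
    // profile showed that only Circle ever reached this call site.
    static double guardedCall(Shape s) {
        if (s.getClass() == Circle.class) {       // the guard: a cheap type check
            Circle c = (Circle) s;
            return Math.PI * c.radius * c.radius; // inlined body of Circle.area()
        }
        // Guard failed: stand-in for the VM's fallback (it would deoptimize).
        return s.area();
    }

    public static void main(String[] args) {
        System.out.println(guardedCall(new Circle(1.0))); // fast path
        System.out.println(guardedCall(new Square(2.0))); // fallback path
    }
}
```

Both methods always return the same result; the point is that the guarded version avoids the dispatch as long as the speculation holds.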
This is why you need to warm up the JVM and emulate a realistic workload when you benchmark algorithms (skewed data can lead to an unrealistic assessment of the optimizations). Other optimizations include lock elision, adaptive spin-locking, escape analysis and stack allocation, etc.
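A minimal sketch of what warming up means in practice is shown below. The class name `WarmupSketch`, the workload and the iteration counts are arbitrary choices for illustration, and the timings will vary from machine to machine. For serious measurements a harness such as JMH is the usual choice; hand-rolled timing like this only shows the shape of the problem.

```java
public class WarmupSketch {
    // Small workload to measure (hypothetical example method).
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Times a single call in nanoseconds.
    static long timeNanos(int n) {
        long start = System.nanoTime();
        long result = sumOfSquares(n);
        long elapsed = System.nanoTime() - start;
        // Use the result so the call cannot be treated as dead code.
        if (result < 0) {
            throw new IllegalStateException("unexpected overflow");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // First call: most likely still running in the interpreter.
        System.out.println("cold: " + timeNanos(1_000) + " ns");

        // Warm-up: call the method often enough for the JIT to kick in.
        long sink = 0;
        for (int i = 0; i < 20_000; i++) {
            sink += sumOfSquares(1_000);
        }

        // Same call again, now probably running compiled code.
        System.out.println("warm: " + timeNanos(1_000) + " ns");
        System.out.println("(ignore) " + sink);
    }
}
```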
That said, HotSpot is only one of the VMs. JRockit, Azul, IBM's J9 and the Resettable RVM all have different performance profiles.