Question
I read a book which referred to the .NET CLR as a virtual machine. Can anyone justify this? What is the reason we need the concept of virtual machines on some development platforms?
Isn't it possible to develop a native framework [one without a virtual machine] that is fully object-oriented and as powerful as .NET?
The book that refers to the CLR as a virtual machine is "Professional .NET Framework 2.0".
Answer 1:
There are a lot of misconceptions here. I suppose you could think of .NET as a virtual machine if you really wanted to, but let's look at how the .NET Framework really handles your code. The typical scenario looks like this (a short C# sketch after the list illustrates a few of these steps):
1. You write a .NET program in C#, VB.Net, F#, or some other compatible language.
2. That code is compiled down to an Intermediate Language (IL), which is similar to Java's bytecode, and that IL is what gets distributed to end-user machines.
3. An end user invokes the program for the first time on a computer with the right version of .NET installed.
4. The computer sees this is a .NET assembly rather than "raw" machine code, and passes it off to the JIT compiler.
5. The JIT compiler compiles the IL to fully-native machine code.
6. The native code is saved in memory for the life of this program execution.
7. The saved native code is invoked, and the IL no longer matters.
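As a rough, hypothetical illustration of steps 2, 5, and 7 (the class and method names are made up for this sketch), the following C# reads a method's IL back out of the assembly's metadata and then asks the runtime to JIT it ahead of the first call:

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

class JitDemo
{
    static int Add(int a, int b) => a + b;

    static void Main()
    {
        MethodInfo add = typeof(JitDemo).GetMethod(
            "Add", BindingFlags.NonPublic | BindingFlags.Static);

        // Step 2: the C# compiler already turned Add into IL, which can be
        // read back out of the assembly's metadata at run time.
        byte[] il = add.GetMethodBody().GetILAsByteArray();
        Console.WriteLine($"Add is stored as {il.Length} bytes of IL");

        // Step 5: asking the runtime to "prepare" the method forces the JIT
        // to emit native machine code for it before the first call.
        RuntimeHelpers.PrepareMethod(add.MethodHandle);

        // Steps 6-7: from here on, calls go straight to the saved native code.
        Console.WriteLine(Add(2, 3));
    }
}
```

Running it prints the IL size and then the result of the already-JIT-compiled Add; at no point is the IL interpreted.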
There are a couple of important points here, but the big one is that at no point is any code ever interpreted. Instead, you can see in step 5 that it is compiled to native code. This is a huge difference from loading it into a virtual machine, for several reasons:
- The fully-compiled code is executed directly by the CPU rather than interpreted or translated by an additional software abstraction layer, which should be faster.
- The JIT compiler can take advantage of optimizations specific to the individual machine running the program, rather than settling for a lowest common denominator.
- If you want, you can even pre-compile the code and, in essence, hide step 5 from the user completely.
I suppose you could call this a virtual machine, in the sense that the JIT compiler abstracts away the details of the real machine from the developer. Personally, I don't think that's really right, because to many people a virtual machine implies a runtime abstraction away from native code that, for .NET programs, just doesn't exist.
One other key point about this whole process that really sets it apart from a "virtual machine" environment is that it's only the typical process. If you really want to, you can pre-compile a .NET assembly before distribution and deploy native code directly to end users (hint: it's slower in aggregate over the life of the program, because you lose machine-specific optimizations). Of course, you still need the .NET runtime installed, but at this point it's really not much different from any other runtime API; it's more like a collection of DLLs with a nice API you can link against, as you might have with the VB or C runtimes Microsoft also ships with Visual Studio. This kind of takes the IL out of the picture, making the VM moniker much harder to justify. (I say "kind of" because the IL is still deployed and used to verify the saved code, but it is never itself touched for execution.)
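For the classic .NET Framework, that pre-compilation step is done with the Native Image Generator (ngen). A minimal sketch, run from a Developer Command Prompt on the target machine and assuming a hypothetical assembly named MyApp.exe:

```
ngen install MyApp.exe
ngen display MyApp
```

`install` compiles the assembly's IL to a native image and stores it in the machine-wide native image cache, so step 5 never runs on the user's machine; `display` lists the cached image, and `ngen uninstall MyApp` removes it. (On modern .NET, publishing ReadyToRun images plays a similar role.)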
One other telling point is the lack of a VM process. When you run your app, there is no common "sandbox" process hosting it. Compare this with Java, where if you open the task manager while a program is running you will see a process specifically for the Java VM, and your application's code actually runs inside that host process created by the VM. In .NET, you see the application's own process in the Windows Task Manager directly.
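A quick way to see this for yourself (a small sketch; the class name is made up): a .NET console app can ask which OS process it is running in, and the answer is its own executable, not a shared VM host.

```csharp
using System;
using System.Diagnostics;

class ProcessDemo
{
    static void Main()
    {
        // The running .NET program is its own OS process, not a thread
        // inside a shared "java.exe"-style host; this prints the same
        // process name Task Manager shows for the app.
        Process me = Process.GetCurrentProcess();
        Console.WriteLine($"{me.ProcessName} (PID {me.Id})");
    }
}
```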
In summary: you could say that IL + CLR + JIT together somehow make up a virtual machine. Personally I don't think so, but I won't argue with you if you believe that. The point I want to make is that when you tell someone that .Net runs in a virtual machine with no further explanation, the idea you are communicating to that person is "interpreted bytecode in a host process." And that's just wrong.
Answer 2:
Similar to the Java Virtual Machine (JVM), the .NET CLR is a bytecode-interpreting virtual machine.
The JVM interprets programs which contain Java bytecodes, and the .NET CLR interprets programs which contain what Microsoft calls "Intermediate Language (IL)" instructions. There are differences between these bytecodes, but the virtual machines are similar and aspire to provide similar features.
Both of these virtual machine implementations have the ability to compile their input bytecode to the machine language of the computer they are running on. This is called "Just-In-Time compilation (JIT)" and the output produced is called "JIT code." Because the JIT code contains sequences of instructions in the machine language of the computer's CPU, this code is sometimes referred to as "native" code.
However, JIT code is qualitatively and quantitatively different from native code, as explained below. For that reason, this answer considers JIT code to be nothing more than a native implementation of the virtual machine while it runs a particular bytecode program.
One feature that both these virtual machines (VMs) aspire to provide is security, in the form of preventing certain hazardous programming errors. For example, the name of this site, Stack Overflow, is inspired by one such type of hazardous error that is possible in native code.
In order to provide safety and execution security, the VMs enforce type safety at the "virtual machine level". Values stored in VM memory carry the type of data held in that memory location; for example, if an integer is pushed onto the stack, it is not possible to pop a double off of it. C-style "unions" are prohibited, and so are pointers and direct access to memory.
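In C# terms (a small hypothetical sketch; the class name is invented), that prohibition shows up as the `unsafe` keyword: pointer code is rejected unless you explicitly opt out of verifiable, type-safe IL and compile with the unsafe switch enabled.

```csharp
class SafetyDemo
{
    static void Main()
    {
        int value = 42;

        // int* p = &value;   // compile error in ordinary C#: pointers
        //                    // may only be used in an unsafe context

        // Only inside an 'unsafe' block, and only when the project opts in
        // (e.g. <AllowUnsafeBlocks>true</AllowUnsafeBlocks>), is raw
        // pointer access allowed -- and that code is no longer verifiable.
        unsafe
        {
            int* p = &value;
            System.Console.WriteLine(*p);
        }
    }
}
```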
We could not get the same benefits simply by enforcing an object-oriented language framework on developers if the result were a native binary such as an EXE file. In that case, we would not be able to distinguish between native binaries generated using the framework and EXEs generated by a malicious user from sources outside the framework.
In the case of the VMs, type safety is enforced at the lowest level the programmer is allowed to access (neglecting for a moment that it is possible to mix in native code, as described next). Therefore, no user should encounter an application that performs one of the hazardous operations requiring direct access to memory locations and pointers.
In practice, the .NET CLR provides a way to write native code that can be called from .NET "managed" code. In that case, the burden is on the native code's author not to make any of those pointer and memory mistakes.
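The usual mechanism for this is P/Invoke. A minimal sketch, assuming a Windows machine (the class name is made up; `GetTickCount64` is a real Win32 export in kernel32.dll):

```csharp
using System;
using System.Runtime.InteropServices;

class NativeInteropDemo
{
    // Managed code declares the native entry point; inside the native
    // function none of the CLR's safety checks apply, so correctness
    // there is entirely the native author's responsibility.
    [DllImport("kernel32.dll")]
    static extern ulong GetTickCount64();

    static void Main()
    {
        Console.WriteLine($"System uptime: {GetTickCount64()} ms");
    }
}
```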
As both the JVM and the .NET CLR perform JIT compilation, either VM actually creates a natively-compiled binary from the bytecode supplied. This "JIT code" runs more quickly than the VM's interpreter would, but the machine language the JIT produces still contains all the safety checks the VM would otherwise perform itself. As a result, the JIT output is not as fast as native code, which would ordinarily not contain numerous run-time checks. However, this loss of speed is exchanged for an improvement in reliability and security: use of uninitialized storage is prevented, type safety of assignments is enforced, range checking is performed (so stack- and heap-based buffer overflows are prevented), object lifetimes are managed by garbage collection, and dynamic allocation is type-safe. An environment that performs such run-time behavior checks is implementing the specification of a virtual machine and is little more than a machine-language realization of a virtual machine.
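One of those checks, range checking, is easy to see from C# (a small hypothetical sketch): an out-of-bounds write that native code would let scribble over adjacent memory is turned into an exception by the JIT-emitted check.

```csharp
using System;

class BoundsCheckDemo
{
    static void Main()
    {
        int[] buffer = new int[4];
        int index = 10;                 // well past the end of the array

        try
        {
            // The JIT emits a bounds check before this store, so instead of
            // a silent buffer overflow the runtime raises an exception.
            buffer[index] = 1;
        }
        catch (IndexOutOfRangeException ex)
        {
            Console.WriteLine($"Caught: {ex.Message}");
        }
    }
}
```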
Answer 3:
The "Virtual Machine" part refers to the fact that .NET code is compiled into EXE's and DLL's as "Intermediate" Assembly language (IL) to run on a virtual machine, as opposed to real CPU assembly language. Then, at runtime the ILM is converted into real CPU assembly for execution (referred to as Just-in-time, or JIT compiling).
Sure, you could write a .NET compiler so that code is compiled into CPU assembly language instead of IL. However, this would not be portable to all CPUs - you'd have to compile a different version for each OS/CPU pair. By compiling into IL, you let the "virtual machine" handle the CPU- and OS-specific stuff.
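As a small sketch of what letting the VM handle the CPU- and OS-specific stuff means in practice (assuming .NET Core/.NET 5+ or .NET Framework 4.7.1+, where `RuntimeInformation` is available; the class name is invented), the very same IL assembly can report whatever platform it happens to be JIT-compiled on:

```csharp
using System;
using System.Runtime.InteropServices;

class PortabilityDemo
{
    static void Main()
    {
        // The same IL file can be copied to an x64 or Arm64 box, Windows or
        // Linux; the local JIT produces machine code for whatever it finds.
        Console.WriteLine(RuntimeInformation.OSDescription);
        Console.WriteLine(RuntimeInformation.ProcessArchitecture);
        Console.WriteLine(RuntimeInformation.FrameworkDescription);
    }
}
```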
Answer 4:
I am a bit old school, so I call the CLR a virtual machine as well. My reasoning is that the CLR assembles machine code from an intermediate bytecode, which is what a virtual machine also does.
The benefits of the CLR are mainly due to the way it assembles the machine code, making use of runtime type information.
You can develop a native framework as powerful as the .NET Framework using just native types. The only flexibility you lose is the ability to re-generate the native code if you ever transport your program to another platform without recompiling.
Answer 5:
The advantage of the CLR is the freedom to write code in whatever programming language the developer chooses, since that code is compiled down to IL before being JIT-compiled into native code. The .NET Framework uses this JIT compilation to treat every language uniformly and to output programs that work for the platform they are deployed on, which is absent from traditionally compiled languages.
Answer 6:
Neither the JVM nor the CLR does anything that is materially different from what most "virtual machines" for other languages also do. These days, they all use JIT compilation to convert virtual instructions (p-code, bytecodes, intermediate language instructions, call them whatever you like) into "native CPU hardware" instructions ("machine code").
In fact, the first "virtual machine" to do this was the Smalltalk virtual machine. The author of that innovation, Peter Deutsch, dubbed it "dynamic translation" rather than "JIT," the term popularized by Java. If the Smalltalk "runtime execution environment" is going to be called a "virtual machine" (and that is what it is still called), then any and all other "run time systems" that do essentially the same thing also qualify as "virtual machines."
Answer 7:
You've got many valuable answers, but I think one thing hasn't been mentioned yet: Modularity.
It's quite hard to export an OO class from a native DLL. Sure, you can tell the linker to export the class and import it somewhere else, but this is brittle: changing a single private member of a class will break binary compatibility, i.e. if you change one DLL without recompiling all the other modules, your program will crash horribly at runtime.
There are some ways around this: For example, you can define public abstract interfaces, derive from those and export global factory functions from your DLL. That way, you can change implementation details of a class. But you can't derive from that class in another DLL. And changing the interface also breaks binary compatibility, of course.
I'm not sure there is a good solution for this in native code: if the compiler/linker creates native code at compile time, then it must know the exact memory layout of the classes/structures used in the code. If the last compilation step (generating native code) is delayed until a method is called for the first time, this problem simply goes away: you can modify a class in an assembly, and as long as the JIT can resolve all the used members at runtime, everything will run fine.
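A hypothetical sketch of that point (all names invented, and the two "builds" of the library shown side by side in one file purely for illustration): in a real deployment they would be successive builds of the same Library.dll, and an App.exe compiled against the first build would keep working against the second without recompilation, because the JIT resolves field offsets and method slots only when the code first runs.

```csharp
// Library.dll, build 1 -- the version App.exe was originally compiled against.
public class Logger
{
    public void Write(string message) =>
        System.Console.WriteLine(message);
}

// Library.dll, build 2 -- a private field and a new public member are added.
// App.exe's IL refers to Logger.Write only by name and signature, so swapping
// this build in does not require recompiling App.exe; the JIT lays out the
// class when it is first used at run time.
public class LoggerBuild2
{
    private int _count;                     // new private state
    public int Count => _count;             // new public member

    public void Write(string message)
    {
        _count++;
        System.Console.WriteLine($"[{_count}] {message}");
    }
}

// App.exe -- compiled against build 1; it never needs to know the layout.
public static class App
{
    public static void Main() => new Logger().Write("hello");
}
```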
In a nutshell: If you create a monolithic single-executable program, you could probably have most of the powerful features of .NET with a compiler that creates native code. But the disadvantages of having a JIT compiler (framework installation, slightly longer startup times) really don't outweigh the benefits in most cases.
Source: https://stackoverflow.com/questions/1564348/is-the-clr-a-virtual-machine