From what I have found, there are two major ways to implement a process VM:
- stack-based, such as JVM, CLR, etc.
- or register-based, such as Lua, Dalvik, etc.
The register-based approach mimics the architecture of physical processors. For the stack-based approach, though, a stack is only one of many data structures that could have been chosen.
I think the choice mainly depends on how we want to store and fetch operands. So why choose a stack? Why not a queue-based VM, or some other structure such as a linked list?
StackOverflow is really not for opinion-based surveying; it's likely to be closed.
However, it's key to realise that processor-specific architectures generally use registers because that corresponds to how the processors work. Each instruction set has a fixed set of registers available, and different architectures have different numbers of them. Furthermore, some registers have specific meanings based on the platform or ABI.
Generic assembly code is therefore difficult to port between platforms; where it can be ported, it has to target the lowest common denominator and so misses optimisations available on richer instruction sets. One of the advantages of 64-bit processors (x86-64 in particular, with 16 general-purpose registers against 8 on 32-bit x86) isn't so much the extra width as the increased number of registers available in the ISA.
There are two ways of solving this: either assume an infinite register set and map it onto real registers when the code is translated for a specific architecture (e.g. LLVM IR uses an unbounded set of virtual registers, which are then mapped onto the registers of the real ISA), or use a stack. One advantage of using a stack is that you don't need to deal with the special case of spilling registers when you run out (e.g. you have a function that would like 10 registers but your ISA only has 5). Stacks are very good at representing an unbounded number of entries (modulo the amount of available memory, of course).
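To make the spilling case concrete, here is a minimal sketch of mapping virtual registers onto a fixed register file. The function name `assign_locations` and the first-come-first-served policy are assumptions made for illustration; real allocators (LLVM's included) use liveness information and are far more sophisticated.

```python
def assign_locations(virtual_regs, num_physical):
    """Map each virtual register to a physical register or a spill slot.

    Naive policy (an assumption for illustration): the first
    `num_physical` distinct virtual registers get real registers,
    everything after that is spilled to memory.
    """
    locations = {}
    regs_used = 0
    spills_used = 0
    for v in virtual_regs:
        if v in locations:
            continue  # already placed
        if regs_used < num_physical:
            locations[v] = ("reg", regs_used)
            regs_used += 1
        else:
            locations[v] = ("spill", spills_used)
            spills_used += 1
    return locations


# The case from the text: a function that wants 10 registers on an ISA with 5.
wanted = [f"v{i}" for i in range(10)]
locations = assign_locations(wanted, num_physical=5)
spilled = [v for v, (kind, _) in locations.items() if kind == "spill"]
```

With these inputs, `v0` to `v4` land in registers and `v5` to `v9` are spilled. A stack-based bytecode avoids describing this step at all, since the translator decides later where each value lives.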
That said, (real) stacks are slower because they live in memory: by using one you are effectively ignoring the real registers. So typically registers are used for speed, and stacks are used for things that don't fit in the registers.
Anyway - the VM codes use a stack because instructions like `push` and `pop` only deal with single values, whereas register-based instructions typically use bit fields in the opcode encoding to indicate which of the N registers to use. So defining a stack-based ISA gives you a fully flexible, effectively infinite set of data points, and the interpreter/JIT can then translate those to specific registers on demand - in effect, doing at run time what LLVM does at compile time. This allows Java programs running on 64-bit systems to pick up the larger register set automatically, without being recompiled from the 32-bit version.
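The difference in encoding can be sketched with two toy interpreters. Both instruction sets below are hypothetical (they are not JVM, CLR, Lua or Dalvik bytecode), and the register count of 4 is an arbitrary assumption; the point is only that stack instructions carry no operand locations, while register instructions must name them.

```python
def run_stack(program):
    """Interpret stack code: operands are implicit (the top of the stack)."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def run_register(program, num_regs=4):
    """Interpret register code: every instruction names its registers."""
    regs = [0] * num_regs
    for op, dst, a, b in program:
        if op == "loadi":
            regs[dst] = a  # `a` is an immediate value here, `b` is unused
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs[0]  # convention assumed here: result is left in r0


# (2 + 3) * 4 in both styles
STACK_CODE = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
REGISTER_CODE = [
    ("loadi", 0, 2, None),
    ("loadi", 1, 3, None),
    ("add", 0, 0, 1),
    ("loadi", 1, 4, None),
    ("mul", 0, 0, 1),
]
```

Note that `REGISTER_CODE` is tied to a machine with at least two registers, whereas `STACK_CODE` says nothing about where values live, which leaves that decision to whatever translates it.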
The 'stack' here is a logical concept rather than a particular data structure. It wouldn't really make sense to use non-contiguous, non-growing data structures. Stacks are used specifically because calling a new subroutine/function/method typically creates a new 'stack' space, and you always return from that subroutine before you return from the enclosing method, so you only ever push/pop at one end (i.e. a queue isn't relevant here). Amongst other things, that's why you get stack overflow errors (such as Java's StackOverflowError), and why this site is called StackOverflow and not QueueOverflow.
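The last-in, first-out behaviour of calls and returns can be sketched as follows. The class names `CallStack` and `StackOverflow` and the fixed `max_depth` limit are hypothetical, chosen only to illustrate the idea; real VMs size and manage their frames differently.

```python
class StackOverflow(Exception):
    """Raised when the (hypothetical) frame limit is exceeded."""


class CallStack:
    def __init__(self, max_depth):
        self.max_depth = max_depth
        self.frames = []

    def call(self, function_name):
        # Entering a function pushes a fresh frame for its locals.
        if len(self.frames) >= self.max_depth:
            raise StackOverflow(f"depth limit {self.max_depth} exceeded")
        self.frames.append({"function": function_name, "locals": {}})

    def ret(self):
        # Returning always removes the most recently pushed frame.
        return self.frames.pop()["function"]


calls = CallStack(max_depth=3)
calls.call("main")
calls.call("helper")
first_return = calls.ret()   # the callee returns first
second_return = calls.ret()  # then its caller
```

Frames are only ever added and removed at one end, which is exactly the access pattern a stack provides and a queue does not.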
Source: https://stackoverflow.com/questions/37317614/why-stack-based-vm-why-not-queue-based-vm