Compilers, VMs, and JIT: Spot the difference
It's about time for another demystification post, I think :P This time, I'm going to talk about Compilers, Virtual Machines (VMs), and Just-In-Time Compilation (JIT) - and the way that they are both related and yet different.
Compilers
To start with, a compiler is a program that converts another program written in 1 language into another language (usually of a lower level). For example, gcc compiles C++ into native machine code.
A compiler usually has both a backend and a frontend. The frontend is responsible for the lexing and parsing of the source programming language. It's the part of the compiler that generates compiler warnings and errors. The frontend outputs an abstract syntax tree (or parse tree) that represents the source input program.
The backend then takes this abstract syntax tree, walks it, and generates code in the target output language. Such code generators are generally recursive in nature. Depending on the compiler toolchain, this may or may not be the last step in the build process.
gcc, for example, generates code to object files - which are then strung together by a linking program later in the build process.
Additional detail on the structure of a compiler (and how to build one yourself!) is beyond the scope of this article, but if you're interested I recommend reading my earlier Compilers 101 post.
Virtual Machines
A virtual machine comes in several flavours. Common to all types is the ability to execute instructions - through software - as if they were being executed on a 'real' hardware implementation of the programming language in question.
Examples here include:
- KVM - The Linux virtual machine engine. Leverages hardware extensions to support various assembly languages that are implemented in real hardware - everything from Intel x86 Assembly to various ARM instruction sets. Virtual Machine Manager is a good GUI for it.
- VirtualBox - Oracle's offering. Somewhat easier to use than the above - and cross-platform too!
- .NET / Mono - The .NET runtime is actually a virtual machine in it's own right. It executes Common Intermediate Language in a managed environment.
It's also important to note the difference between a virtual machine and an emulator. An emulator is very similar to a virtual machine - except that it doesn't actually implement a language or instruction set itself. Instead, it 'polyfills' hardware (or environment) features that don't exist in the target runtime environment. A great example here is WINE or Parallels, which allow programs written for Microsoft Windows to be run on other platforms without modification.
Just-In-Time Compilation
JIT is sort of a combination of the above. The .NET runtime (officially knows as the Common Language Runtime, or CLR) is an example of this too - in that it compiles the CIL in the source assembly to native code just before execution.
Utilising such a mechanism does result in an additional delay during startup, but usually pays for itself in the vast performance improvements that can be made over an interpreter - as a JITting VM can automatically optimise code as it's executing to increase performance in real-time.
This is different to an interpreter, which reads a source program line-by-line - parsing and executing it as it goes. If it needs to go back to another part of the program, it will usually have to re-parse the code before it can execute it.
Conclusion
In this post, we've taken a brief look at compilers, VMs, and JIT. We've looked at the differences between them - and how they are different from their siblings (emulators and interpreters). In many ways, the line between the 3 can become somewhat blurred (hello, .NET!) - so learning the characteristics of each is helpful for disentangling different components of a program.
If there's anything I've missed - please let me know! If you're still confused on 1 or more points, I'm happy to expand on them if you comment below :-)
Found this useful? Spotted a mistake? Comment below!