I understand how a language can bootstrap itself, but I haven't been able to find many references on why you should consider bootstrapping.
One advantage is that developers working on the compiler only need to know the language being compiled. Otherwise, they would need to know both that language and the language the compiler is written in.
You don't bootstrap a compiler for a DSL. You don't write an SQL query compiler in SQL. MATLAB might look like a general-purpose language, but it actually isn't: it is a language designed for numerical calculations.
There are a couple of reasons you might want to do it (in theory):
As a concrete example, in version 1.5 (released August 2015), Go became a fully bootstrapped language[1][2]. They listed the following reasons:
Of these, the only one that would hold for all languages is that you only need to know one language to contribute to the compiler. The other arguments can be summarized as "Our new language is better than the old one", which should probably be true. Why else would you write a new language?
Ken Thompson's "Reflections on Trusting Trust" explains one of the best reasons for bootstrapping. Essentially, your compiler learns new things with every version in the bootstrapping chain, and you never have to teach it those things again.
In the case he mentions, the first compiler (C1) you write has to be told explicitly how to handle backslash escapes. However, the second compiler (C2) is compiled using C1, so it can handle backslash escapes natively.
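The backslash example can be sketched in a few lines. This is a toy illustration in Python rather than Thompson's C, and the names `escape_v1` and `escape_v2` are made up for the purpose. The first handler spells out the character codes by hand; the second is written with the escapes themselves, which only works because whatever processes its source already knows what they mean.

```python
# Toy sketch of the escape-handling step, not Thompson's actual code.
# escape_v1 and escape_v2 are hypothetical names.

def escape_v1(ch):
    # First-generation handler: each character code is given as a number,
    # because nothing upstream knows what the escape means yet.
    if ch == "n":
        return chr(10)   # newline
    if ch == "\\":
        return chr(92)   # backslash
    raise ValueError("unknown escape: " + ch)


def escape_v2(ch):
    # Later handler: written using the escape itself. The knowledge that
    # "\n" is character 10 now lives in the host, not in this source.
    if ch == "n":
        return "\n"
    if ch == "\\":
        return "\\"
    raise ValueError("unknown escape: " + ch)
```

Both handlers return the same characters; the difference is only in where the number 10 is written down.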
The cornerstone of his talk is the possibility that you could teach a compiler to add a backdoor to programs, and that future compilers compiled with the compromised compiler would also be given this ability and that it would never appear in the source!
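To make the backdoor idea concrete, here is a deliberately tiny model, assuming every "program" is Python source that defines `main()` and that "compiling" just means turning that source into a callable. The names `LOGIN_SRC`, `COMPILER_SRC`, `honest_compile` and `evil_compile` are invented for this sketch, and the string matching is far cruder than anything a real attack would use.

```python
# Toy model of the "trusting trust" attack. All names are hypothetical.

LOGIN_SRC = '''
def main(password):
    return password == "secret"
'''

COMPILER_SRC = '''
def main(src):
    namespace = {}
    exec(src, namespace)
    return namespace["main"]
'''


def honest_compile(src):
    """Compile faithfully: run the source and hand back its main()."""
    namespace = {}
    exec(src, namespace)
    return namespace["main"]


def evil_compile(src):
    """A compromised compiler binary. Its misbehaviour appears in no source."""
    if "def main(password)" in src:
        # Compiling the login program: quietly accept a second password.
        src = src.replace('password == "secret"',
                          'password in ("secret", "backdoor")')
    if "def main(src)" in src:
        # Compiling the compiler: return ourselves instead of the clean code.
        return evil_compile
    return honest_compile(src)


# Build the next compiler from the same clean source, once with each binary.
clean_next = honest_compile(COMPILER_SRC)
tainted_next = evil_compile(COMPILER_SRC)

clean_login = clean_next(LOGIN_SRC)
tainted_login = tainted_next(LOGIN_SRC)
```

Here `tainted_login("backdoor")` is accepted while `clean_login("backdoor")` is not, even though `LOGIN_SRC` and `COMPILER_SRC` are identical in both builds. The only difference is which binary did the compiling.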
Essentially, your program can learn new features at every compilation cycle, and they do not have to be reimplemented or recompiled in later cycles because your compiler already knows all about them.
Take a minute to realise the ramifications.
[edit]: This is a pretty terrible way to build a compiler, but the cool factor is through the roof. I wonder if it could be manageable with the right framework?
Compilers solve a wide variety of non-trivial problems, including string manipulation, handling large data structures, and interfacing with the operating system. If your language is intended to handle those things, then writing your compiler in your language demonstrates those capabilities. Additionally, it creates a compounding effect: as your language gains features, you have more features you can use in your compiler. If you implement any unique features that would make compiler-writing easier, you have those new tools available to implement even more features.
However, if your language is not intended to handle the same problems as compilation, then bootstrapping will only tempt you to clutter your language with features which are related to compilation but not to your target problem. Self-compilation with Matlab or SQL would be ridiculous; Matlab has no reason to include strong string manipulation functions and SQL has no reason to support code generation. The resulting language would be unnecessary and cluttered.
It's also worth noting that interpreted languages are a slightly different problem and should be treated accordingly.