Is there any self-improving compiler around?


While it is true that compilers can improve code without human interference, the claim that "compilers are self-improving" is rather dubious. These "improvements" are merely based on a set of rules written by humans (cyborgs, anyone?). So the answer to your question is: no.

On a side note, if there were anything like a self-improving compiler, we'd know... first the thing would improve the language, then its own code, and finally it would modify its code to become a virus and make all developers use it... and then we'd have one of those classic computers-versus-humans, last-hope-for-humanity scenarios... so... no.

MilepostGCC is a machine-learning compiler that improves itself over time, in the sense that it can change itself in order to become "better". A simpler iterative-compilation approach can improve pretty much any compiler.
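For a concrete sense of what "iterative compilation" means here, a minimal sketch (not MilepostGCC itself, just the general idea) in Python could look like the following. It assumes gcc is on the PATH and that bench.c is some benchmark program you supply; both names are placeholders.

    # Sketch of iterative compilation: try several optimization flag sets,
    # time the resulting binary on a benchmark, and keep the fastest one.
    # Assumes `gcc` is installed and `bench.c` is a benchmark you provide.
    import subprocess
    import time

    FLAG_SETS = [["-O1"], ["-O2"], ["-O3"], ["-O3", "-funroll-loops"], ["-Os"]]

    def timed_run(flags, source="bench.c", exe="./bench"):
        subprocess.run(["gcc", *flags, source, "-o", exe], check=True)
        start = time.perf_counter()
        subprocess.run([exe], check=True)
        return time.perf_counter() - start

    best = min(FLAG_SETS, key=timed_run)
    print("fastest flag set:", " ".join(best))

A system like MilepostGCC goes further and uses machine learning over program features to predict good optimization settings instead of brute-force search, but the feedback loop has the same shape.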

25 years of programming and I have never heard of such a thing (unless you're talking about compilers that auto download software updates!).

Not yet practically implemented, to my knowledge, but yes, the theory is there:

  • Goedel machines: self-referential universal problem solvers making provably optimal self-improvements.

A self-improving compiler would, by definition, have to have self-modifying code. If you look around, you can find examples of people doing this (self-modifying code), but it's very uncommon to see, especially on projects as large and complex as a compiler. And it's uncommon for the very good reason that it's ridiculously hard (i.e., close to impossible) to guarantee correct functionality. A lot of coders who think they're smart (especially assembly coders) play around with this at one point or another; the ones who actually are smart mostly move out of this phase. ;)
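For illustration only (and not something any real compiler does), here is about the tamest possible form of self-modifying code: a hypothetical Python script that patches a constant in its own source file each time it runs. Even at this toy scale, you can see how quickly correctness guarantees get awkward.

    # Toy self-modifying program: rewrites one literal in its own source on disk.
    # Real self-modifying code usually means patching machine code in memory,
    # which is far harder to reason about -- which is the point made above.
    import re

    RUN_COUNT = 0  # this literal is bumped by the script itself on every run

    def bump_own_counter(path=__file__):
        with open(path) as f:
            src = f.read()
        patched = re.sub(r"RUN_COUNT = \d+", f"RUN_COUNT = {RUN_COUNT + 1}", src, count=1)
        with open(path, "w") as f:
            f.write(patched)

    if __name__ == "__main__":
        print(f"This script has already modified itself {RUN_COUNT} times.")
        bump_own_counter()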

David Cary

In some situations, a C compiler is run several times without any human input, getting a "better" compiler each time. Fortunately (or unfortunately, from another point of view) this process plateaus after a few steps -- further iterations generate exactly the same compiler executable as the last.

  1. We have all the GCC source code, but the only C compiler available on this machine is not GCC. Alas, parts of GCC use "extensions" that can only be built with GCC. Fortunately, this machine does have a functional "make" executable and some random proprietary C compiler. The human goes to the directory with the GCC source and manually types "make".

  2. The make utility finds the Makefile, which directs it to run the (proprietary) C compiler to compile GCC, using the "-D" option so that all the parts of GCC that use "extensions" are #ifdef'ed out. (Those bits of code may be necessary to compile some programs, but not the next stage of GCC.) This produces a very limited, cut-down binary executable that barely has enough functionality to compile GCC (and the people who write GCC code carefully avoid using functionality that this cut-down binary does not support).

  3. The make utility runs that cut-down binary executable with the appropriate option so that all the parts of GCC are compiled in, resulting in a fully-functional (but relatively slow) binary executable.

  4. The make utility runs the fully-functional binary executable on the GCC source code with all the optimization options turned on, resulting in the actual GCC executable that people will use from now on, and installs it in the appropriate location.

  5. The make utility tests to make sure everything is working OK: it runs the GCC executable from the standard location on the GCC source code with all the optimization options turned on. It then compares the resulting binary executable with the GCC executable in the standard location, and confirms that they are identical (with the possible exception of irrelevant timestamps).

After the human types "make", the whole process runs automatically, each stage generating an improved compiler (until it plateaus and generates an identical compiler). http://gcc.gnu.org/wiki/Top-Level_Bootstrap and http://gcc.gnu.org/install/build.html and Compile GCC with Code Sourcery have a few more details.
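The "plateau" in step 5 is just a fixed point: once the compiler rebuilt by itself is byte-for-byte identical to the one that built it, further iterations change nothing. A conceptual Python sketch of that loop, using a hypothetical build_compiler(cc) helper as a stand-in for the configure/make machinery (this is not GCC's actual build system):

    # Conceptual sketch of the bootstrap fixed point. `build_compiler(cc)` is a
    # hypothetical helper that compiles the compiler's own source with compiler
    # `cc` and returns the path of the resulting binary.
    import filecmp

    def bootstrap(seed_cc, build_compiler, max_stages=5):
        """Rebuild the compiler with itself until the output stops changing."""
        current = build_compiler(seed_cc)        # stage 1: built by the seed compiler
        for stage in range(2, max_stages + 1):
            candidate = build_compiler(current)  # rebuild with the compiler just produced
            if filecmp.cmp(current, candidate, shallow=False):
                print(f"Plateaued at stage {stage}: binaries are identical")
                return candidate
            current = candidate
        return current

In GCC's real build, the bootstrap make target drives these stages and compares the later stages' output, as described on the Top-Level Bootstrap page linked above.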

I've seen other compilers that have many more stages in this process -- but they all require some human input after each stage or two. Example: "Bootstrapping a simple compiler from nothing" by Edmund Grimley Evans, 2001: http://homepage.ntlworld.com/edmund.grimley-evans/bcompiler.html And then there is all the historical work done by the programmers who have worked on GCC, who use previous versions of GCC to compile and test their speculative ideas for possibly improved versions of GCC. While I wouldn't say this is without any human input, the trend seems to be that compilers do more and more "work" per human keystroke.

I'm not sure if it qualifies, but the Java HotSpot compiler improves code at runtime using statistics.

But at compile time? How will that compiler know what's deficient and what's not? What's the measure of goodness?

There are plenty of examples of people using genetic techniques to pit algorithms against each other to come up with "better" solutions. But these are usually well-understood problems that have a metric.

So what metrics could we apply at compile time? Minimum size of the compiled code, cyclomatic complexity, or something else? Which of these is meaningful at runtime?

Well, there are JIT (just-in-time) techniques. One could argue that a compiler with some JIT optimizations might re-adjust itself to be more efficient with the program it is compiling?
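To make the "re-adjust itself at runtime" idea concrete, here is a rough, hypothetical sketch in Python (it has nothing to do with HotSpot's actual implementation): count calls, and once a function turns out to be "hot", swap in a specialized version of it.

    # Hypothetical sketch of adaptive ("JIT-like") optimization: profile call
    # counts, then replace a hot function with a specialized version.
    from functools import lru_cache, wraps

    def adaptive(threshold=100, specialize=None):
        """After `threshold` calls, replace the body with specialize(fn)."""
        def decorator(fn):
            calls = 0
            hot_version = None

            @wraps(fn)
            def wrapper(*args, **kwargs):
                nonlocal calls, hot_version
                if hot_version is not None:
                    return hot_version(*args, **kwargs)
                calls += 1
                if specialize is not None and calls >= threshold:
                    hot_version = specialize(fn)  # "compile" the hot path once
                return fn(*args, **kwargs)
            return wrapper
        return decorator

    # Example: once fib() proves hot, promote it to a memoized version.
    @adaptive(threshold=100, specialize=lambda fn: lru_cache(maxsize=None)(fn))
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

A real JIT works at the machine-code level and uses runtime statistics such as call and branch counts to decide what to recompile, but the feedback idea is similar.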
