Suggestions for writing a programming language? [closed]

↘锁芯ラ 提交于 2019-11-27 07:28:59

Estimating how long something like that might take is dependent on many different factors. For example, an experienced programmer can easily knock out a simple arithmetic expression evaluator in a couple of hours, with unit tests. But a novice programmer may have to learn about parsing techniques, recursive descent, abstract representation of expression trees, tree-walking strategies, and so on. This could easily take weeks or more, just for arithmetic expressions.

However, don't let that discourage you. As Jeff and Joel were discussing with Eric Sink on a recent Stack Overflow podcast, writing a compiler is an excellent way to learn about many different aspects of programming. I've built a few compilers and they are among my most memorable programming projects.

Some classic books on building compilers are:

Dave Hanson, who with Chris Fraser spent 10 years building one of the world's most carefully crafted compilers, told me once that one of the main things he learned from the experience was not to try to write a compiler in C or C++.

If you want to develop something quickly, don't generate native code; target an existing virtual machine such as the CLR, JVM, or the Lua virtual machine. Generate code using maximal munch.

Another good option if you're writing an interpreter is just to use the memory management and other facilities of your underlying programming language. Parse to an AST and then interpret by tree walk of the AST. This will get you off the ground fast. Performance is not the greatest, but it's acceptable. (Using this technique I once wrote a PostScript interpreter in Modula-3. The first implementation took a week and although it later underwent some performance tuning, primarily in the lexer, it never had to be replaced.)

Avoid LALR parser generators; use something that saves your time, like ANTLR or the Elkhound GLR parser generator.

Cruachan

The classic books on compiler design are

"Principles of Compiler Design" by Alfred V. Aho and Jeffrey D. Ullman. It's been around quite some time now and its pink knight and green dragon are well known to at least a couple of generations of CS students.

Also...

"Compilers: Principles, Techniques, and Tools" by Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman

If you're interested in writing a compiler then these are undoubtedly the best places to start.

As a person who knows C++ very well, what tips can you give a person who is looking to write a programming or script language?

Don't do it. (or at least think long and hard before you do!)

If you're trying to write a scripting language to expose the methods/properties of some custom-written objects, it would be better to implement those in Java (or .NET/VB or all those icky Microsoftisms) and then use one of the Bean Scripting Framework languages as your scripting language. (with whatever the equivalent is on the Microsoft end.)

Any questions about compilers will have an answer "go read dragon book, read that book, this book..." on SO regardless of their content in a few minutes. So I skip that part (like I was telling in the first place). Reading these books to learn how to use the tools you want, is about as useful as reading about angular momentum to learn how to ride a bike.

So, to answer what you asked, without questioning your intention, I can easily recommend antlr and antlrworks for starters. You can generate your AST easily (where the real magic happens, I think) and debug your grammar visually. It generates a good portion of a working compiler for you.

If you know your stuff and want to have more control or don't like antlr, you can use lemon parser generator and ragel state machine compiler (have special support for lexing) together.

If you don't need too much performance and since you plan to generate C/C++ code, you can skip doing any optimizations yourself and leave that stuff to your C/C++ compiler.

If you can live with a slow runtime, you can further shorten your development effort just doing interpretation, since it is often easier to implement dynamic features this way.

I think everybody is missing one very important point.

WHY do you want to write a compiler / interpreter / parser etc.

This will seriously determine a lot of what you do.

I have worked on quite a few language implementations, some rather weird, some domain specific, some simply scripted progress through command environments (often where the command environment was later hidden). Each required different levels of skill.

Many books available. One I loved was a BYTE book : Threaded Interpreted Languages - bet it's out of print.

Simple script engines can be crafted with a few evening's thinking and a bit of trial and error.

But I bet there are online courses now that will save you a ton of time.

I'd strongly recommend looking at existing bytecode interpreters. If you can make your language fit into CIL (.NET) or Java (or even others such as Python or Parrot), you'll save yourself all the effort of making a workable supporting environment and can get on with experimenting with language concepts.

If you're planning on writing an interpreter or compiler, don't do it because you want to write the next big thing. Write it because you already have a purpose for it in mind or to learn. If you do this you may find that you've accidentally written the next big thing.

A good tool that I've used for LALR is the GOLD Parsing System. It's free, the grammer is Backus-Naur Form, and there are multiple examples, including engines written in C#, VB.NET, Java and others. This lets you write a grammer, compile the grammer to a file, and then use an engine to parse the grammer.

As recommended above, I would recommend targeting a byte code of some kind, like IL. This will allow you to leverage the enormous amounts of existing frameworks.

Good Luck

If you don't want to get into writing a compiler to reduce your language to assembly/machine, then your next option is to write a compiler to a byte-code language virtual machine, such as the JVM, PVM or .NET.

Of course, if you don't even want to do that - you just want to create your own "domain specific language", I would build it in Common Lisp. Lisp macros provide a rather straight-forward method of creating whatever syntax you want and parsing it into Lisp. And you don't have worry about byte-code or assembly. Of course, you need to learn Lisp.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!