What is the best way to define grammars for a text editor?

Submitted by 心已入冬 on 2019-12-10 20:13:42

Question


I'm masochistically writing an open-source text editor for Mac and have finally reached the point at which I want to add syntax highlighting. I've been going back and forth on various solutions for the past few days, and I've finally decided to open the question to a wider audience.

Here are the options I see:

  • Define languages with a series of regex pattern matches (similar to how TextMate defines its languages)
  • Define languages with a formal grammar like BNF or PEG

Using regex pattern matching seems less than ideal, as it cannot represent a language nearly as well as a formal grammar can; however, some less formal languages will have a hard time fitting into BNF (e.g. Markdown, though I know there's a great PEG implementation).
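To make that tradeoff concrete, here is a minimal Python sketch (not from the original question) of one thing a plain regex cannot do: match arbitrarily nested parentheses. The rule written in the comment is a PEG-style rule, and `parse_group` is a hypothetical hand-written function that mirrors it. Python's built-in `re` module has no recursive patterns, so the regex shown can only find an innermost group.

```python
import re

# A plain regex with no nesting support: it can only match a group
# that contains no other parentheses, i.e. an innermost group.
INNERMOST = re.compile(r"\([^()]*\)")


def parse_group(text, pos=0):
    """PEG-style rule:  group <- '(' (group / [^()])* ')'

    Return the index just past the balanced group starting at pos,
    or None if there is no balanced group there.
    """
    if pos >= len(text) or text[pos] != "(":
        return None
    pos += 1
    while pos < len(text) and text[pos] != ")":
        if text[pos] == "(":
            # Recurse for a nested group; this is the step a regex lacks.
            nested_end = parse_group(text, pos)
            if nested_end is None:
                return None
            pos = nested_end
        else:
            pos += 1
    if pos < len(text):  # text[pos] is the closing ')'
        return pos + 1
    return None  # ran off the end: the group was never closed


if __name__ == "__main__":
    sample = "(a(b)c)"
    print(INNERMOST.search(sample).group())  # the regex only sees the inner group
    print(parse_group(sample))               # the recursive rule consumes all of it
    print(parse_group("(a(b"))               # unbalanced input is rejected
```

For highlighting purposes this kind of nesting often does not matter, which is part of why regex-based definitions get as far as they do.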

What are the performance tradeoffs for live syntax highlighting? What about flexibility for a wide range of languages?

If I go the BNF route, Todd Ditchendorf created the awesome ParseKit framework, which would work nicely out of the box. Does anyone know of anything similar for PEGs?


Answer 1:


Unless you want to fight the battle of getting a full context-free (or worse, a full context-sensitive) grammar completely correct for every language you want to process (or worse, for every dialect of every language you want to process... how many kinds of C++ are there?), for the purposes of syntax highlighting you're probably better off giving up on complete correctness and accepting that sometimes you'll get it wrong. In that case, regexps seem like an extremely good answer. They can also be very fast, so they won't interfere with the person doing the editing.
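As a rough illustration of the regexp approach (a sketch only, not how TextMate or any particular editor implements it), the Python below combines a few patterns into one alternation and reports the spans to color. The rule set is a made-up example for a small C-like language, and the names `RULES`, `MASTER`, and `highlight` are hypothetical.

```python
import re

# Hypothetical rules for a small C-like language.
# Order matters: at any position, the first alternative that matches wins.
RULES = [
    ("comment", r"//[^\n]*"),
    ("string", r'"(?:\\.|[^"\\\n])*"'),
    ("number", r"\b\d+(?:\.\d+)?\b"),
    ("keyword", r"\b(?:if|else|while|return|int)\b"),
]

# One combined pattern with a named group per rule.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES))


def highlight(text):
    """Return (start, end, kind) spans for every token the rules recognize.

    Anything the rules do not recognize is simply left unstyled, which is
    the "sometimes you'll get it wrong" tradeoff in practice.
    """
    return [(m.start(), m.end(), m.lastgroup) for m in MASTER.finditer(text)]


if __name__ == "__main__":
    for span in highlight("int x = 42; // hi"):
        print(span)
```

Note that an unterminated string such as `"abc` matches no rule here, so it is left uncolored rather than causing an error.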

If you insist on doing full syntax checking/completion (I don't think you are), then you'll need that full grammar. You'll also spend a very long time producing editors for real languages.

Sometimes it is better not to be too serious. A 98% solution that you can get is better than a 100% solution that never materializes.




Answer 2:


It might not be exactly what you need, since you are writing the editor yourself, but there is an awesome framework called Xtext that will generate a complete editor with syntax coloring, a customizable outline view, auto-completion, etc., based on a grammar for your language: http://eclipse.org/Xtext




Answer 3:


In addition to the problems of getting a grammar to work for a language, there is the added complexity of trying to get it to work for code that is in the middle of being edited.
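One common way to cope with half-edited code (offered here as an assumption, not something this answer prescribes) is to scan line by line and carry a small amount of state across lines, so that an unclosed construct degrades gracefully instead of failing. The Python sketch below does this for C-style block comments only; `highlight_lines` is a hypothetical name.

```python
def highlight_lines(lines):
    """Return, for each line, a list of (start, end, "comment") spans.

    The only state carried between lines is whether we are still inside
    an unclosed /* ... */ comment.
    """
    in_comment = False
    result = []
    for line in lines:
        spans = []
        pos = 0
        if in_comment:
            end = line.find("*/")
            if end == -1:
                # Still unclosed: the whole line is comment text.
                spans.append((0, len(line), "comment"))
                result.append(spans)
                continue
            spans.append((0, end + 2, "comment"))
            pos = end + 2
            in_comment = False
        while True:
            start = line.find("/*", pos)
            if start == -1:
                break
            end = line.find("*/", start + 2)
            if end == -1:
                # The comment opens here and is not closed on this line.
                spans.append((start, len(line), "comment"))
                in_comment = True
                break
            spans.append((start, end + 2, "comment"))
            pos = end + 2
        result.append(spans)
    return result


if __name__ == "__main__":
    # The user has typed "/*" but not yet the closing "*/".
    print(highlight_lines(["int a; /* start", "still open", "int b;"]))
    # After the user closes the comment on the second line.
    print(highlight_lines(["int a; /* start", "done */ x", "int b;"]))
```

While the comment is unclosed, everything after it is colored as a comment; as soon as the closing `*/` is typed, the following lines return to normal. A real editor would typically cache the per-line state so that only the lines after an edit need rescanning.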



Source: https://stackoverflow.com/questions/4170180/what-is-the-best-way-to-define-grammars-for-a-text-editor
