Syntax Highlighting: How Does Eclipse Do It So Fast?

有刺的猬 2021-02-06 02:59

I've developed a syntax highlighter in Java for Android and it's working well, but the problem is it can be slow with big files.

So I'm wondering how source code edit

1 Answer
  • 2021-02-06 03:52

    I cannot speak for Gedit, but in Eclipse, we cheat :-)

    If you look very carefully, you can actually see that syntax coloring for structured languages like Java is a two-phase process.

    First, a presentation reconciler runs to do very basic syntax coloring. It is triggered immediately by changes to the editor's document and is expected to be extremely fast. It is not really syntax-based coloring but lexically-based coloring: the focus is on tokens like strings, keywords, words, numbers, comments, etc., all of which can be recognized easily using simple character tables or similar. At this stage there is therefore no difference between a class name, a variable name, or a static method name, even though they may be colored differently in the end. For many languages, this is the only coloring done.
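
    To make the lexical phase concrete, here is a minimal sketch of a single-pass, table-driven tokenizer in plain Java. It is not Eclipse's scanner; the class name `LexicalColorer`, the `Kind` categories, and the tiny keyword table are all made up for illustration. Note that every identifier comes out as `WORD`, whether it is a class, a variable, or a method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical lexical colorer: one linear pass, no parsing, no symbol lookup.
final class LexicalColorer {
    enum Kind { KEYWORD, WORD, NUMBER, STRING, COMMENT, OTHER }

    // A span [start, end) of the source text together with its lexical category.
    static final class Token {
        final Kind kind;
        final int start;
        final int end;
        Token(Kind kind, int start, int end) { this.kind = kind; this.start = start; this.end = end; }
    }

    // Deliberately tiny keyword table (assumption: a real one lists every keyword).
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("class", "static", "int", "return", "if", "else"));

    static List<Token> tokenize(String src) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            int start = i;
            if (Character.isWhitespace(c)) {
                i++;                                   // whitespace gets no color
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '*') {
                int close = src.indexOf("*/", i + 2);  // block comment, may be unterminated
                i = (close < 0) ? n : close + 2;
                out.add(new Token(Kind.COMMENT, start, i));
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '/') {
                int nl = src.indexOf('\n', i);         // line comment runs to end of line
                i = (nl < 0) ? n : nl;
                out.add(new Token(Kind.COMMENT, start, i));
            } else if (c == '"') {
                i++;
                while (i < n && src.charAt(i) != '"' && src.charAt(i) != '\n') {
                    if (src.charAt(i) == '\\' && i + 1 < n) i++;  // skip the escaped character
                    i++;
                }
                if (i < n && src.charAt(i) == '"') i++;           // consume the closing quote
                out.add(new Token(Kind.STRING, start, i));
            } else if (Character.isDigit(c)) {
                while (i < n && Character.isLetterOrDigit(src.charAt(i))) i++;
                out.add(new Token(Kind.NUMBER, start, i));
            } else if (Character.isJavaIdentifierStart(c)) {
                while (i < n && Character.isJavaIdentifierPart(src.charAt(i))) i++;
                String word = src.substring(start, i);
                out.add(new Token(KEYWORDS.contains(word) ? Kind.KEYWORD : Kind.WORD, start, i));
            } else {
                i++;                                   // operators and punctuation
                out.add(new Token(Kind.OTHER, start, i));
            }
        }
        return out;
    }
}
```

    Calling `LexicalColorer.tokenize(text)` yields spans that can be mapped straight to colors. The work is linear in the length of the text that is scanned, which is why keeping that text small matters for big files.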

    Next, a syntax reconciler runs to build an abstract syntax tree (AST) for the document, or as near to one as you can get in the face of syntax or semantic errors. This is triggered by a timer, and for some languages an attempt is made to do just a partial update of the AST (not easy). The completed AST is then used to update the outline view and to do additional syntax coloring based on the extra information, e.g. static method names. (The AST is often used for many other things as well, like hover information, folding, hyperlinking, etc.)
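
    The "triggered by a timer" part can be sketched with nothing but the JDK: every document change restarts a short countdown, and the expensive pass runs only once typing pauses. This is an illustration of the idea, not Eclipse's reconciler; `DelayedReconciler`, `documentChanged`, and the delay value are hypothetical names and parameters.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical debouncer for the slow, AST-based pass.
final class DelayedReconciler {
    // Single daemon thread, so the timer never keeps the process alive on its own.
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "delayed-reconciler");
                t.setDaemon(true);
                return t;
            });
    private final Runnable expensivePass;  // e.g. parse the document and recolor from the AST
    private final long delayMillis;
    private ScheduledFuture<?> pending;

    DelayedReconciler(Runnable expensivePass, long delayMillis) {
        this.expensivePass = expensivePass;
        this.delayMillis = delayMillis;
    }

    // Call on every edit: drops the not-yet-started run and restarts the countdown.
    synchronized void documentChanged() {
        if (pending != null) {
            pending.cancel(false);  // a pass that is already running is left to finish
        }
        pending = timer.schedule(expensivePass, delayMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        timer.shutdownNow();
    }
}
```

    With a burst of keystrokes, `documentChanged()` is called many times but the expensive pass runs once, after the last change. On Android, the pass would still need to hand its coloring results back to the UI thread, which this sketch leaves out.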

    For both the initial presentation reconciler and the later syntax-based reconciler, some rather elaborate logic determines just how big a region of the document must be parsed. For the presentation reconciler the decision can be based on any existing coloring, whereas for the syntax-based coloring a separate damage/repair phase is run to determine the size of the region.
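
    As a rough illustration of limiting the region, the sketch below uses a much simpler rule than the logic described above: it assumes that an edit only damages the lines it touches, and widens the edited range to whole lines. The class `DamageRegion` and the method `forEdit` are hypothetical, and the one-line assumption breaks for multi-line tokens such as block comments.

```java
// Hypothetical line-based damage computation (a simplification, see the lead-in).
final class DamageRegion {
    final int start;  // inclusive offset into the text after the edit
    final int end;    // exclusive offset into the text after the edit

    DamageRegion(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // newText: document content after the edit
    // editOffset: where the edit happened
    // insertedLength: number of characters inserted there (0 for a pure deletion)
    static DamageRegion forEdit(String newText, int editOffset, int insertedLength) {
        // Back up to the start of the line containing the edit
        // (lastIndexOf returns -1 when there is no earlier newline, giving start 0).
        int start = newText.lastIndexOf('\n', editOffset - 1) + 1;
        // Extend forward to the end of the line containing the end of the edit.
        int editEnd = editOffset + insertedLength;
        int newline = newText.indexOf('\n', editEnd);
        int end = (newline < 0) ? newText.length() : newline;
        return new DamageRegion(start, end);
    }
}
```

    Only `newText.substring(region.start, region.end)` then needs to be re-tokenized and recolored, instead of the whole file.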

    Some extreme examples that always complicate matters are when block comments are added or removed:

    a = b /* c + 1 /* remember the offset! */;
    

    If the first slash is removed or added, the presentation reconciler must process a larger area than might naively be expected...
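
    The effect can be checked with a small experiment. The sketch below marks which characters lie inside a block comment before and after deleting a single character, and reports the range whose comment state changed. It handles only `/* ... */` comments and ignores strings and line comments; `BlockCommentDamage` and its methods are hypothetical names.

```java
// Hypothetical helper showing how far a one-character edit can reach.
final class BlockCommentDamage {
    // mask[i] is true when character i lies inside a /* ... */ comment (comments do not nest).
    static boolean[] commentMask(String src) {
        boolean[] mask = new boolean[src.length()];
        int i = 0;
        while (i < src.length()) {
            int open = src.indexOf("/*", i);
            if (open < 0) break;
            int close = src.indexOf("*/", open + 2);
            int end = (close < 0) ? src.length() : close + 2;  // unterminated: runs to end of text
            for (int k = open; k < end; k++) mask[k] = true;
            i = end;
        }
        return mask;
    }

    // Deletes the character at deletedOffset and returns {start, end} (end exclusive),
    // in offsets of the edited text, covering every character whose comment state
    // changed. Returns an empty array when nothing changed.
    static int[] changedRange(String before, int deletedOffset) {
        String after = before.substring(0, deletedOffset) + before.substring(deletedOffset + 1);
        boolean[] oldMask = commentMask(before);
        boolean[] newMask = commentMask(after);
        int first = -1;
        int last = -1;
        for (int j = 0; j < after.length(); j++) {
            int oldIndex = (j < deletedOffset) ? j : j + 1;  // map back to the original offset
            if (oldMask[oldIndex] != newMask[j]) {
                if (first < 0) first = j;
                last = j;
            }
        }
        return (first < 0) ? new int[0] : new int[] { first, last + 1 };
    }
}
```

    For the line above, `changedRange("a = b /* c + 1 /* remember the offset! */;", 6)` returns `{6, 14}`: deleting one slash turns the eight characters `* c + 1 ` from comment into code, so all of them need recoloring. With an unterminated comment the changed range can extend to the end of the file.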
