Phases of a compiler?

丶灬走出姿态 提交于 2019-12-22 08:10:58

问题


At which phase of the compilation are keywords of a programming language recognized?

I am sort of confused between

  1. The lexical analysis.
  2. The parsing of the program.

I once wrote a lexer in C using regular expressions but it recognised the main() in int main(void) also as a keyword.

On these lines I think that we have to build a parse tree to recognize keywords.


回答1:


I had to build a simple compiler this year as a project for which i used Java . The recognition of keywords was made on lexical analysis. During that phase i would read input language and create a token with a type(for keywords type was variable_declaration ) and its value.I also had different types for each case like identifier , constant,multiply operation ,adding operation etc.Then passed these tokens to a queue and next into a parser which would check grammar and create a binary tree which was then used to create the output language .




回答2:


Typically, the lexical analysis phase of compilation breaks the input text up into sequences of lexemes that each belongs to some particular token type that's useful in later analysis. Consequently, keywords are usually first recognized during lexical analysis in order to make parsing easier. Since parsers tend to be implemented by writing context-free grammars of tokens rather than of lexemes (that is, the category of the lexeme rather than the contents of the lexeme), it is significantly easier to build a parser when keywords are marked during lexing. For example, if I want to have a parser that treats "if" as a keyword, then I may want a rule that looks something like this in my CFG:

Statement ::= 'IF' Expr 'THEN' Expr

If I don't categorize IF and THEN into their own token types, then my parser couldn't write a statement like the above.




回答3:


It depends very much of the definition, specifically where you draw the lines between scanner, tokenizer, lexer, and parser. Since this is homework, and it is only correct if your prof. says its correct: take a look at the definitions that were supplied in your reading-material.

Regarding main(): You can definitely say that main(), as all other functions, is not a keyword, it is however a token. The tokenizer recognizes that the substring "main" is one token, the parser sets it in relation to it's "(...)" and "{...}" parts. Further, for main() the parser will automatically generate a program entry-point.




回答4:


That would be lexical analysis.

Some languages have "special" identifiers as well as keywords. These are often added to the identifier table and allocated known constant ID values before the parsing starts, so that they can be spotted easily. These don't normally have special meaning to the parser, though - they should be detected in the abstract syntax tree (AST) after parsing.

For example, take a look at the Oberon language report...

http://www-old.oberon.ethz.ch/oreport.html

Not a language recommendation - just an easily available and simple language specification (very much Wirths style).

Anyway, the "Vocabulary and representation" section includes a list of "operators and delimiters", including what most people would recognise as keywords. These would be recognised by the lexical analyser.

In the "Declarations and scope rules" section, there's a list of predefined identifiers such as "ABS" and "BOOLEAN". I'm not familiar enough with Oberon to be certain, but if I were to write a compiler, there's a good chance I'd just pre-initialise the normal identifier table to include these predefined identifiers.

In C, "main" is in most ways just another function. The compiler may or may not treat it as special. It's possible that the only "special" thing about it is that the startup code that gets linked into your final executable makes a call to that function.




回答5:


Traditionally, keywords are recognised by a lexer (which leaves you with a language made of a fixed set of keywords). But of course you can do it during a parsing pass. You can even get rid of your lexer entirely, using one of the numerous lexerless parsing techniques (e.g., PEGs). It might help you to avoid confusion.



来源:https://stackoverflow.com/questions/5202709/phases-of-a-compiler

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!