Parsing grammars using OCaml

余生分开走 2020-12-28 10:33

I have a task to write a (toy) parser for a (toy) grammar in OCaml, and I am not sure how to start on (and proceed with) this problem.

Here's a sample Awk grammar:

<
3 answers
  •  有刺的猬
    2020-12-28 11:06

    OK, so the first thing you should do is write a lexical analyser. That's the function that takes the ‘raw’ input, like ["3"; "-"; "("; "4"; "+"; "2"; ")"], and turns it into a list of tokens (that is, representations of terminal symbols).

    You can define a token to be

    type token =
        | TokInt of int         (* an integer *)
        | TokBinOp of binop     (* a binary operator *)
        | TokOParen             (* an opening parenthesis *) 
        | TokCParen             (* a closing parenthesis *)     
    and binop = Plus | Minus 
    

    The type of the lexer function would be string list -> token list, and the output of

    lexer ["3"; "-"; "("; "4"; "+"; "2"; ")"]
    

    would be something like

    [   TokInt 3; TokBinOp Minus; TokOParen; TokInt 4;
        TokBinOp Plus; TokInt 2; TokCParen   ]
    

    This will make the job of writing the parser easier, because you won't have to worry about recognising what is an integer, what is an operator, etc.

    This is a first, not too difficult step because the tokens are already separated. All the lexer has to do is identify them.
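
    As a rough illustration, a lexer over pre-separated strings might look like the sketch below. The helper name token_of_string, the use of int_of_string_opt (OCaml 4.05 or later) and the choice to raise Failure on unknown input are assumptions of this sketch, not something the task prescribes.

```ocaml
(* Token type as defined earlier in this answer. *)
type token =
  | TokInt of int         (* an integer *)
  | TokBinOp of binop     (* a binary operator *)
  | TokOParen             (* an opening parenthesis *)
  | TokCParen             (* a closing parenthesis *)
and binop = Plus | Minus

(* Classify one already-separated string as a token.
   Hypothetical helper; failing with Failure on unknown input
   is an assumption of this sketch. *)
let token_of_string (s : string) : token =
  match s with
  | "+" -> TokBinOp Plus
  | "-" -> TokBinOp Minus
  | "(" -> TokOParen
  | ")" -> TokCParen
  | _ ->
    (* Anything else must parse as an integer literal. *)
    (match int_of_string_opt s with
     | Some n -> TokInt n
     | None -> failwith ("lexer: unrecognised token " ^ s))

(* The lexer simply classifies each string in turn. *)
let lexer (input : string list) : token list =
  List.map token_of_string input
```

    Because each string is classified independently, List.map is enough here; no state has to be carried from one token to the next.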

    When this is done, you can write a more realistic lexical analyser, of type string -> token list, that takes an actual raw input, such as "3-(4+2)", and turns it into a token list.
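
    One possible shape for that second lexer is sketched below. The names lex_string and is_digit are made up for illustration, and skipping whitespace and failing on any other character are assumptions; a real Awk-like grammar would need more token kinds.

```ocaml
(* Same token type as above, repeated so the sketch is self-contained. *)
type token =
  | TokInt of int
  | TokBinOp of binop
  | TokOParen
  | TokCParen
and binop = Plus | Minus

let is_digit (c : char) : bool = '0' <= c && c <= '9'

(* Scan the string left to right, accumulating tokens in reverse order. *)
let lex_string (s : string) : token list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      match s.[i] with
      | ' ' | '\t' | '\n' -> go (i + 1) acc   (* skip whitespace *)
      | '+' -> go (i + 1) (TokBinOp Plus :: acc)
      | '-' -> go (i + 1) (TokBinOp Minus :: acc)
      | '(' -> go (i + 1) (TokOParen :: acc)
      | ')' -> go (i + 1) (TokCParen :: acc)
      | c when is_digit c ->
        (* Consume the longest run of digits starting at position i. *)
        let j = ref i in
        while !j < n && is_digit s.[!j] do incr j done;
        let literal = String.sub s i (!j - i) in
        go !j (TokInt (int_of_string literal) :: acc)
      | c ->
        failwith
          (Printf.sprintf "lexer: unexpected character %C at position %d" c i)
  in
  go 0 []
```

    The only part that needs lookahead is the integer case, where the lexer has to keep reading digits until the run ends; every other token here is a single character.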
