Feed ocamlyacc parser from explicit token list?

Submitted by 主宰稳场 on 2020-01-03 16:54:55

Question


Is it possible to feed an OCamlYacc-generated parser an explicit token list for analysis?

I'd like to use OCamlLex to explicitly generate a token list, which I then analyze later using a Yacc-generated parser. In the standard setup, however, the generated parser calls the lexer implicitly whenever it needs the next token, so tokens are computed during parsing rather than beforehand. Conceptually a parser should only work on tokens, yet a Yacc-generated parser exposes an interface that depends on a lexer, which I don't need in my case.


Answer 1:


If you already have a list of tokens, you can just go the ugly way and ignore the lexing buffer altogether. After all, the lexer function that your parser expects (of type Lexing.lexbuf -> token) does not have to be pure, so it can pop tokens off mutable state:

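(* The tokens to parse, consumed one at a time by [token] below. *)
let my_tokens = ref [ (* WHATEVER *) ]

(* Has the type the parser expects, but ignores the lexbuf entirely. *)
let token _lexbuf =
  match !my_tokens with
    | []     -> EOF
    | h :: t -> my_tokens := t ; h

(* The lexbuf is a dummy; from_string lives in the standard Lexing module. *)
let ast = Parser.parse token (Lexing.from_string "")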
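The same idea can be packaged without a global reference by closing over the list. The following is only a minimal sketch: the token type, lexer_of_list, and sum_parser are hypothetical names, and sum_parser merely stands in for a generated entry point such as Parser.parse so that the example runs on its own. Because the lexbuf is a dummy, any source positions a real ocamlyacc parser reads from it will not be meaningful.

```ocaml
(* Hypothetical token type; a real one is generated from the grammar's
   %token declarations. *)
type token = INT of int | PLUS | EOF

(* Turn a token list into a function with the shape that generated entry
   points expect. The lexbuf argument is ignored. *)
let lexer_of_list (tokens : token list) : Lexing.lexbuf -> token =
  let remaining = ref tokens in
  fun _lexbuf ->
    match !remaining with
    | [] -> EOF
    | tok :: rest -> remaining := rest; tok

(* Stand-in for a generated entry point (an assumption for this sketch):
   it sums the INT tokens until EOF. *)
let sum_parser (next : Lexing.lexbuf -> token) (lexbuf : Lexing.lexbuf) : int =
  let rec loop acc =
    match next lexbuf with
    | INT n -> loop (acc + n)
    | PLUS -> loop acc
    | EOF -> acc
  in
  loop 0

(* The parser still wants a lexbuf, so hand it an empty one. *)
let dummy_lexbuf = Lexing.from_string ""

let result =
  sum_parser (lexer_of_list [INT 1; PLUS; INT 2; PLUS; INT 3]) dummy_lexbuf
```

Each call to lexer_of_list creates its own private state, so several token lists can be parsed independently.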

On the other hand, it looks from your comments that you actually have a function of type Lexing.lexbuf -> token list that you're trying to fit into the Lexing.lexbuf -> token signature of your parser. If that is the case, you can easily use a queue to write a converter between the two types:

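(* Wrap a list-returning lexer as a one-token-at-a-time lexer. *)
let deflate token =
  let q = Queue.create () in   (* tokens already lexed but not yet consumed *)
  fun lexbuf ->
    if not (Queue.is_empty q) then Queue.pop q else
      match token lexbuf with
        | [   ] -> EOF          (* an empty batch is treated as end of input *)
        | [tok] -> tok
        | hd::t -> List.iter (fun tok -> Queue.add tok q) t ; hd

let ast = Parser.parse (deflate my_lexer) lexbuf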
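To see the queue-based converter in action without a generated parser, here is a self-contained sketch. The token type, batch_lexer (a fake lexer that hands out pre-made batches), and drain (a stand-in for the parser that just collects tokens until EOF) are all hypothetical helpers invented for this illustration. As in the converter above, an empty batch is taken to mean end of input.

```ocaml
(* Hypothetical token type for the demonstration. *)
type token = WORD of string | EOF

(* Same technique as deflate above: buffer each batch in a queue and
   serve one token per call. *)
let deflate (token : Lexing.lexbuf -> token list) : Lexing.lexbuf -> token =
  let q = Queue.create () in
  fun lexbuf ->
    (* Refill only when the buffer has run dry. *)
    if Queue.is_empty q then
      List.iter (fun tok -> Queue.add tok q) (token lexbuf);
    if Queue.is_empty q then EOF else Queue.pop q

(* Fake list-returning lexer (an assumption): returns one prepared batch
   per call and ignores the lexbuf. *)
let batch_lexer batches =
  let remaining = ref batches in
  fun (_ : Lexing.lexbuf) ->
    match !remaining with
    | [] -> []
    | b :: rest -> remaining := rest; b

(* Stand-in for the parser: pull tokens until EOF and return them in order. *)
let drain next lexbuf =
  let rec loop acc =
    match next lexbuf with
    | EOF -> List.rev acc
    | tok -> loop (tok :: acc)
  in
  loop []

let tokens =
  drain
    (deflate (batch_lexer [[WORD "a"; WORD "b"]; [WORD "c"]]))
    (Lexing.from_string "")
```

The consumer sees a flat stream of three tokens even though the underlying lexer produced them in two batches.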



Answer 2:


As already mentioned by Jeffrey, Menhir specifically offers, as part of its runtime library, a module for driving parsers with any kind of token stream (it just asks for a unit -> token function): MenhirLib.Convert.

(You could even use this code without using Menhir, with ocamlyacc instead. In practice the conversion is not terribly complicated so you could even re-implement it yourself.)
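A hand-written version of such a conversion might look like the sketch below. It does not reproduce MenhirLib.Convert; run_with_supplier, supplier_of_queue, and collect_ints are hypothetical names, and collect_ints only stands in for a generated entry point of type (Lexing.lexbuf -> token) -> Lexing.lexbuf -> 'a. Positions are not tracked here, which a fuller conversion would need to handle.

```ocaml
(* Hypothetical token type standing in for the generated one. *)
type token = INT of int | EOF

(* Adapt a parser with the traditional (lexer, lexbuf) interface so that it
   pulls tokens from a plain unit -> token supplier instead. *)
let run_with_supplier
    (parse : (Lexing.lexbuf -> token) -> Lexing.lexbuf -> 'a)
    (next : unit -> token) : 'a =
  parse (fun _lexbuf -> next ()) (Lexing.from_string "")

(* Stand-in for a generated entry point (an assumption): collects the
   INT payloads until EOF. *)
let collect_ints next lexbuf =
  let rec loop acc =
    match next lexbuf with
    | INT n -> loop (n :: acc)
    | EOF -> List.rev acc
  in
  loop []

(* A supplier backed by a queue that was filled ahead of time. *)
let supplier_of_queue q () =
  if Queue.is_empty q then EOF else Queue.pop q

let ints =
  let q = Queue.create () in
  List.iter (fun tok -> Queue.add tok q) [INT 4; INT 5];
  run_with_supplier collect_ints (supplier_of_queue q)
```

Any source of tokens (a list, a queue, a channel) can be wrapped as a unit -> token supplier and passed in the same way.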




Answer 3:


The OCamlYacc interface does look pretty complicated; it seems to require a Lexing.lexbuf. Maybe you could consider using Lexing.from_string to feed a fixed string rather than a fixed sequence of tokens. You could also look at Menhir. I haven't used it, but it gets excellent reviews here whenever anybody mentions OCaml parser generators. It might have a more flexible lexing interface.



Source: https://stackoverflow.com/questions/10899544/feed-ocamlyacc-parser-from-explicit-token-list
