Question
Is it possible to feed an OCamlYacc-generated parser an explicit token list for analysis?
I'd like to use OCamlLex to explicitly generate a token list which I then analyze using a Yacc-generated parser later. However, the standard use case generates a parser that calls a lexer implicitly for the next token. Here tokens are computed during the yacc analysis rather than before. Conceptually a parser should only work on tokens but a Yacc-generated parser provides an interface that relies on a lexer which in my case I don't need.
Answer 1:
If you already have a list of tokens, you can go the ugly way and ignore the lexing buffer altogether. After all, the lexbuf-to-token function that your parser expects does not have to be pure:
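(* Remaining tokens; the parser consumes them one at a time. *)
let my_tokens = ref [ (* WHATEVER *) ]
(* Lexer-shaped function that ignores its lexbuf argument. *)
let token _lexbuf =
  match !my_tokens with
  | [] -> EOF
  | h :: t -> my_tokens := t; h
(* The lexbuf is only a placeholder; it is never read. *)
let ast = Parser.parse token (Lexing.from_string "")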
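If the global reference bothers you, the same trick can be wrapped in a closure. Here is a minimal, self-contained sketch. The token type and parse_sum are hypothetical stand-ins for what an ocamlyacc-generated Parser module would provide (parse_sum only mimics the shape of a generated entry point), and lexer_of_list is a name made up for this example.

```ocaml
(* Hypothetical token type; a real one comes from the generated Parser module. *)
type token = INT of int | PLUS | EOF

(* Turn a token list into a lexer-shaped function that ignores its lexbuf.
   Once the list is exhausted it keeps answering EOF. *)
let lexer_of_list (tokens : token list) : Lexing.lexbuf -> token =
  let remaining = ref tokens in
  fun _lexbuf ->
    match !remaining with
    | [] -> EOF
    | tok :: rest -> remaining := rest; tok

(* Stand-in for a generated entry point (assumption: same argument shape,
   lexer function first, lexbuf second). It just sums the INT tokens. *)
let parse_sum (next : Lexing.lexbuf -> token) (lexbuf : Lexing.lexbuf) : int =
  let rec loop acc =
    match next lexbuf with
    | INT n -> loop (acc + n)
    | PLUS -> loop acc
    | EOF -> acc
  in
  loop 0

(* The lexbuf is a dummy: nothing is ever read from it. *)
let dummy_lexbuf = Lexing.from_string ""
let result = parse_sum (lexer_of_list [INT 1; PLUS; INT 2; PLUS; INT 3]) dummy_lexbuf
```

One thing to keep in mind: since the lexbuf is never updated, any position information the parser takes from it (for error messages, say) will not be meaningful.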
On the other hand, it looks from your comments that you actually have a function of type Lexing.lexbuf -> token list that you're trying to fit into the Lexing.lexbuf -> token signature of your parser. If that is the case, you can easily use a queue to write a converter between the two types:
let deflate token =
  let q = Queue.create () in
  fun lexbuf ->
    if not (Queue.is_empty q) then Queue.pop q else
    match token lexbuf with
    | [ ] -> EOF
    | [tok] -> tok
    | hd::t -> List.iter (fun tok -> Queue.add tok q) t ; hd
let ast = Parser.parse (deflate my_lexer) lexbuf
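To see the queue-based converter in action without a generated parser, here is a small self-contained sketch. The token type, fake_lexer (a pretend batch lexer that hands out pre-made token lists) and drain are all hypothetical names introduced just for this illustration; in real code the token type comes from your Parser module and the batch lexer from ocamllex.

```ocaml
(* Hypothetical token type standing in for the generated one. *)
type token = INT of int | PLUS | EOF

(* Same idea as deflate above: buffer the extra tokens of each batch in a
   queue and hand them out one per call. An empty batch is treated as EOF. *)
let deflate (lex_many : Lexing.lexbuf -> token list) : Lexing.lexbuf -> token =
  let pending = Queue.create () in
  fun lexbuf ->
    if not (Queue.is_empty pending) then Queue.pop pending
    else
      match lex_many lexbuf with
      | [] -> EOF
      | first :: rest ->
        List.iter (fun t -> Queue.add t pending) rest;
        first

(* Pretend batch lexer: returns canned batches and ignores the lexbuf. *)
let fake_batches = ref [ [INT 1; PLUS]; [INT 2]; [] ]
let fake_lexer (_lexbuf : Lexing.lexbuf) : token list =
  match !fake_batches with
  | [] -> []
  | batch :: rest -> fake_batches := rest; batch

(* Pull tokens one at a time until EOF, the way a parser would. *)
let drain (next : Lexing.lexbuf -> token) (lexbuf : Lexing.lexbuf) : token list =
  let rec go acc =
    match next lexbuf with
    | EOF -> List.rev (EOF :: acc)
    | t -> go (t :: acc)
  in
  go []

let tokens = drain (deflate fake_lexer) (Lexing.from_string "")
```

The tokens come out in the order the batches produced them, which is why a FIFO queue (rather than a stack) is the natural buffer here.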
Answer 2:
As already mentioned by Jeffrey, Menhir specifically offers, as part of its runtime library, a module for using its parsers with any kind of token stream (it essentially just asks for a unit -> token function, with source positions attached): MenhirLib.Convert.
(You could even use this code without using Menhir, with ocamlyacc instead. In practice the conversion is not terribly complicated so you could even re-implement it yourself.)
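For what it's worth, a hand-rolled version of that conversion might look roughly like the sketch below. This is not MenhirLib's code, just an illustration of the idea: take a stream that yields a token together with its start and end positions, and expose it as the Lexing.lexbuf -> token function that ocamlyacc expects, copying the positions into the lexbuf on the way. The token type, lexer_of_stream, stream_of_list and pos are hypothetical names for this example.

```ocaml
(* Hypothetical token type standing in for the generated one. *)
type token = INT of int | PLUS | EOF

(* Adapt a position-carrying token stream to the traditional lexer shape.
   The positions are stored in the lexbuf so the parser can still find them. *)
let lexer_of_stream
    (next : unit -> token * Lexing.position * Lexing.position)
  : Lexing.lexbuf -> token =
  fun lexbuf ->
    let tok, startp, endp = next () in
    lexbuf.Lexing.lex_start_p <- startp;
    lexbuf.Lexing.lex_curr_p <- endp;
    tok

(* Helper for the demo: a position that only sets the character offset. *)
let pos (n : int) : Lexing.position =
  { Lexing.dummy_pos with Lexing.pos_cnum = n }

(* Helper for the demo: a stream backed by a list, yielding EOF when empty. *)
let stream_of_list items =
  let rest = ref items in
  fun () ->
    match !rest with
    | [] -> (EOF, Lexing.dummy_pos, Lexing.dummy_pos)
    | x :: tl -> rest := tl; x

let lexbuf = Lexing.from_string ""
let next =
  lexer_of_stream (stream_of_list [ (INT 7, pos 0, pos 1); (PLUS, pos 2, pos 3) ])

(* Pull one token and look at the end position recorded in the lexbuf. *)
let first = next lexbuf
let first_end = lexbuf.Lexing.lex_curr_p.Lexing.pos_cnum
```

If you don't care about positions at all, the adapter shrinks to fun _lexbuf -> next ().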
Answer 3:
The OCamlYacc interface does look pretty complicated; it seems to require a Lexing.lexbuf. Maybe you could consider using Lexing.from_string to feed a fixed string rather than a fixed sequence of tokens. You could also look at Menhir. I haven't used it, but it gets excellent reviews here whenever anybody mentions OCaml parser generators. It might have a more flexible lexing interface.
Source: https://stackoverflow.com/questions/10899544/feed-ocamlyacc-parser-from-explicit-token-list