I need to parse a small 'mini language' which users can type on my site. I was wondering what the counterparts of lex and yacc, or ANTLR, are in the PHP world.
http://pear.php.net/package/PHP_ParserGenerator
http://wezfurlong.org/blog/2006/nov/parser-and-lexer-generators-for-php
I've ported Jison, a Bison clone written in JavaScript, to PHP. The result is a killer parser, able to handle anything from very simple to very complex lexing/parsing. It is now part of Jison, but there are a few updates in my fork - https://github.com/robertleeplummerjr/jison . The files are here - https://github.com/robertleeplummerjr/jison/tree/master/ports/php
See the readme on that page: you generate a JavaScript parser and a PHP parser at the same time, and the two can do the same or different things. COOL!
I used LIME Parser generator for PHP a couple of years ago, and it was already mature and stable.
The parser generator itself is written in PHP, which doesn't really matter in any technical sense - as we require only that the generated parser be in PHP - but I like this detail nonetheless. It makes me feel less apologetic about writing software in PHP ;-)
EDIT:
I should add:
Where I wrote "used" it would be more accurate to say that I "played with". I haven't written any production code using lime, yet. But I see no reason not to do so.
The "calculator example" provided with lime uses a tokenize() method which is very far from a real substitute for the power of lex. But if you need a real tokenizer it ought to be possible to use lex on the "front end" to feed tokens to lime on the "back end".
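For a small mini language, a middle ground between a hand-rolled tokenize() and a full lex front end is a regex-driven tokenizer. The sketch below is only an illustration under my own assumptions: the function name sketchTokenize(), the token names and the patterns are all hypothetical, and it is neither the tokenizer from the calculator example nor part of lime's API. It assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper, not part of lime: a small regex-driven tokenizer.
// Token names and patterns are assumptions for a calculator-style language.
function sketchTokenize(string $input): array
{
    $patterns = [
        'NUMBER' => '\d+(?:\.\d+)?',
        'IDENT'  => '[A-Za-z_]\w*',
        'OP'     => '[-+*/()]',
        'SPACE'  => '\s+',
    ];

    $tokens = [];
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        $matched = false;
        foreach ($patterns as $type => $pattern) {
            // The A modifier anchors the match at $offset, so nothing is skipped.
            if (preg_match('~' . $pattern . '~A', $input, $m, 0, $offset) === 1) {
                if ($type !== 'SPACE') {
                    // Whitespace is consumed but not passed on to the parser.
                    $tokens[] = [$type, $m[0]];
                }
                $offset += strlen($m[0]);
                $matched = true;
                break;
            }
        }
        if (!$matched) {
            throw new InvalidArgumentException("Unexpected character at offset $offset");
        }
    }

    return $tokens;
}
```

Each token comes back as a [type, text] pair, which would still have to be mapped onto whatever token format the generated parser expects.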
I advise you to write your own parser, as it is quite easy today.
The easiest way to do so would be in my opinion to create one class for every syntax type possible (expression, test, loop, etc.).
Then in each class, code the following methods:
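- a method that tests whether a piece of code is of this type (e.g. a+b is of type 'expression', if(b) is not)
- a run() method that evaluates it (e.g. a+b will return a->run() + b->run(), and a->run() will return a value)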
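A minimal sketch of that idea, assuming PHP 7.4 or later, might look like the following. Every name in it (Node, NumberNode, AddNode, matches(), parseNode()) is hypothetical and chosen for illustration only; just run() comes from the description above. It handles nothing but numbers and '+', so treat it as a toy rather than a complete parser.

```php
<?php
// All names below are hypothetical; only run() is taken from the text above.
interface Node
{
    // Does this piece of code belong to this syntax type?
    public static function matches(string $code): bool;

    // Evaluate the node and return its value.
    public function run(): float;
}

final class NumberNode implements Node
{
    private float $value;

    public function __construct(string $code)
    {
        $this->value = (float) trim($code);
    }

    public static function matches(string $code): bool
    {
        return preg_match('~^\s*\d+(?:\.\d+)?\s*$~', $code) === 1;
    }

    public function run(): float
    {
        return $this->value;
    }
}

final class AddNode implements Node
{
    private Node $left;
    private Node $right;

    public function __construct(string $code)
    {
        // Split at the last '+' so that "1+2+3" groups as (1+2)+3.
        $pos = strrpos($code, '+');
        $this->left = parseNode(substr($code, 0, $pos));
        $this->right = parseNode(substr($code, $pos + 1));
    }

    public static function matches(string $code): bool
    {
        return strpos($code, '+') !== false;
    }

    public function run(): float
    {
        // a+b returns a->run() + b->run()
        return $this->left->run() + $this->right->run();
    }
}

// Ask each syntax class in turn whether the code is of its type.
function parseNode(string $code): Node
{
    foreach ([AddNode::class, NumberNode::class] as $class) {
        if ($class::matches($code)) {
            return new $class($code);
        }
    }
    throw new InvalidArgumentException("Cannot parse: '$code'");
}
```

With this in place, parseNode('1+2+3')->run() evaluates the whole expression, while something like if(b) is rejected because no class claims it. Adding operators with different precedence, tests or loops would mean more classes and a more careful order in parseNode().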