How do I write a non-greedy match in LEX / FLEX?

Posted by ぐ巨炮叔叔 on 2019-12-10 01:21:42

Question


I'm trying to parse a legacy language (which is similar to 'C') using FLEX and BISON. Everything is working nicely except for matching strings.

This rather odd legacy language doesn't support escaping characters in string literals (a backslash is just an ordinary character), so the following are all valid string literals:

"hello"
""
"\"

I'm using the following rule to match string literals:

\".*\"            { yylval.strval = _strdup( yytext ); return LIT_STRING; }

Unfortunately this is a greedy match, so it matches code like the following:

"hello", "world"

as a single string (hello", "world).

The usual non-greedy quantifier .*? doesn't seem to work in FLEX. Any ideas?


Answer 1:


FLEX has no non-greedy quantifiers; it always takes the longest match. Instead, just prohibit a quote between the quotes:

\"[^"]*\"
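
To see why the negated character class works, here is a minimal C sketch that uses POSIX regcomp/regexec (not FLEX itself) to compare the two patterns. The helper name match_len is made up for this illustration, and the patterns carry a leading ^ only to pin the match to the start of the input, roughly the way a scanner matches at the current position.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of FLEX): returns the length of the
 * match of `pattern` at the very start of `input`, or -1 if the
 * pattern does not match there. Uses POSIX extended regexes. */
long match_len(const char *pattern, const char *input)
{
    regex_t re;
    regmatch_t m;
    long len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;                      /* pattern failed to compile */

    /* Accept the match only if it begins at offset 0. */
    if (regexec(&re, input, 1, &m, 0) == 0 && m.rm_so == 0)
        len = (long)m.rm_eo;

    regfree(&re);
    return len;
}
```

With the input "hello", "world", the greedy pattern ^".*" should cover all 16 characters, while ^"[^"]*" stops after the 7 characters of "hello". The literal "\" from the question is matched in full (3 characters), because the backslash is just another non-quote character.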



Answer 2:


Backslash-escaped quotes

The following rule also allows backslash-escaped quotes inside the literal:

\"(\\.|[^\n"\\])*\" {
        fprintf( yyout, "STRING: %s\n", yytext );
    }

It also disallows newlines inside string constants.

E.g.:

>>> "a\"b""c\d"""
STRING: "a\"b"
STRING: "c\d"
STRING: ""

and fails to match the following, which the question lists as a valid literal:

>>> "\"
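
For readers who want to trace what this rule accepts, here is a rough hand-written C equivalent. It is only a sketch: the function name scan_escaped_string is invented for this example, and the code mirrors the regular expression rather than anything FLEX generates.

```c
#include <stddef.h>

/* Hypothetical hand-written counterpart of the rule
 *     \"(\\.|[^\n"\\])*\"
 * Returns the length of the string literal at the start of `s`,
 * or 0 if no complete literal starts there. */
size_t scan_escaped_string(const char *s)
{
    size_t i = 1;

    if (s[0] != '"')
        return 0;                       /* must open with a quote */

    for (;;) {
        char c = s[i];

        if (c == '\0' || c == '\n')
            return 0;                   /* unterminated literal */
        if (c == '"')
            return i + 1;               /* closing quote found */
        if (c == '\\') {
            /* \\. : a backslash consumes the next character,
             * which may be anything except a newline. */
            if (s[i + 1] == '\0' || s[i + 1] == '\n')
                return 0;
            i += 2;
        } else {
            i += 1;                     /* [^\n"\\] : ordinary character */
        }
    }
}
```

Calling it repeatedly on the sample input "a\"b""c\d""" and advancing by the returned length yields lengths 6, 5 and 2, matching the three STRING lines above. On "\" it returns 0, because the backslash swallows the only remaining quote.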

When implementing such C-like features, make sure to look for existing Lex implementations, e.g.: http://www.lysator.liu.se/c/ANSI-C-grammar-l.html



Source: https://stackoverflow.com/questions/4166194/how-do-i-write-a-non-greedy-match-in-lex-flex
