Question
I am trying to locate the root cause of an issue. I have the following line that needs to be parsed:
sample format "string";
Here sample and format need to be tokenized, and whatever is inside the inverted commas needs to be passed to the parser.
There is a catch, however: if I have a Perl-style comment character (#) inside the string, I get an error.
In lexer.l, I have the following:
stringIdentifier [^"]+
<STRING_S>{stringIdentifier} {
    strncpy(yylval.str, yytext, 1023);
    yylval.str[1023] = '\0';
    return IDENTIFIER;
}
<*>"//".* {
}
<*>"#".* {
}
<INITIAL>{s}{a}{m}{p}{l}{e} {
    BEGIN(SAMPLE_S);
    return SAMPLE;
}
<SAMPLE_S>{f}{o}{r}{m}{a}{t} {
    return FORMAT;
}
<SAMPLE_S>"\"" {
    BEGIN(STRING_S);
    return INVERTED_COMMA;
}
<STRING_S>"\"" {
    BEGIN(INITIAL);
    return INVERTED_COMMA;
}
In Parser.y, I have the following rule:
pass : SAMPLE FORMAT INVERTED_COMMA IDENTIFIER INVERTED_COMMA
{
};
When I give it sample format "abc;" it works, but when I add a comment character (#) inside the string, it fails. Could you please help with this?
Answer 1:
The answer lies in the way you have used start conditions. A quick read of the lex/flex manual explains how they operate.
The <*> prefix means that the pattern that follows is active in every start condition. That includes the inside of a string, which is indicated by the STRING_S state. To stop the comment patterns from matching inside the string, you need to exclude the STRING_S state from those rules. You can do this by replacing <*> with a list of all the other applicable states, which in your example is <INITIAL,SAMPLE_S>. The comment rules then become:
<INITIAL,SAMPLE_S>"//".* {
}
<INITIAL,SAMPLE_S>"#".* {
}
And that's it. It now works (I have tested it, by the way).
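For anyone who wants to try the fix in isolation, here is a minimal standalone sketch of the scanner, not the asker's actual file. Several things in it are assumptions: the start-condition declarations (the question does not show them), the rule that discards whitespace and the trailing semicolon, the plain lowercase "sample" and "format" literals in place of the case-insensitive {s}{a}{m}... definitions, and the main() that prints token names instead of returning them to yacc.

```lex
%{
/* Standalone demo: print token names rather than returning them to a parser. */
#include <stdio.h>
%}

%option noyywrap nounput noinput

/* Assumed declarations; the original question does not show them. */
%x SAMPLE_S
%x STRING_S

%%

 /* Comment rules are listed for every state EXCEPT STRING_S. */
<INITIAL,SAMPLE_S>"//".*      { /* C++-style comment: discard */ }
<INITIAL,SAMPLE_S>"#".*       { /* Perl-style comment: discard */ }

<INITIAL>"sample"             { BEGIN(SAMPLE_S); puts("SAMPLE"); }
<SAMPLE_S>"format"            { puts("FORMAT"); }
<SAMPLE_S>\"                  { BEGIN(STRING_S); puts("INVERTED_COMMA"); }

 /* Inside a string only these two rules are active, so # is ordinary text. */
<STRING_S>[^"]+               { printf("IDENTIFIER(%s)\n", yytext); }
<STRING_S>\"                  { BEGIN(INITIAL); puts("INVERTED_COMMA"); }

 /* Assumption: whitespace and the terminating semicolon are simply skipped. */
<INITIAL,SAMPLE_S>[ \t\r\n;]+ { }
<INITIAL,SAMPLE_S>.           { fprintf(stderr, "unexpected character: %s\n", yytext); }

%%

int main(void)
{
    yylex();
    return 0;
}
```

Built with something like flex demo.l && cc lex.yy.c -o demo (the file name is hypothetical), feeding it sample format "#abc"; should print SAMPLE, FORMAT, INVERTED_COMMA, IDENTIFIER(#abc), INVERTED_COMMA, one per line. With the original <*>"#".* rule the result is different because flex always takes the longest match: when # is the first character after the opening quote, the comment rule matches through the end of the line, which is longer than what [^"]+ can match, so the closing quote is swallowed and the parser never sees it.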
Source: https://stackoverflow.com/questions/20532346/lex-and-yacc-issue-with-comments