Lex regex gets some extra characters

情书的邮戳 2021-01-26 08:50

I have the following definition in my lex file:

L   [a-zA-Z_]                                           
A   [a-zA-Z_0-9] 
%%
{L}{A}*                 { yylval.id         


        
1 Answer
  • 2021-01-26 09:31

    "What am I doing wrong?"

    You're assuming that the string pointed to by yytext is constant. It is not.

    The lifetime of the string pointed to by yytext is the lexical action of the associated rule. If that rule ends up returning, yytext will survive until the next time yylex is called. And that's it.

    Bison-generated parsers use one symbol of lookahead. So by the time the parser executes a semantic action, yylex has usually been called again (to read the lookahead token); consequently, you can't rely on a saved yytext pointer even for the last (or only) token in a rule.

    Solution: copy the string. (I use strdup, but for whatever reason some people like to malloc and strcpy. If you do, don't forget about the NUL terminator.) And remember to free() the copy when you're done with it.
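
To make the difference concrete, here is a minimal sketch in plain C that imitates the situation without needing flex or bison. Everything named here (scan_buffer, fake_yylex, save_pointer, save_copy, run_demo) is hypothetical and only stands in for the real scanner; a real flex buffer behaves in a more complicated way, but the aliasing problem is the same. In the actual lex file, and assuming yylval.id is a char *, the corresponding fix would be to write yylval.id = strdup(yytext); in the action.

```c
#define _POSIX_C_SOURCE 200809L /* strdup() is POSIX; it only joined ISO C in C23 */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scanner's internal buffer that yytext points into. */
static char scan_buffer[64];

/* Hypothetical stand-in for yylex(): each call overwrites the shared buffer. */
char *fake_yylex(const char *next_lexeme) {
    strncpy(scan_buffer, next_lexeme, sizeof scan_buffer - 1);
    scan_buffer[sizeof scan_buffer - 1] = '\0';
    return scan_buffer;
}

/* Broken: stores the pointer itself, so the "saved" value changes with the buffer. */
const char *save_pointer(const char *text) {
    return text;
}

/* Fixed: stores a private heap copy; the caller must free() it. */
char *save_copy(const char *text) {
    return strdup(text);
}

/* Scans two tokens and prints what each saved value holds afterwards. */
void run_demo(void) {
    char *text = fake_yylex("foo");
    const char *aliased = save_pointer(text);
    char *copied = save_copy(text);
    if (copied == NULL) {
        return; /* out of memory */
    }

    fake_yylex("barbaz"); /* plays the role of the parser fetching its lookahead */

    printf("aliased: %s\n", aliased); /* now shows the second token */
    printf("copied:  %s\n", copied);  /* still shows the first token */
    assert(strcmp(copied, "foo") == 0);

    free(copied);
}
```

Calling run_demo() from a main function shows the aliased pointer reporting the second token while the copy still holds the first. The copy costs one allocation per token, which is why the free() mentioned above matters.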

    For reference: what the flex manual says.
