flex-lexer

Why are multi-line comments in flex/bison so evasive?

╄→гoц情女王★ submitted on 2019-12-06 18:08:13
Question: I'm trying to parse C-style multi-line comments in my flex (.l) file:
    %s ML_COMMENT
    %%
    ...
    <INITIAL>"/*"       BEGIN(ML_COMMENT);
    <ML_COMMENT>"*/"    BEGIN(INITIAL);
    <ML_COMMENT>[.\n]+  { }
I'm not returning any token and my grammar (.y) doesn't address comments in any way. When I run my executable, I get a parse error:
    $ ./a.out
    /* abc def
    Parse error: parse error
    $ echo "/* foo */" | ./a.out
    Parse error: parse error
(My yyerror function does a printf("Parse error: %s\n"), which is where the first
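A note on the rules quoted above, since the excerpt is cut off before any answer. Inside a flex character class, "." is a literal dot, so [.\n]+ matches only dots and newlines, not "any character". Also, %s declares an inclusive start condition, so rules with no start condition stay active while in ML_COMMENT. Either could contribute to the parse error. Below is a minimal sketch in plain C (not flex) of the two-state logic the rules seem to intend. The name strip_block_comments and the enum names are hypothetical.

```c
#include <stddef.h>

/* Two scanner states, mirroring INITIAL and ML_COMMENT in the question. */
enum scan_state { ST_INITIAL, ST_ML_COMMENT };

/*
 * Copy `in` to `out`, dropping every C-style comment.
 * `out` must have room for strlen(in) + 1 bytes.
 * Returns the number of bytes written, not counting the terminator.
 */
size_t strip_block_comments(const char *in, char *out)
{
    enum scan_state state = ST_INITIAL;
    size_t n = 0;

    while (*in != '\0') {
        if (state == ST_INITIAL && in[0] == '/' && in[1] == '*') {
            state = ST_ML_COMMENT;   /* like BEGIN(ML_COMMENT) */
            in += 2;
        } else if (state == ST_ML_COMMENT && in[0] == '*' && in[1] == '/') {
            state = ST_INITIAL;      /* like BEGIN(INITIAL) */
            in += 2;
        } else if (state == ST_ML_COMMENT) {
            in++;                    /* any one character, newline included */
        } else {
            out[n++] = *in++;        /* ordinary input is passed through */
        }
    }
    out[n] = '\0';
    return n;
}
```

In a flex file the third branch corresponds to a rule that consumes one arbitrary character at a time inside the comment state, which a pattern such as .|\n expresses; declaring the state with %x instead of %s would keep unrelated rules from firing there.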

Making bison/flex parser reentrant with integral YYSTYPE

自作多情 submitted on 2019-12-06 12:18:58
Question: I'm having trouble following the steps to make my bison/flex parser reentrant with a minimum amount of fuss. The problem appears to be in the lexer. Since everything in the parser is re-entrant, I can no longer assign yylval directly. Instead, according to the Flex manual, I have to call this function:
    void yyset_lval ( YYSTYPE * yylvalp , yyscan_t scanner );
But the problem is, YYSTYPE is an integral type. It isn't a dynamically allocated value, and it isn't an lvalue at all, so I can't pass a
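For context, since the excerpt stops before any answer: yyset_lval takes the address of a YYSTYPE object, and a variable of integral type has an address like any other, so an integral YYSTYPE does not by itself rule this out. With %option bison-bridge, the flex manual describes yylval inside actions as a pointer to the parser's semantic value, so an action stores through it. The following is only a mock of that calling convention in plain C, with no flex or bison involved. TOKEN_NUMBER, fake_yylex and the int typedef are stand-ins, not generated names.

```c
#include <stdlib.h>

typedef int YYSTYPE;            /* stand-in for an integral semantic type */
enum { TOKEN_NUMBER = 258 };    /* hypothetical token code */

/* Mock of a bridged yylex: the caller supplies storage for the value. */
int fake_yylex(const char *text, YYSTYPE *yylval)
{
    *yylval = (YYSTYPE)strtol(text, NULL, 10);  /* store through the pointer */
    return TOKEN_NUMBER;
}
```

The caller keeps an ordinary YYSTYPE variable and passes its address, which is the role the pure parser plays when it calls the scanner.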

Integrating Flex/Bison with external program

廉价感情. submitted on 2019-12-06 11:37:11
I'm working on an intelligent agent model that requires, as input, a list of events. The events come from the output of another model and are in a (large) text file. The text file is a list of all events (including unnecessary events that I don't care about), so I've written a scanner using flex that can find the useful bits. The framework for the intelligent agent model is already written in C++. Each event is timestamped and contains a large amount of information about the event. The format of the input file is constant, so I really have no need to check the syntax. I don't know if Bison

Is there a way to change the flex start state from bison?

混江龙づ霸主 submitted on 2019-12-06 09:14:58
Question: I have defined different states in my lexer, which change depending not on a single token but on a sequence of tokens (similar to how template engines work). I could define longer tokens, but I like this approach better.
Answer 1: You can put a function that uses the BEGIN macro in the third section of the .l file, and then call that function from your bison action (or from anywhere else, for that matter). You need to be careful of the fact that bison may read ahead a token before reducing a rule

bison/flex: Peek at the next letter or token

微笑、不失礼 submitted on 2019-12-06 08:48:59
Question: When dealing with strings (which have their own state, like comments), I need to find out whether the next letter is a " or not. If it is, I don't end the string state. (I use <STRING_STATE>. and process the string letter by letter.) So what I do is mark whether the last letter was a " and, if the current one isn't, I exit the state and unput the last letter. This has a weird effect. When I get errors on lines with strings I see the letter (usually a ',' or '

How to get string value of token in flex and bison?

99封情书 submitted on 2019-12-06 05:50:10
Question: I have this token in my .lex file:
    [a-zA-Z0-9]+ { yylval = yytext; return ALPHANUM; }
and this code in my .y file:
    Sentence: "Sphere(" ALPHANUM ")."
    {
        FILE* file = fopen("C:/test.txt", "a+");
        char st1[] = "polySphere -name ";
        strcat(st1, $2);
        strcat(st1, ";");
        fprintf(file,"%s", st1);
        fclose(file);
    }
I get this error when I try to compile:
    warning: passing argument 2 of ‘strcat’ makes pointer from integer without a cast
So $2 is an int. How do I make it a string? For example: "Sphere
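Two separate things appear to go wrong in the excerpt above. First, yylval is an int unless YYSTYPE is declared otherwise in the .y file (for example with %union), which is why $2 is treated as an integer. Second, st1 is sized to hold exactly its initializer, so any strcat into it writes past the end of the array. In addition, flex reuses the yytext buffer on the next match, so a pointer to it should be copied before it is handed to the parser. Below is a minimal sketch in plain C of the string handling only, assuming the token text arrives as a char pointer. copy_token_text and build_sphere_command are hypothetical helper names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate the matched text; flex reuses yytext's buffer on the next match.
 * The caller owns the result and must free() it. Returns NULL on failure. */
char *copy_token_text(const char *text)
{
    size_t len = strlen(text) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, text, len);
    return copy;
}

/* Format "polySphere -name <name>;" into out (capacity cap bytes).
 * Returns 0 on success, -1 if the result would not fit. */
int build_sphere_command(char *out, size_t cap, const char *name)
{
    int n = snprintf(out, cap, "polySphere -name %s;", name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

In the grammar action, the copy made in the lexer would be passed as the name argument and freed once the command has been written.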

Why does Flex say this is an “unrecognized rule”?

坚强是说给别人听的谎言 submitted on 2019-12-06 05:38:50
In the following:
    space ([ \t\f\r])+
    opt_space ([ \t\f\r])*
    cpp ^{opt_space}#{opt_space}
    word [A-Za-z_][A-Za-z_0-9]*
    arg_macro {cpp}define{space}{word}
    /*arg_macro ^{opt_space}#{opt_space}define{space}{word}*/
    %%
    {arg_macro} ;
    %%
I get an error message:
    test.l:9: unrecognized rule
If I uncomment the second version of arg_macro and comment out the first one, the error message goes away. Any ideas why?
If you remove the ^ from the cpp definition and place it in the arg_macro definition, then it's happy.
    space ([ \t\f\r])+
    opt_space ([ \t\f\r])*
    cpp {opt_space}#{opt_space}
    word [A-Za-z_][A-Za-z_0-9]*

Where to free memory in Bison/Flex?

跟風遠走 submitted on 2019-12-06 04:47:17
Question: I've been using Bison & Flex for about a month, so I'm sorry if I'm missing something obvious (but I don't think I am). I have a problem with freeing memory in Flex/Bison. Here is what my code looks like:
parser.l
    {DATE} {
        yylval.str = strdup(yytext);
        pair<string,string> newpair = make_pair("DATE", yytext);
        myvector.push_back(newpair);
        return TOKEN_DATE;
    }
This is one example from my .l file. I copy the value of yytext into yylval.str. Then I create a new pair with that content (key

Why is yylval null?

拈花ヽ惹草 submitted on 2019-12-06 01:00:11
I'm trying to write my first parser with Flex & Bison. When parsing numbers, I'm trying to save their values into the yylval structure. The problem is that yylval is null when the lexer reaches a number, which causes a segmentation fault. (Related point of confusion: why is it that in most Flex examples (e.g. here), yylval is a structure rather than a pointer to a structure? I couldn't get yylval to be recognized in test.l without %option bison-bridge, and that option made yylval a pointer. Also, I tried initializing yylval in main of test.y, but yylval = malloc(...) gives a type mismatch-- as

Flex rule with a period “.” is not compiling

人走茶凉 submitted on 2019-12-05 17:56:57
I am having trouble compiling this regular expression with flex:
    "on"[ \t\r]*[.\n]{0,300}"."[ \t\r]*[.\n]{0,300}"from" {counter++;}
I had 100 rules in the rules section of my flex specification file. I tried to compile it with:
    flex -Ce -Ca rule.flex
I waited for 10 hours and it still hadn't completed, so I killed it. I then narrowed the problem down to this rule. If I remove this rule from the 100 rules, it takes 21 seconds to compile to C code. If I replace the period with some other character, it compiles successfully, e.g.
    "on"[ \t\r]*[.\n]{0,300}"A"[ \t\r]*[.\n]{0,300}"from"