Detecting and skipping line comments with Flex

Submitted by 牧云@^-^@ on 2020-01-10 04:25:47

Question


How can I detect single-line comments that start with "//" in Flex and skip the rest of the line?

Also:

For "/*" comments I'm using the code below. Will it work?

   "/*"         { comment(); }
  %%

    comment()
    {
        char c, c1;
    loop:
        while ((c = input()) != '*' && c != 0)
            putchar(c);

        if ((c1 = input()) != '/' && c != 0)
        {
            unput(c1);
            goto loop;
        }

        if (c != 0)
            putchar(c1);
    }

Answer 1:


Why don't you just use regular expressions to recognize the comments? The whole point of lex/flex is to save you from having to write lexical scanners by hand. The code you present should work (provided the "/*" pattern starts in the first column of its line, because flex treats an indented line in the rules section as C code rather than as a rule), but it's a bit ugly, and it is not obvious that it will work.

Your question says that you want to skip comments, but the code you provide uses putchar() to print the comment, except for the /* at the beginning. Which is it that you want to do? If you want to echo the comments, you can use an ECHO action instead of doing nothing.

Here are the regular expressions:

Single line comment

This one is easy because in lex/flex, . won't match a newline. So the following will match from // to the end of the line, and then do nothing.

"//".*                                    { /* DO NOTHING */ }
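
To make the matching behaviour concrete, here is a minimal C sketch of what the pattern above consumes. The helper name line_comment_length is hypothetical and not part of flex; it only mirrors the rule that . stops at a newline, so the \n itself is left for other rules to handle.

```c
#include <stddef.h>

/* Hypothetical helper (not part of flex): returns how many characters the
 * pattern "//".* would match at the start of s, or 0 if s does not start
 * with "//". The terminating '\n' is never counted, mirroring the fact
 * that '.' does not match a newline. */
size_t line_comment_length(const char *s)
{
    if (s[0] != '/' || s[1] != '/')
        return 0;

    size_t n = 2;
    while (s[n] != '\0' && s[n] != '\n')
        n++;
    return n;
}
```

For the input "// hi\nx = 1;" the helper reports 5 characters, that is "// hi" without the newline.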

Multiline comment

This is a bit trickier, and the fact that * is a regular expression character as well as a key part of the comment marker makes the following regex a bit hard to read. I use [*] as a pattern which recognizes the character *; in flex/lex, you can use "*" instead. Use whichever you find more readable. Essentially, the regular expression matches sequences of characters ending with a (string of) * until it finds one where the next character is a /. In other words, it has the same logic as your C code.

[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/]       { /* DO NOTHING */ }

The above requires the terminating */; an unterminated comment will force the lexer to back up to the beginning of the comment and accept some other token, usually a / division operator. That's likely not what you want, but it's not easy to recover from an unterminated comment since there's no really good way to know where the comment should have ended. Consequently, I recommend adding an error rule:

[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/]       { /* DO NOTHING */ }
[/][*]                                    { fatal_error("Unterminated comment"); }
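
The logic of the regular expression can also be sketched as a small hand-written loop in C. This is only an illustration, assuming the whole input is available as a string; block_comment_length is a hypothetical name, not something flex provides. It returns 0 where the first rule would fail to match, which is exactly the case the error rule is meant to catch.

```c
#include <stddef.h>

/* Hypothetical helper: length of the block comment at the start of s,
 * or 0 if s does not open a block comment or the comment is never closed. */
size_t block_comment_length(const char *s)
{
    if (s[0] != '/' || s[1] != '*')
        return 0;

    int after_star = 0;   /* 1 while the previous character was '*' */
    for (size_t i = 2; s[i] != '\0'; i++) {
        if (after_star && s[i] == '/')
            return i + 1;             /* [*]+ followed by [/]: comment ends */
        after_star = (s[i] == '*');   /* a star sets the flag, anything else clears it */
    }
    return 0;                         /* unterminated comment */
}
```

Note that "/*/" is not a complete comment under either the regular expression or this sketch: the / directly after the opener is ordinary comment text because no star precedes it inside the body.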



Answer 2:


For //, you can read until you find the end of the line (\n) or EOF, in case the comment is on the last line of the file. For example:

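static void
skip_single_line_comment(void)
{
  int c;

  /* Read until we find \n or the end of input. Depending on the flex
     version, input() reports end of input as EOF or as 0. */
  while((c = input()) != '\n' && c != EOF && c != 0)
    ;

  /* Nothing is pushed back: EOF is not a character, so unput(EOF)
     would insert a bogus byte into the input. */
}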

As for multi-line comments /* */, you can read until you see a * and then peek at the next character: if it is /, the comment has ended; if not, keep skipping. Reaching EOF before the closing */ means the comment was never closed:

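static void
skip_multiple_line_comment(void)
{
  int c;

  for(;;)
  {
    switch(input())
    {
      /* The comment must end before the input does. Depending on the
         flex version, input() reports end of input as EOF or as 0. */
      case EOF:
      case 0:
        fprintf(stderr, "Error: unclosed comment, expected */\n");
        exit(EXIT_FAILURE);
      /* Is it the end of the comment? */
      case '*':
        if((c = input()) == '/')
          return;
        /* Not the end: push the character back so the next iteration
           sees it (it may be another '*'). End of input is not a
           character and must not be pushed back. */
        if(c != EOF && c != 0)
          unput(c);
        break;
      default:
        /* skip this character */
        break;
    }
  }
}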
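
As a variation, the same scan can be written without pushing a character back, by swallowing a whole run of stars before testing for /. The sketch below is self-contained so it can be tried outside flex: next_char and src are hypothetical stand-ins for flex's input() and its input buffer, and the function returns -1 instead of exiting.

```c
#include <stdio.h>

/* Hypothetical stand-in for the scanner's input: the text still to be read. */
const char *src = "";

/* Hypothetical stand-in for flex's input(): returns the next character
 * of src, or EOF once the string is exhausted. */
int next_char(void)
{
    return *src ? (unsigned char)*src++ : EOF;
}

/* Skips the body of a block comment whose opening delimiter has already
 * been consumed. Returns 0 on success, -1 if the input ends first. */
int skip_block_comment(void)
{
    int c = next_char();

    while (c != EOF) {
        if (c != '*') {
            c = next_char();
            continue;
        }
        /* Saw a star: swallow the rest of the run of stars. */
        while ((c = next_char()) == '*')
            ;
        if (c == '/')
            return 0;
        /* Otherwise c is ordinary comment text (or EOF); keep scanning. */
    }
    return -1;
}
```

After a successful call, src points at the first character following the comment, which is where a scanner would resume matching.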

Complete file:

%{
#include <stdio.h>
#include <stdlib.h>

static void skip_single_line_comment(void);
static void skip_multiple_line_comment(void);

%}

%option noyywrap

%%
"//"              { puts("short comment was skipped ");
                    skip_single_line_comment();}

"/*"              { puts("long comment begins ");
                    skip_multiple_line_comment();
                    puts("long comment ends");}

" "               { /* empty */ }
[\r\n\t]          { /* empty */ }
.                 { fprintf(stderr, "Tokenizing error: '%c'\n", *yytext);
                    yyterminate(); }
%%

static void
skip_single_line_comment(void)
{
  int c;

  /* Read until we find \n or the end of input. Depending on the flex
     version, input() reports end of input as EOF or as 0. */
  while((c = input()) != '\n' && c != EOF && c != 0)
    ;

  /* Nothing is pushed back: EOF is not a character, so unput(EOF)
     would insert a bogus byte into the input. */
}

static void
skip_multiple_line_comment(void)
{
  int c;

  for(;;)
  {
    switch(input())
    {
      /* The comment must end before the input does. Depending on the
         flex version, input() reports end of input as EOF or as 0. */
      case EOF:
      case 0:
        fprintf(stderr, "Error: unclosed comment, expected */\n");
        exit(EXIT_FAILURE);
      /* Is it the end of the comment? */
      case '*':
        if((c = input()) == '/')
          return;
        /* Not the end: push the character back so the next iteration
           sees it (it may be another '*'). End of input is not a
           character and must not be pushed back. */
        if(c != EOF && c != 0)
          unput(c);
        break;
      default:
        /* skip this character */
        break;
    }
  }
}

int main(void)
{
  yylex();
  return 0;
}



Answer 3:


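To detect single-line comments:

^"//".*    printf("This is a comment line\n");

This says that any line which starts with // is treated as a comment line. The trailing .* consumes the rest of the line, so the comment text is not passed on to the other rules.

To detect multi-line comments:

^"/*"([^*]|[*]+[^*/])*[*]+"/" printf("This is a Multiline Comment\n");

Explanation:

^"/*" says the comment must begin with /* at the start of a line.

[^*] matches any character except *, including \n.

[*]+[^*/] matches a run of stars followed by a character that is neither * nor /, so the comment has not ended yet.

([^*]|[*]+[^*/])* applies the "or" operator to the two cases above and repeats them to cover the body of the comment.

[*]+"/" matches the closing run of one or more stars followed by /, that is, the */ that ends the comment.

Note that | has the lowest precedence in lex patterns, so the alternatives must be grouped with parentheses; without them the pattern splits into two unrelated alternatives and comments containing a * are not matched as a whole.

Below is the complete code of the lex file:

%{
#include <stdio.h>
%}
%%
^"//".*    printf("This is a comment line\n");
^"/*"([^*]|[*]+[^*/])*[*]+"/" printf("This is a Multiline Comment\n");
.|\n {}
%%
int yywrap(void)
{
    return 1;
}
int main(void)
{
    yylex();
    return 0;
}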


Source: https://stackoverflow.com/questions/25395251/detecting-and-skipping-line-comments-with-flex
