REPL for interpreter using Flex/Bison

Submitted by 断了今生、忘了曾经 on 2019-12-30 03:11:46

Question


I've written an interpreter for a C-like language, using Flex and Bison for the scanner/parser. It's working fine when executing full program files.

Now I'm trying to implement a REPL in the interpreter for interactive use. I want it to work like the command-line interpreters in Ruby or ML:

  1. Show a prompt
  2. Accept one or more statements on the line
  3. If the expression is incomplete
    1. display a continuation prompt
    2. allow the user to continue entering lines
  4. When the line ends with a complete expression
    1. echo the result of evaluating the last expression
    2. show the main prompt

My grammar starts with a top_level production, which represents a single statement in the language. The lexer is configured for interactive mode on stdin. I am using the same scanner and grammar in both full-file and REPL modes, because there's no semantic difference between the two interfaces.

My main evaluation loop is structured like this.

while (!interpreter.done) {
    if (interpreter.repl)
        printf("%s", prompt); // don't pass the prompt itself as the format string
    int status = yyparse(interpreter);
    if (status) {
        if (interpreter.error)
            report_error(interpreter);
    }
    else {
        if (interpreter.repl)
            puts(interpreter.result);
    }
}            

This works fine except for the prompt and echo logic. If the user enters multiple statements on a line, this loop prints out superfluous prompts and expressions. And if the expression continues on multiple lines, this code doesn't print out continuation prompts. These problems occur because the granularity of the prompt/echo logic is a top_level statement in the grammar, but the line-reading logic is deep in the lexer.

What's the best way to restructure the evaluation loop to handle the REPL prompting and echoing? That is:

  • how can I display one prompt per line
  • how can I display the continuation prompt at the right time
  • how can I tell when a complete expression is the last one on a line

(I'd rather not change the scanner language to pass newline tokens, since that will severely alter the grammar. Modifying YY_INPUT and adding a few actions to the Bison grammar would be fine. Also, I'm using the stock Flex 2.5.35 and Bison 2.3 that ship with Xcode.)


Answer 1:


After looking at how languages like Python and SML/NJ handle their REPLs, I got a nice one working in my interpreter. Instead of having the prompt/echo logic in the outermost parser driver loop, I put it in the innermost lexer input routine. Actions in the parser and lexer set flags that control the prompting done by the input routine.

I'm using a reentrant scanner, so yyextra contains the state passed between the layers of the interpreter. It looks roughly like this:

typedef struct Interpreter {
    char* ps1; // prompt to start statement
    char* ps2; // prompt to continue statement
    char* echo; // result of last statement to display
    BOOL eof; // set by the EOF action in the parser
    char* error; // set by the error action in the parser
    BOOL completeLine; // managed by yyread
    BOOL atStart; // true before scanner sees printable chars on line
    // ... and various other fields needed by the interpreter
} Interpreter;

The lexer input routine:

size_t yyread(FILE* file, char* buf, size_t max, Interpreter* interpreter)
{
    // Interactive input is signaled by yyin==NULL.
    if (file == NULL) {
        if (interpreter->completeLine) {
            if (interpreter->atStart && interpreter->echo != NULL) {
                fputs(interpreter->echo, stdout);
                fputs("\n", stdout);
                free(interpreter->echo);
                interpreter->echo = NULL;
            }
            fputs(interpreter->atStart ? interpreter->ps1 : interpreter->ps2, stdout);
            fflush(stdout);
        }

        char ibuf[max+1]; // fgets needs an extra byte for \0
        size_t len = 0;
        if (fgets(ibuf, max+1, stdin)) {
            len = strlen(ibuf);
            memcpy(buf, ibuf, len);
            // Show the prompt next time if we've read a full line.
            interpreter->completeLine = (len > 0 && ibuf[len-1] == '\n');
        }
        else if (ferror(stdin)) {
            // TODO: propagate error value
        }
        return len;
    }
    else { // not interactive
        size_t len = fread(buf, 1, max, file);
        if (len == 0 && ferror(file)) {
            // TODO: propagate error value
        }
        return len;
    }
}

The top level interpreter loop becomes:

while (!interpreter->eof) {
    interpreter->atStart = YES;
    int status = yyparse(interpreter);
    if (status) {
        if (interpreter->error)
            report_error(interpreter);
    }
    else {
        exec_statement(interpreter);
        if (interactive)
            interpreter->echo = result_string(interpreter);
    }
}

The Flex file gets these new definitions:

%option extra-type="Interpreter*"

#define YY_INPUT(buf, result, max_size) result = yyread(yyin, buf, max_size, yyextra)

#define YY_USER_ACTION  if (!isspace((unsigned char)*yytext)) { yyextra->atStart = NO; }

The YY_USER_ACTION handles the tricky interplay between tokens in the language grammar and lines of input. My language is like C and ML in that a special character (';') is required to end a statement. In the input stream, that character can either be followed by a newline character to signal end-of-line, or it can be followed by characters that are part of a new statement. The input routine needs to show the main prompt if the only characters scanned since the last end-of-statement are newlines or other whitespace; otherwise it should show the continuation prompt.
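The prompt decision described above can be isolated from Flex and Bison as a small state machine. The following is a minimal sketch, not the answer's actual code: `PromptState`, `begin_statement`, `note_token`, and `choose_prompt` are hypothetical names standing in for the `atStart` and `completeLine` flags, the driver loop's reset, `YY_USER_ACTION`, and the prompting branch of `yyread`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the Interpreter flags. */
typedef struct PromptState {
    bool at_start;      /* no printable character scanned since the last statement ended */
    bool complete_line; /* the previous read ended with '\n' */
} PromptState;

/* Stand-in for the driver loop: every new statement begins "at start". */
static void begin_statement(PromptState *s)
{
    assert(s != NULL);
    s->at_start = true;
}

/* Stand-in for YY_USER_ACTION: a token whose first character is not
   whitespace means the user has begun typing a statement. */
static void note_token(PromptState *s, const char *token_text)
{
    assert(s != NULL && token_text != NULL);
    if (!isspace((unsigned char)token_text[0]))
        s->at_start = false;
}

/* Stand-in for the prompting branch of the input routine.
   Returns NULL when no prompt should be shown (previous read was a partial line),
   ps1 when a fresh statement is expected, and ps2 when one is in progress. */
static const char *choose_prompt(const PromptState *s,
                                 const char *ps1, const char *ps2)
{
    assert(s != NULL);
    if (!s->complete_line)
        return NULL;
    return s->at_start ? ps1 : ps2;
}
```

In this sketch, whitespace-only tokens leave `at_start` untouched, so a line that ends right after a `;` would still get the main prompt, while a line that ends in the middle of a statement would get the continuation prompt.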




Answer 2:


I too am working on such an interpreter, but I haven't gotten to the point of making a REPL yet, so my discussion might be somewhat vague.

Is it acceptable if, given a sequence of statements on a single line, only the result of the last expression is printed? If so, you can refactor your top-level grammar rule like so:

top_level : top_level statement | statement ;

The output of your top_level then could be a linked list of statements, and interpreter.result would be the evaluation of the tail of this list.
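One way to picture this suggestion is a statement list with a tail pointer that the left-recursive rule appends to. This is only a sketch under assumptions: `Stmt`, `StmtList`, and the `stmt_list_*` helpers are hypothetical names, and an `int` value stands in for whatever evaluating a statement produces in the real interpreter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical statement node; `value` stands in for the statement's result. */
typedef struct Stmt {
    int value;
    struct Stmt *next;
} Stmt;

/* List with a tail pointer so appending is O(1). */
typedef struct StmtList {
    Stmt *head;
    Stmt *tail;
} StmtList;

/* What an action for `top_level : top_level statement` might call.
   Returns false if allocation fails. */
static bool stmt_list_append(StmtList *list, int value)
{
    assert(list != NULL);
    Stmt *node = malloc(sizeof *node);
    if (node == NULL)
        return false;
    node->value = value;
    node->next = NULL;
    if (list->tail != NULL)
        list->tail->next = node;
    else
        list->head = node;
    list->tail = node;
    return true;
}

/* Walks the statements in order and leaves the last one's value in *result.
   Returns false (and leaves *result untouched) when the list is empty. */
static bool stmt_list_run(const StmtList *list, int *result)
{
    assert(list != NULL && result != NULL);
    bool ran_any = false;
    for (const Stmt *s = list->head; s != NULL; s = s->next) {
        *result = s->value; /* stand-in for evaluating the statement */
        ran_any = true;
    }
    return ran_any;
}

/* Releases every node and resets the list to empty. */
static void stmt_list_free(StmtList *list)
{
    assert(list != NULL);
    Stmt *s = list->head;
    while (s != NULL) {
        Stmt *next = s->next;
        free(s);
        s = next;
    }
    list->head = NULL;
    list->tail = NULL;
}
```

With this shape, the REPL would echo only the value left by the final statement, which matches the "result of the last expression" behavior asked about above.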



Source: https://stackoverflow.com/questions/6636808/repl-for-interpreter-using-flex-bison
