JavaCC quote with escape character

Submitted by 為{幸葍}努か on 2019-12-13 14:55:34

Question


What is the usual way of tokenizing quoted strings that can contain an escape character? Here are some examples:

1) "this is good"
2) "this is\"good\""
3) "this \is good"
4) "this is bad\"
5) "this is \\"bad"
6) "this is bad
7)  this is bad"
8)  this is bad

Below is a sample parser that doesn't work quite right. It gives the expected result for every example except 4 and 5, which parse successfully even though they should fail.

options
{
  LOOKAHEAD = 3;
  CHOICE_AMBIGUITY_CHECK = 2;
  OTHER_AMBIGUITY_CHECK = 1;
  STATIC = false;
  DEBUG_PARSER = false;
  DEBUG_LOOKAHEAD = false;
  DEBUG_TOKEN_MANAGER = true;
  ERROR_REPORTING = true;
  JAVA_UNICODE_ESCAPE = false;
  UNICODE_INPUT = false;
  IGNORE_CASE = false;
  USER_TOKEN_MANAGER = false;
  USER_CHAR_STREAM = false;
  BUILD_PARSER = true;
  BUILD_TOKEN_MANAGER = true;
  SANITY_CHECK = true;
  FORCE_LA_CHECK = true;
}

PARSER_BEGIN(MyParser)
import java.io.ByteArrayInputStream;
import java.io.UnsupportedEncodingException;
public class MyParser {
    public static void main(String[] args) throws UnsupportedEncodingException, ParseException{
        //note that this conversion to an input stream is only good for small strings
        MyParser parser = new MyParser(new ByteArrayInputStream(args[0].getBytes("UTF-8")));
        parser.enable_tracing();
        parser.myProduction();
        System.out.println("Must have worked!");
    }
}
PARSER_END(MyParser)

TOKEN:
{
<QUOTED: 
    "\"" 
    (
        "\\" ~[]    //any escaped character
        |           //or
        ~["\""]      //any non-quote character
    )* 
    "\""
>
}


void myProduction() :
{}
{
    <QUOTED>
    <EOF>
}

You can run MyParser from the command line, passing the input to parse as the first argument. It prints "Must have worked!" if the input parses, or throws an error if it doesn't.

How do I change this parser to correctly fail on examples 4 and 5?


Answer 1:


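The problem is that ~["\""] also matches a backslash, so a backslash can be consumed as an ordinary character and the quote that follows it then closes the string. To fix your regular expression, exclude the backslash from the second alternative as well: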

TOKEN: {
<QUOTED: 
    "\"" 
    (
         "\\" ~[]     //any escaped character
    |                 //or
        ~["\"","\\"]  //any character except quote or backslash
    )* 
    "\"" > 
}
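
As a rough cross-check outside JavaCC, the two token definitions can be approximated with java.util.regex patterns and run against the eight examples from the question. This is only a sketch: the class name QuotedTokenCheck and the helper accepts are made up for illustration, and it assumes that a full-string regex match corresponds to the parser accepting <QUOTED> <EOF>.

```java
import java.util.regex.Pattern;

final class QuotedTokenCheck {
    // Approximation of the original token: the second alternative [^"]
    // also accepts a backslash, so "\" can be read as a plain character.
    static final Pattern ORIGINAL =
        Pattern.compile("\"(?:\\\\[\\s\\S]|[^\"])*\"");

    // Approximation of the corrected token: the second alternative excludes
    // the backslash, so a backslash can only begin an escape sequence.
    static final Pattern FIXED =
        Pattern.compile("\"(?:\\\\[\\s\\S]|[^\"\\\\])*\"");

    // True if the whole input is a single quoted string under the pattern.
    static boolean accepts(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The eight examples from the question, written as Java literals.
        String[] examples = {
            "\"this is good\"",
            "\"this is\\\"good\\\"\"",
            "\"this \\is good\"",
            "\"this is bad\\\"",
            "\"this is \\\\\"bad\"",
            "\"this is bad",
            "this is bad\"",
            "this is bad"
        };
        for (int i = 0; i < examples.length; i++) {
            System.out.println((i + 1) + ") " + examples[i]
                + "  original=" + accepts(ORIGINAL, examples[i])
                + "  fixed=" + accepts(FIXED, examples[i]));
        }
    }
}
```

Under these assumptions, the original pattern accepts examples 1 through 5, while the corrected pattern accepts only examples 1 through 3.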


Source: https://stackoverflow.com/questions/24156948/javacc-quote-with-escape-character
