java StreamTokenizer

Submitted by 让人想犯罪 __ on 2019-12-13 03:23:22

Question


I'm using quoteChar('"') to handle quoted strings. The usual escape sequences such as "\n" and "\t" are recognized and converted to single characters as the string is parsed. Is there any way to get the string exactly as it was written? That is, if I have the string:

Hello\tworld

I want to get

Hello\tworld

and not:

Hello world

Thanks.
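
For reference, the behavior described above can be reproduced with a few lines. This is only a minimal sketch: the class name EscapeDemo and the helper firstQuotedString are made up for illustration, and the input is read from a StringReader rather than a file.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class EscapeDemo {
    // Hypothetical helper: returns sval of the first token,
    // or null if that token is not a quoted string.
    public static String firstQuotedString(String source) throws IOException {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(source));
        tokenizer.quoteChar('"');
        return tokenizer.nextToken() == '"' ? tokenizer.sval : null;
    }

    public static void main(String[] args) throws IOException {
        // The text being tokenized is: "Hello\tworld" (a backslash followed by t).
        String sval = firstQuotedString("\"Hello\\tworld\"");
        System.out.println(sval.contains("\t")); // true: the escape became a real tab
        System.out.println(sval.contains("\\")); // false: the backslash is gone
    }
}
```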


Answer 1:


Looking at the StreamTokenizer source, it looks like the escape behavior for strings is hard-coded. I can only think of a few ways to get around it:

  1. Re-escape the string once you get it back. The problem here is that the result won't exactly match what was in the file: a tab will be converted back to \t, but a space that was written as the octal escape \040 will stay a plain space.
  2. Insert your own Reader in between the source Reader and the StreamTokenizer. Store all the chars read for the last token in a buffer. Trim whitespace from the start of that buffer to get the "raw" token.
  3. If your tokenizing rules are simple enough, implement your own tokenizer.
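
Option 1 could look roughly like the sketch below. The class and method names (ReEscape, reEscape) are hypothetical, and the set of escapes handled is an assumption covering only the common ones. As noted in the list, the round trip is lossy: a character that was originally written as an octal escape comes back as the plain character.

```java
public class ReEscape {
    // Hypothetical helper: turns common control characters, backslashes and
    // double quotes back into their two-character escape sequences.
    public static String reEscape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\t': out.append("\\t"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A parsed sval containing a real tab is written back with a \t escape.
        System.out.println(reEscape("Hello\tworld")); // prints Hello\tworld
        // A space that came from \040 cannot be told apart from an ordinary space.
        System.out.println(reEscape("a b"));          // prints a b
    }
}
```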



Answer 2:


This is what worked for me:

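import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class MyReader extends BufferedReader {
    // Choose any replacement you like, as long as it never occurs in your text.
    // Not private, so the calling code can swap it back afterwards.
    static final char TAB_REPLACEMENT = '\u0000';

    public MyReader(Reader in) {
        super(in);
    }

    // Replace every tab character read from the stream with the placeholder.
    @Override
    public int read() throws IOException {
        int charVal = super.read();
        if (charVal == '\t') {
            return TAB_REPLACEMENT;
        }
        return charVal;
    }
}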

Then create the tokenizer with:

StreamTokenizer myTokenizer = new StreamTokenizer(new MyReader(new FileReader(file)));

and recover the original sval (TAB_REPLACEMENT must be accessible at the call site) with:

myTokenizer.sval.replace(TAB_REPLACEMENT, '\t')
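
The same placeholder trick can be pointed at the backslash itself, so that the tokenizer never sees an escape sequence in the first place. The following is only a sketch of that adaptation, not something taken from the answer: BackslashMaskingReader, rawQuotedString and the choice of '\uE000' as placeholder are all assumptions, and the placeholder must not occur in the real input.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class RawStringDemo {
    // Assumption: this private-use character never occurs in the input.
    static final char BACKSLASH_PLACEHOLDER = '\uE000';

    // Hypothetical helper: hides backslashes from whatever reads through it.
    static class BackslashMaskingReader extends FilterReader {
        BackslashMaskingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = super.read();
            return c == '\\' ? BACKSLASH_PLACEHOLDER : c;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            for (int i = off; i < off + n; i++) {
                if (buf[i] == '\\') {
                    buf[i] = BACKSLASH_PLACEHOLDER;
                }
            }
            return n;
        }
    }

    // Returns the first quoted string with its escape sequences left as written,
    // or null if the first token is not a quoted string.
    public static String rawQuotedString(String source) throws IOException {
        StreamTokenizer tokenizer =
                new StreamTokenizer(new BackslashMaskingReader(new StringReader(source)));
        tokenizer.quoteChar('"');
        if (tokenizer.nextToken() != '"') {
            return null;
        }
        // Put the backslashes back now that tokenizing is done.
        return tokenizer.sval.replace(BACKSLASH_PLACEHOLDER, '\\');
    }

    public static void main(String[] args) throws IOException {
        // The text being tokenized is: "Hello\tworld" (a backslash followed by t).
        System.out.println(rawQuotedString("\"Hello\\tworld\"")); // prints Hello\tworld
    }
}
```

One caveat with this variant: because the tokenizer no longer sees the backslash, an escaped quote (\") inside a string would end the string early.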


Source: https://stackoverflow.com/questions/8873672/java-streamtokenizer
