How do you keep scanner.next() from including newline?

Posted by 一世执手 on 2019-12-08 04:43:16

Question


I am trying to read the words in a text file using scanner.next() with the delimiter set to " ", but the scanner includes the newline/carriage return in the token.

I have searched the internet for a good example of this problem and have not found one, so I am posting it here. I can't find a similar problem posted here on SO. I also looked over the documentation on Scanner and Pattern (http://docs.oracle.com/javase/1.5.0/docs/api/java/util/regex/Pattern.html), but I still cannot find a way to solve this.

Text file:

This is a test
to see if1 this, is working
ok!

Code:

int i = 0;
String string;
try (Scanner scanner = new Scanner(new File(filename))) {
    scanner.useDelimiter(" ");
    while (scanner.hasNext()) {
        string = scanner.next();
        System.out.println(i++ + ": " + string);
    }
} catch (IOException io_error) {
    System.out.println(io_error);
}

Output:

0: This
1: is
2: a
3: test
to
4: see
5: if1
6: this,
7: is
8: working
ok!

As you can see, #3 and #8 have two words separated by a newline. (I know I can separate these into two separate strings.)


Answer 1:


The documentation of Scanner says:

The default whitespace delimiter used by a scanner is as recognized by Character.isWhitespace

And the linked documentation of Character.isWhitespace says:

Determines if the specified character is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:

  • It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
  • It is '\t', U+0009 HORIZONTAL TABULATION.
  • It is '\n', U+000A LINE FEED.
  • It is '\u000B', U+000B VERTICAL TABULATION.
  • It is '\f', U+000C FORM FEED.
  • It is '\r', U+000D CARRIAGE RETURN.
  • It is '\u001C', U+001C FILE SEPARATOR.
  • It is '\u001D', U+001D GROUP SEPARATOR.
  • It is '\u001E', U+001E RECORD SEPARATOR.
  • It is '\u001F', U+001F UNIT SEPARATOR.

So, just don't set a specific delimiter. Keep the default, and newlines will be treated as delimiters just like spaces, which means tokens won't include newline characters.
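As a minimal sketch of that advice, the example below reads from an in-memory string instead of the asker's file so that it is self-contained. The class name DefaultDelimiterDemo and the helper tokens are made up for illustration, and the sample text assumes Windows-style "\r\n" line endings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DefaultDelimiterDemo {
    // Hypothetical helper: collects every token the scanner yields
    // when the default (whitespace) delimiter is left in place.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            // No useDelimiter call: spaces, tabs, '\n' and '\r' all separate tokens.
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of the text file from the question.
        String text = "This is a test\r\nto see if1 this, is working\r\nok!";
        List<String> words = tokens(text);
        for (int i = 0; i < words.size(); i++) {
            System.out.println(i + ": " + words.get(i));
        }
    }
}
```

With the default delimiter, "test" and "to" come back as separate tokens, as do "working" and "ok!". The same should hold when the Scanner is built from new File(filename) as in the question.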




Answer 2:


After string = scanner.next(); strip the newline from the token, that is:

string = string.replace("\n", "");

Then print out the string variable. That should do the trick.
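One caveat with this approach: removing "\n" joins the two words into one, and on files with Windows line endings the '\r' stays in the token. The sketch below illustrates this and shows one possible alternative, splitting the token on whitespace. The class name NewlineReplaceDemo and the helper splitOnWhitespace are hypothetical, and the sample token assumes "\r\n" line endings.

```java
import java.util.Arrays;
import java.util.List;

public class NewlineReplaceDemo {
    // Hypothetical helper: splits a token that may still contain
    // line breaks into its separate words.
    static List<String> splitOnWhitespace(String token) {
        return Arrays.asList(token.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // What next() can return for the end of one line and the start
        // of the next when the delimiter is " " (assumed "\r\n" endings).
        String token = "test\r\nto";

        // Removing only "\n" glues the words together and keeps the '\r'.
        String replaced = token.replace("\n", "");
        System.out.println(replaced.equals("test\rto")); // true

        // Splitting on any run of whitespace recovers both words.
        System.out.println(splitOnWhitespace(token)); // [test, to]
    }
}
```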



Source: https://stackoverflow.com/questions/36669643/how-do-you-keep-scanner-next-from-including-newline
