Java scanner not going through entire file

I'm writing a program in Java and one of the things that I need to do is to create a set of every valid location for a shortest path problem. The locations are defined in a

8 answers

忘掉有多难

2020-11-30 11:43

I was having the same problem. The scanner would not read to the end of a file, actually stopping right in the middle of a word. I thought it was a problem with some limit set on the scanner, but I took note of the comment from rfeak about character encoding.

I re-saved the .txt file I was reading as UTF-8, and that solved the problem. It turns out that Notepad had defaulted to ANSI.

太阳男子

2020-11-30 11:51

There's a problem with Scanner reading your file but I'm not sure what it is. It mistakenly believes that it's reached the end of file when it has not, possibly due to some funky String encoding. Try using a BufferedReader object that wraps a FileReader object instead.

e.g.,

   // Requires java.io.BufferedReader, java.io.FileReader, java.io.IOException,
   // java.util.Set and java.util.TreeSet.
   private static Set<String> posible2(String posLoc) {
      Set<String> result = new TreeSet<String>();
      // try-with-resources closes the reader even if readLine() throws
      try (BufferedReader br = new BufferedReader(new FileReader(posLoc))) {
         String availalbe;
         while ((availalbe = br.readLine()) != null) {
            result.add(availalbe);
         }
      } catch (IOException e) {
         // FileNotFoundException is a subclass of IOException, so this covers both
         e.printStackTrace();
      }
      return result;
   }

Edit
I tried reducing your problem to its bare minimum, and just this was enough to elicit the problem:

   public static void main(String[] args) {
      try {
         Scanner scanner = new Scanner(new File(FILE_POS));
         int count = 0;
         while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            System.out.printf("%3d: %s %n", count, line);
            count++;
         }
         scanner.close();
      } catch (FileNotFoundException e) {
         e.printStackTrace();
      }
   }

I checked the Scanner object with a printf:

System.out.printf("Str: %-35s size%5d; Has next line? %b%n", availalbe, result.size(), s.hasNextLine());

and it showed that Scanner thought the file had ended. I was in the process of progressively deleting lines from the data file to see which line(s) caused the problem, but will leave that to you.
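One way to see why Scanner believes the file has ended is to ask it for the exception it swallowed. Scanner does not throw on read errors; it records the last IOException from its underlying source and exposes it through ioException(). The sketch below assumes the File-based constructor decodes strictly (which is the behavior that makes the read stop early); the class name ScannerStopReason and the helper readAndReport are made up for illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Scanner;

public class ScannerStopReason {

    /**
     * Reads every line Scanner is willing to deliver, then returns the
     * IOException that Scanner recorded internally (null if there was none).
     */
    static IOException readAndReport(File file, String charsetName)
            throws FileNotFoundException {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            int count = 0;
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
            System.out.printf("Read %d line(s) as %s%n", count, charsetName);
            // Must be queried before the Scanner is discarded.
            return scanner.ioException();
        }
    }

    public static void main(String[] args) throws IOException {
        File demo = Files.createTempFile("scanner-demo", ".txt").toFile();
        // 0xE9 is 'é' in windows-1252 but is not a valid UTF-8 sequence here.
        byte[] bytes = {'a', 'b', '\n', 'c', (byte) 0xE9, '\n', 'd', '\n'};
        Files.write(demo.toPath(), bytes);

        System.out.println("UTF-8 problem: " + readAndReport(demo, "UTF-8"));
        System.out.println("windows-1252 problem: " + readAndReport(demo, "windows-1252"));
        demo.delete();
    }
}
```

If the reported exception is a character-coding error, the file is most likely in a different encoding than the one Scanner was told (or defaulted) to use.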

轮回少年

2020-11-30 11:54
My case:
- in my main program (A) it always reads 16384 bytes from a 41021 byte file. The character where it stops is in the middle of a line with normal printable text
- if I create a small separate program (B) with only the Scanner and print lines, it reads the whole file
- specifying "UTF-8" in (A) still reads 16384
- specifying "ASCII" in (A) still reads 16384
- specifying "Cp1252" in (A) reads the whole file
- my input txt files are sent by users and I can't be sure that they will write them in any particular encoding
Conclusions
- Scanner seems to read the file block by block and to append each correctly decoded block to the returned String, but when it reaches a block containing bytes that are invalid in the encoding it expects, it stops silently (ouch) and returns only the partial string
- the txt file I'm trying to read is Cp1252, my (A) source file is UTF-8 and my (B) source file is Cp1252 so that's why (B) worked without specifying an encoding
Solution
- forget about Scanner and use
String fullFileContents = new String(Files.readAllBytes(myFile.toPath()));

Of course, non-ASCII characters can't be read reliably this way because you don't know the encoding, but the ASCII characters will be read for sure. Use it if you only need the ASCII characters in the file and the non-ASCII part can be discarded.
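When the encoding of user-supplied files is unknown, one possible refinement is to try a strict UTF-8 decode first and fall back to a single-byte encoding only if that fails. This is a minimal sketch, not a general encoding detector; the class name FallbackDecoder, the method decode, and the choice of windows-1252 as the fallback are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FallbackDecoder {

    /**
     * Decodes the bytes as UTF-8 if they form valid UTF-8;
     * otherwise decodes them with the given fallback charset.
     */
    static String decode(byte[] bytes, Charset fallback) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of replacing
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume the fallback encoding (a guess, not a detection).
            return new String(bytes, fallback);
        }
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] cp1252Bytes = {'c', 'a', 'f', (byte) 0xE9};

        System.out.println(decode(utf8Bytes, cp1252));
        System.out.println(decode(cp1252Bytes, cp1252));
    }
}
```

The byte array would typically come from Files.readAllBytes(myFile.toPath()), as in the line above. A file that is valid in both encodings is always treated as UTF-8 here.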

无人及你

2020-11-30 11:56

I had a similar issue on my Linux server, and the code below finally worked for me.

Scanner scanner = new Scanner(new File(filename),"UTF-8");

既然无缘

2020-11-30 11:56

I had the same problem with a CSV file: it worked on Windows but not on Linux.

Open the file with Notepad++ and change the encoding: choose "Encode in UTF-8 (with BOM)". That solved the problem in my case.

轮回少年

2020-11-30 12:03

I had a txt file in which Scanner stopped reading at line 862, which was a weird problem. To try to replicate it, I created a different file: first I gave it fewer than 862 lines, then more than 862, and it worked fine in both cases.

So I believe the problem was that line 862 of my original file contained something wrong, such as a character or symbol that misled Scanner into finishing early.

In conclusion: based on this experience, I recommend finding the exact line where Scanner stops reading in order to find a solution for this kind of problem.
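Finding the offending line does not have to be done by hand. The sketch below decodes the raw bytes strictly and reports the offset and line number of the first byte sequence that is invalid in the expected encoding. It assumes an ASCII-compatible encoding such as UTF-8, so that counting '\n' bytes gives the line number; the class name FirstBadByte and both method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FirstBadByte {

    /** Returns the byte offset of the first undecodable sequence, or -1 if there is none. */
    static int firstBadOffset(byte[] bytes, Charset charset) {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        CharBuffer out = CharBuffer.allocate(1024);
        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                return in.position(); // position is left at the start of the bad sequence
            }
            if (result.isUnderflow()) {
                return -1;            // all input consumed without errors
            }
            out.clear();              // overflow: discard decoded chars and keep going
        }
    }

    /** Returns the 1-based line number that contains the given byte offset. */
    static int lineOf(byte[] bytes, int offset) {
        int line = 1;
        for (int i = 0; i < offset; i++) {
            if (bytes[i] == '\n') {
                line++;
            }
        }
        return line;
    }

    public static void main(String[] args) {
        byte[] bytes = {'a', '\n', 'b', (byte) 0xE9, '\n'};
        int offset = firstBadOffset(bytes, StandardCharsets.UTF_8);
        if (offset >= 0) {
            System.out.printf("First bad byte at offset %d, line %d%n",
                    offset, lineOf(bytes, offset));
        }
    }
}
```

For a real file, the byte array could be loaded with Files.readAllBytes before calling firstBadOffset.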
