NumberFormatException while selecting random elements from a big file

Posted by 我只是一个虾纸丫 on 2019-12-12 02:54:57

Question


I have a very big file that contains user IDs like the ones below. Each line in the file is one user ID.

149905320
1165665384
66969324
886633368
1145241312
286585320
1008665352

The file contains around 30 million user IDs, and I am trying to select random user IDs from it. My program is below, but at some point it always throws the following exception, and I am not sure why.

Exception in thread "main" java.lang.NumberFormatException: For input string: ""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:59)
    at java.lang.Integer.parseInt(Integer.java:481)
    at java.lang.Integer.parseInt(Integer.java:510)
    at com.host.bulls.service.lnp.RandomReadFromFile.main(RandomReadFromFile.java:65)

Here is the program:

public static void main(String[] args) throws Exception {

    File f = new File("D:/abc.txt");
    RandomAccessFile file;

    try {

        file = new RandomAccessFile(f, "r");
        long file_size = file.length();

        // Let's start
        long chosen_byte = (long)(Math.random() * (file_size - 1));
        long cur_byte = chosen_byte;

        // Goto starting position
        file.seek(cur_byte);

        String s_LR = "";
        char a_char;

        // Get left hand chars
        for (;;)
        {
            a_char = (char)file.readByte();
            if (cur_byte < 0 || a_char == '\n' || a_char == '\r' || a_char == -1) break;
            else 
            {
                s_LR = a_char + s_LR;
                --cur_byte;
                if (cur_byte >= 0) file.seek(cur_byte);
                else break;
            }
        }

        // Get right hand chars
        cur_byte = chosen_byte + 1;
        file.seek(cur_byte);
        for (;;)
        {
            a_char = (char)file.readByte();
            if (cur_byte >= file_size || a_char == '\n' || a_char == '\r' || a_char == -1) break;
            else 
            {
                s_LR += a_char;
                ++cur_byte;
            }
        }

        // Parse ID
        if (cur_byte < file_size) 
        {
            int chosen_id = Integer.parseInt(s_LR);
            System.out.println("Chosen id : " + chosen_id);
        }
        else
        {
            throw new Exception("Ran out of bounds..");
        }

    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Is there a problem with the code above?


Answer 1:


I ran your code and found one additional error: you have to check cur_byte before each read, as follows:

if (cur_byte < file_size) {
    a_char = (char) file.readByte();
}

Otherwise you will get an EOFException.
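As a minimal sketch of that bounds check, the right-hand scan could look like the helper below. The class and method names (SafeRead, readToLineEnd) are hypothetical and not part of the original program. It also swaps readByte() for read(), which returns -1 at end of file instead of throwing. Note that the question's `a_char == -1` test can never be true, because a char is unsigned.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not part of the original program.
final class SafeRead {
    // Collects the bytes from position `start` up to (not including) the next
    // line terminator or the end of the file, whichever comes first.
    static String readToLineEnd(RandomAccessFile file, long start) throws IOException {
        StringBuilder sb = new StringBuilder();
        long size = file.length();
        file.seek(start);
        for (long pos = start; pos < size; pos++) { // bounds check before each read
            int b = file.read();                    // int result: -1 signals EOF, no exception
            if (b == -1 || b == '\n' || b == '\r') {
                break;
            }
            sb.append((char) b);                    // IDs are ASCII digits, so a byte maps to one char
        }
        return sb.toString();
    }
}
```

Calling the helper with a position at or past the end of the file simply returns an empty string, which the caller still has to reject before parsing.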

With your sample abc.txt I don't get the java.lang.NumberFormatException: For input string: "" exception.

But if I add empty lines to abc.txt, I get this exception sooner or later. So the problem is empty lines somewhere in abc.txt.




Answer 2:


Passing any unparsable String to the parseInt method raises a NumberFormatException. That includes the empty String. It also includes values outside the int range: an int can only hold values from -2147483648 to 2147483647, and anything beyond that raises a NumberFormatException as well.

The documentation for Integer.parseInt states that the exception is thrown "if the string does not contain a parsable integer."
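The cases above can be checked with a small sketch. The class and helper names (ParseDemo, parsesAsInt) are made up for illustration. Only Integer.parseInt and Long.parseLong are standard library calls.

```java
// Hypothetical demo class, not part of the original program.
final class ParseDemo {
    // Returns true if Integer.parseInt accepts the string, false if it throws.
    static boolean parsesAsInt(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesAsInt("1165665384"));    // an ID from the sample file
        System.out.println(parsesAsInt(""));              // empty string
        System.out.println(parsesAsInt("2147483648"));    // Integer.MAX_VALUE + 1
        System.out.println(Long.parseLong("2147483648")); // the same value fits in a long
    }
}
```

parseInt does not trim its input either, so a string with a leading space or a trailing '\r' is rejected in the same way as the empty string.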



Answer 3:


It seems that s_LR contains an empty String.

As far as I can tell, this can happen if the file has Windows-style line breaks (\r\n) and the random seek lands on the '\r'. Then the break conditions in both loops are met before any char is added to s_LR.
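The scenario can be reproduced without a file. The sketch below is a simplified in-memory mirror of the question's two scans, not the original code. It assumes ASCII content, and the names CrLfRepro and tokenAround are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory mirror of the question's two loops.
final class CrLfRepro {
    static boolean isLineBreak(byte b) {
        return b == '\n' || b == '\r';
    }

    // Walks left from `chosen` (inclusive), then right from `chosen + 1`,
    // stopping at the first line-break byte in each direction.
    static String tokenAround(byte[] data, int chosen) {
        StringBuilder sb = new StringBuilder();
        for (int i = chosen; i >= 0 && !isLineBreak(data[i]); i--) {
            sb.insert(0, (char) data[i]);
        }
        for (int i = chosen + 1; i < data.length && !isLineBreak(data[i]); i++) {
            sb.append((char) data[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] crlf = "149905320\r\n1165665384\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println("[" + tokenAround(crlf, 3) + "]"); // offset inside the first ID
        System.out.println("[" + tokenAround(crlf, 9) + "]"); // offset of the '\r'
    }
}
```

With this input, offset 3 yields the full first ID, while offset 9 (the '\r') yields an empty string: the left scan stops on the '\r' and the right scan stops on the '\n' that follows it. That empty string is what Integer.parseInt then rejects.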

Side note: your coding style is very atypical for Java. It has no effect on your program, but it is harder for other Java programmers to read and understand, so you may get fewer answers.




Answer 4:


It really looks like you have an empty string at the end or at the beginning of the file.

Or one of the numbers is too long for an Integer.

I see two solutions:

  1. Check each element you read from the file for whitespace and for the empty string.
  2. Use a Long instead of an Integer.
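Both suggestions can be combined in one small helper. This is only a sketch: the names IdParser and parseId are hypothetical, and returning an OptionalLong for bad input is an illustrative design choice rather than something the original program does.

```java
import java.util.OptionalLong;

// Hypothetical helper combining both suggestions.
final class IdParser {
    // Trims surrounding whitespace (including a stray '\r'), rejects empty
    // input, and parses as long so IDs above Integer.MAX_VALUE still fit.
    static OptionalLong parseId(String raw) {
        String s = raw.trim();
        if (s.isEmpty()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(s));
        } catch (NumberFormatException e) {
            return OptionalLong.empty(); // non-numeric text or out of long range
        }
    }
}
```

A caller that gets an empty result can pick another random position and try again instead of crashing.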


Source: https://stackoverflow.com/questions/17206864/numberformatexception-while-selecting-random-elements-from-a-big-file
