Why is using BufferedInputStream to read a file byte by byte faster than using FileInputStream?

挽巷 2020-11-28 19:49

I was trying to read a file into an array by using FileInputStream, and an ~800KB file took about 3 seconds to read into memory. I then tried the same code except with the F

3 Answers
  • 2020-11-28 20:04

    A BufferedInputStream wrapped around a FileInputStream requests data from the FileInputStream in big chunks (8192 bytes by default). Thus if you read 1000 bytes one at a time, the FileInputStream only has to go to the operating system once instead of 1000 times. This is much faster!

  • 2020-11-28 20:16

    It is because of the cost of disk access. Let's assume you have a file whose size is 8 KB. Without BufferedInputStream, reading this file byte by byte requires 8*1024 separate read requests to the operating system.

    At this point, BufferedInputStream comes onto the scene and acts as a middleman between your code and the FileInputStream.

    In one shot, it pulls a chunk of bytes (8 KB by default) from the FileInputStream into memory, and your code then reads single bytes from this middleman. This decreases the time of the operation.

    // These methods belong inside a class and need the following imports:
    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    private void exercise1WithBufferedStream() {
        long start = System.currentTimeMillis();
        // Declare both streams in try-with-resources so the buffered wrapper is closed as well.
        try (FileInputStream myFile = new FileInputStream("anyFile.txt");
             BufferedInputStream bufferedInputStream = new BufferedInputStream(myFile)) {
            // Each read() is served from the in-memory buffer; -1 signals end of file.
            while (bufferedInputStream.read() != -1) {
                // The byte is discarded; only the elapsed time matters here.
            }
        } catch (IOException e) {
            System.out.println("Could not read the stream...");
            e.printStackTrace();
        }
        System.out.println("time passed with buffered:" + (System.currentTimeMillis() - start));
    }

    private void exercise1() {
        long start = System.currentTimeMillis();
        try (FileInputStream myFile = new FileInputStream("anyFile.txt")) {
            // Each read() goes straight to the underlying file; -1 signals end of file.
            while (myFile.read() != -1) {
                // The byte is discarded; only the elapsed time matters here.
            }
        } catch (IOException e) {
            System.out.println("Could not read the stream...");
            e.printStackTrace();
        }
        System.out.println("time passed without buffered:" + (System.currentTimeMillis() - start));
    }
    
  • 2020-11-28 20:19

    In FileInputStream, the method read() reads a single byte. From the source code:

    /**
     * Reads a byte of data from this input stream. This method blocks
     * if no input is yet available.
     *
     * @return     the next byte of data, or <code>-1</code> if the end of the
     *             file is reached.
     * @exception  IOException  if an I/O error occurs.
     */
    public native int read() throws IOException;
    

    This is a native call into the OS to fetch a single byte. Even when the data is already in the OS file cache, every call carries a fixed overhead, so making one call per byte is a heavy operation.

    With a BufferedInputStream, read() fills an internal buffer by calling the underlying stream's read(byte[], int, int), which fetches up to 8192 bytes (the default buffer size) at a time. It still returns only a single byte but keeps the others in reserve. This way the BufferedInputStream makes far fewer native calls to the OS to read from the file.

    For example, suppose your file is 32768 bytes long. To get all the bytes into memory with a FileInputStream, you need 32768 native calls to the OS. With a BufferedInputStream, you need only about 4 (32768 / 8192), regardless of the number of read() calls you make (still 32768).
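    The arithmetic above can be checked with a small sketch. It is not a benchmark: an in-memory ByteArrayInputStream stands in for the file, and CountingInputStream is a hypothetical helper (not part of the JDK) that counts how often the underlying stream is asked for data. The count it reports includes the final call that detects end of stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadCallCounter {

    // Hypothetical helper: counts every read request that reaches the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads byte by byte directly from the counted stream.
    static int callsWithoutBuffer(int fileSize) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[fileSize]));
        while (source.read() != -1) {
            // discard the byte
        }
        return source.calls;
    }

    // Reads byte by byte through a BufferedInputStream with the default buffer size.
    static int callsWithBuffer(int fileSize) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[fileSize]));
        InputStream buffered = new BufferedInputStream(source);
        while (buffered.read() != -1) {
            // discard the byte
        }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("underlying reads without buffer: " + callsWithoutBuffer(32768));
        System.out.println("underlying reads with buffer:    " + callsWithBuffer(32768));
    }
}
```

    With a real FileInputStream, each counted call would correspond to a native read, which is where the time difference comes from.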

    As to how to make it faster, you might want to consider Java NIO's FileChannel class (java.nio.channels), but I have no evidence to support this.
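    For reference, a minimal sketch of what reading a whole file through a FileChannel could look like. The class and method names (ChannelRead, readAll) are made up for illustration, and the sketch assumes the file fits into a single array. Whether it is actually faster than a buffered stream is not claimed here and would need measuring.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class ChannelRead {

    // Reads the entire file into a byte array using a FileChannel.
    static byte[] readAll(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = channel.size();
            if (size > Integer.MAX_VALUE - 8) {
                throw new IOException("file too large for a single array: " + size);
            }
            ByteBuffer buffer = ByteBuffer.allocate((int) size);
            // A single read may return fewer bytes than requested, so loop until full.
            while (buffer.hasRemaining()) {
                if (channel.read(buffer) == -1) {
                    break; // file was shorter than its reported size
                }
            }
            // Copy only the bytes that were actually read.
            return Arrays.copyOf(buffer.array(), buffer.position());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println("read " + readAll(Paths.get(args[0])).length + " bytes");
        }
    }
}
```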


    Note: if you call FileInputStream's read(byte[], int, int) method directly with a large byte array (8192 bytes or more), you don't need a BufferedInputStream wrapping it.
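    A minimal sketch of that approach, assuming the goal is to load the whole file into one array. The names ChunkedRead and readFully, and the 16 KB chunk size, are choices made for this illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class ChunkedRead {

    // Reads the stream in large chunks instead of one byte per call.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[16 * 1024]; // larger than BufferedInputStream's default 8192
        int n;
        // read(byte[], int, int) returns the number of bytes read, or -1 at end of stream.
        while ((n = in.read(chunk, 0, chunk.length)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            try (InputStream in = new FileInputStream(args[0])) {
                System.out.println("read " + readFully(in).length + " bytes");
            }
        }
    }
}
```

    Here each call into the FileInputStream fetches up to 16 KB, so the number of native calls is already small without any wrapper.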
