Should I buffer the InputStream or the InputStreamReader?


What are the differences (if any) between the following two buffering approaches?

Reader r1 = new BufferedReader(new InputStreamReader(in, "UTF-8"), bufferSize);
Reader r2 = new InputStreamReader(new BufferedInputStream(in, bufferSize), "UTF-8");


        
4 answers
  • 2021-02-05 03:47

    FWIW, if you're opening a file in Java 8 or later, you can use Files.newBufferedReader(Path). I don't know how its performance compares to the other solutions described here, but at least it pushes the decision of which construct to buffer into the JDK.
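
    A minimal sketch of that approach, assuming a UTF-8 text file. The class name, the countLines helper, and the temp file are made up for illustration; only Files.newBufferedReader comes from the answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class NewBufferedReaderSketch {

    // Hypothetical helper: counts the lines of a UTF-8 text file.
    // The JDK decides how the reader is buffered internally.
    static long countLines(Path path) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        // Temp file created only for this demonstration.
        Path path = Files.createTempFile("sketch", ".txt");
        try {
            Files.write(path, "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(countLines(path)); // prints 3
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```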

  • 2021-02-05 03:49

    r1 is more efficient. The InputStreamReader itself doesn't have a large buffer. The BufferedReader can be set to have a larger buffer than InputStreamReader. The InputStreamReader in r2 would act as a bottleneck.

    In a nutshell: you should read the data through a funnel, not through a bottle.


    Update: here's a little benchmark program, just copy'n'paste'n'run it. You don't need to prepare files.

    package com.stackoverflow.q3459127;
    
    import java.io.BufferedInputStream;
    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.Reader;
    
    public class Test {
    
        public static void main(String... args) throws Exception {
    
            // Init.
            int bufferSize = 10240; // 10KB.
            int fileSize = 100 * 1024 * 1024; // 100MB.
            File file = new File("/temp.txt");
    
            // Create file (it's also a good JVM warmup).
            System.out.print("Creating file .. ");
            BufferedWriter writer = null;
            try {
                writer = new BufferedWriter(new FileWriter(file));
                for (int i = 0; i < fileSize; i++) {
                    writer.write("0");
                }
                System.out.printf("finished, file size: %d MB.%n", file.length() / 1024 / 1024);
            } finally {
                if (writer != null) try { writer.close(); } catch (IOException ignore) {}
            }
    
            // Read through funnel.
            System.out.print("Reading through funnel .. ");
            Reader r1 = null;        
            try {
                r1 = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8"), bufferSize);
                long st = System.nanoTime();
                for (int data; (data = r1.read()) > -1;);
                long et = System.nanoTime();
                System.out.printf("finished in %d ms.%n", (et - st) / 1000000);
            } finally {
                if (r1 != null) try { r1.close(); } catch (IOException ignore) {}
            }
    
            // Read through bottle.
            System.out.print("Reading through bottle .. ");
            Reader r2 = null;        
            try {
                r2 = new InputStreamReader(new BufferedInputStream(new FileInputStream(file), bufferSize), "UTF-8");
                long st = System.nanoTime();
                for (int data; (data = r2.read()) > -1;);
                long et = System.nanoTime();
                System.out.printf("finished in %d ms.%n", (et - st) / 1000000);
            } finally {
                if (r2 != null) try { r2.close(); } catch (IOException ignore) {}
            }
    
            // Cleanup.
            if (!file.delete()) System.err.printf("Oops, failed to delete %s. Cleanup yourself.%n", file.getAbsolutePath());
        }
    
    }
    

    Results on my Latitude E5500 with a Seagate Momentus 7200.3 hard disk:

    Creating file .. finished, file size: 99 MB.
    Reading through funnel .. finished in 1593 ms.
    Reading through bottle .. finished in 7760 ms.
    
  • 2021-02-05 03:57

    r1 is also more convenient when you read a line-based stream, because BufferedReader supports the readLine method. You don't have to read the content into a char array or one character at a time. However, you have to cast r1 to BufferedReader or declare the variable with that type explicitly.

    I often use this code snippet:

    BufferedReader br = ...
    String line;
    while((line=br.readLine())!=null) {
      //process line
    }
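
    For completeness, here is a self-contained sketch of the same loop with the variable declared as BufferedReader, so no cast is needed. The class name, the readLines helper, and the in-memory sample input are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReadLinesSketch {

    // Hypothetical helper: collects every line of a UTF-8 stream.
    static List<String> readLines(InputStream in) throws IOException {
        // Declared as BufferedReader rather than Reader, so readLine() is available without a cast.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            List<String> lines = new ArrayList<>();
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample input mixing \n and \r\n terminators, with no terminator on the last line.
        InputStream in = new ByteArrayInputStream("alpha\nbeta\r\ngamma".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(in)); // prints [alpha, beta, gamma]
    }
}
```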
    
  • 2021-02-05 04:05

    In response to Ross Studtman's question in the comment above (but also relevant to the OP):

    BufferedReader reader = new BufferedReader(new InputStreamReader(new BufferedInputStream(inputStream), "UTF-8"));
    

    The BufferedInputStream is superfluous (and likely harms performance due to extraneous copying). This is because the BufferedReader requests characters from the InputStreamReader in large chunks by calling InputStreamReader.read(char[], int, int), which in turn (through StreamDecoder) calls InputStream.read(byte[], int, int) to read a large block of bytes from the underlying InputStream.

    You can convince yourself that this is so by running the following code:

    new BufferedReader(new InputStreamReader(new ByteArrayInputStream("Hello world!".getBytes("UTF-8")) {
    
        @Override
        public synchronized int read() {
            System.err.println("ByteArrayInputStream.read()");
            return super.read();
        }
    
        @Override
        public synchronized int read(byte[] b, int off, int len) {
            System.err.println("ByteArrayInputStream.read(..., " + off + ", " + len + ')');
            return super.read(b, off, len);
        }
    
    }, "UTF-8") {
    
        @Override
        public int read() throws IOException {
            System.err.println("InputStreamReader.read()");
            return super.read();
        }
    
        @Override
        public int read(char[] cbuf, int offset, int length) throws IOException {
            System.err.println("InputStreamReader.read(..., " + offset + ", " + length + ')');
            return super.read(cbuf, offset, length);
        }
    
    }).read(); // read one character from the BufferedReader
    

    You will see the following output:

    InputStreamReader.read(..., 0, 8192)
    ByteArrayInputStream.read(..., 0, 8192)
    

    This demonstrates that the BufferedReader requests a large chunk of characters from the InputStreamReader, which in turn requests a large chunk of bytes from the underlying InputStream.
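
    The same point can be checked on a larger input by counting calls instead of printing them. This is only a sketch: CountingInputStream and countReadCalls are hypothetical names introduced here, and the exact call count may differ between JDK versions, so the example only checks that it stays far below one call per character.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CountingStreamSketch {

    // Hypothetical helper: counts how often the wrapped stream is asked for data.
    static class CountingInputStream extends FilterInputStream {
        int calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads `size` ASCII characters one at a time through a BufferedReader
    // and returns the number of read calls that reached the underlying stream.
    static int countReadCalls(int size) throws IOException {
        byte[] data = new byte[size];
        Arrays.fill(data, (byte) 'a');
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(data));
        // No BufferedInputStream in the chain: only the Reader side is buffered.
        try (Reader reader = new BufferedReader(new InputStreamReader(counting, StandardCharsets.UTF_8))) {
            while (reader.read() != -1) {
                // Discard the character; only the call count matters here.
            }
        }
        return counting.calls;
    }

    public static void main(String[] args) throws IOException {
        int size = 100_000;
        System.out.println("characters read: " + size);
        System.out.println("calls on the underlying stream: " + countReadCalls(size));
    }
}
```

    If the second number is orders of magnitude smaller than the first, the underlying stream is already being read in blocks without a BufferedInputStream.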
