Java ByteBuffer performance issue

臣服心动 2021-01-31 20:46

While processing multi-gigabyte files I noticed something odd: it seems that reading from a file using a FileChannel into a re-used ByteBuffer object allocated with allocateD

4 Answers
  • 2021-01-31 20:49

    A MappedByteBuffer will always be the fastest, because the operating system associates the OS-level disk buffer with your process memory space. Reading into an allocated direct buffer, by comparison, first loads the block into the OS buffer, then copies the contents of the OS buffer into the allocated in-process buffer.

    Your test code also does lots of very small (24 byte) reads. If your actual application does the same, then you'll get an even bigger performance boost from mapping the file, because each of the reads is a separate kernel call. You should see several times the performance by mapping.
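
To make the contrast with many small reads concrete, here is a minimal sketch of reading fixed-size records through a mapping. The class name `MappedRecordSum`, the record layout (three longs per 24-byte record) and the sample file are assumptions for illustration, not taken from the question's code. It also assumes the file is smaller than 2 GiB, since a single `map()` call cannot cover more than `Integer.MAX_VALUE` bytes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedRecordSum {
    // Assumed layout: three 8-byte longs per record, matching the 24-byte reads discussed above.
    static final int RECORD_SIZE = 24;

    /** Sums the first long of every record, reading through a memory mapping. */
    static long sumFirstFields(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // One map() call; the getLong() calls below read mapped memory, not the channel.
            MappedByteBuffer mbb = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            for (int pos = 0; pos + RECORD_SIZE <= mbb.limit(); pos += RECORD_SIZE) {
                sum += mbb.getLong(pos); // absolute get, so no position bookkeeping
            }
            return sum;
        }
    }

    /** Writes a hypothetical sample file whose records start with 1, 2, ..., records. */
    static Path writeSample(int records) throws IOException {
        Path file = Files.createTempFile("records", ".dat");
        file.toFile().deleteOnExit();
        ByteBuffer buf = ByteBuffer.allocate(records * RECORD_SIZE);
        for (int i = 1; i <= records; i++) {
            buf.putLong(i).putLong(0L).putLong(0L);
        }
        Files.write(file, buf.array());
        return file;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sumFirstFields(writeSample(1000)));
    }
}
```

Whether this beats chunked channel reads on a given machine still has to be measured; the sketch only shows where the per-read kernel calls disappear.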

    As for the direct buffer being slower than the java.io reads: you don't give any numbers, but I'd expect a slight degradation because the getLong() calls need to cross the JNI boundary.

  • 2021-01-31 20:54

    When you have a loop which iterates more than 10,000 times it can trigger the whole method to be compiled to native code. However, your later loops have not been run and cannot be optimised to the same degree. To avoid this issue, place each loop in a different method and run again.
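
A minimal sketch of that layout, assuming two stand-in loops (`sumAscending`, `sumDescending`) and a hypothetical helper `timeNanos`; none of these names come from the question's code. Each loop lives in its own method and is run a few times untimed before the measured run, so each gets its own chance to be compiled.

```java
import java.util.function.LongSupplier;

class SeparateLoops {
    // Results are stored here so the JIT cannot discard the loops as dead code.
    static volatile long sink;

    /** First loop under test, isolated in its own method. */
    static long sumAscending(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    /** Second loop under test, also isolated. */
    static long sumDescending(int n) {
        long sum = 0;
        for (int i = n - 1; i >= 0; i--) {
            sum += i;
        }
        return sum;
    }

    /** Runs the task `warmups` times untimed, then returns the nanoseconds of one timed run. */
    static long timeNanos(LongSupplier task, int warmups) {
        for (int i = 0; i < warmups; i++) {
            sink = task.getAsLong();
        }
        long start = System.nanoTime();
        sink = task.getAsLong();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 50_000_000;
        System.out.println("ascending ns:  " + timeNanos(() -> sumAscending(n), 5));
        System.out.println("descending ns: " + timeNanos(() -> sumDescending(n), 5));
    }
}
```

This is still a rough measurement; a dedicated benchmark harness would control for more effects than a hand-written warm-up does.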

    Additionally, you may want to set the ByteBuffer's byte order with order(ByteOrder.nativeOrder()) to avoid swapping bytes on every getLong, and read more than 24 bytes at a time, since reading very small portions generates many more system calls. Try reading 32*1024 bytes at a time.
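
Both suggestions can be combined roughly as follows. This is a sketch under assumptions: the class name `ChunkedNativeRead` and the sample file are hypothetical, and the file is assumed to have been written in the machine's native byte order, otherwise the values read back would be wrong.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChunkedNativeRead {
    static final int CHUNK = 32 * 1024; // bytes per read; a multiple of 8 so no long straddles two chunks

    /** Sums every long in the file, filling a direct buffer with CHUNK bytes at a time. */
    static long sumLongs(Path file) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(CHUNK).order(ByteOrder.nativeOrder());
        long sum = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            boolean eof = false;
            while (!eof) {
                buf.clear();
                // read() may return fewer bytes than requested: keep going until full or end of file.
                while (buf.hasRemaining()) {
                    if (ch.read(buf) == -1) {
                        eof = true;
                        break;
                    }
                }
                buf.flip();
                while (buf.remaining() >= Long.BYTES) {
                    sum += buf.getLong();
                }
            }
        }
        return sum;
    }

    /** Writes the longs 1..count in native byte order (hypothetical test data). */
    static Path writeSample(int count) throws IOException {
        Path file = Files.createTempFile("longs", ".dat");
        file.toFile().deleteOnExit();
        ByteBuffer buf = ByteBuffer.allocate(count * Long.BYTES).order(ByteOrder.nativeOrder());
        for (long v = 1; v <= count; v++) {
            buf.putLong(v);
        }
        Files.write(file, buf.array());
        return file;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sumLongs(writeSample(10_000)));
    }
}
```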

    I would also try getLong on the MappedByteBuffer with native byte order. This is likely to be the fastest.

  • 2021-01-31 21:06

    Reading into the direct byte buffer is faster, but getting the data out of it into the JVM is slower. Direct byte buffer is intended for cases where you're just copying the data without actually looking at it in the Java code. Then it doesn't have to cross the native->JVM boundary at all, so it's quicker than using e.g. a byte[] array or a normal ByteBuffer, where the data would have to cross that boundary twice in the copy process.
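
The copy-only case described above might look like the following sketch, where the bytes pass from one channel to another through a direct buffer and the Java code never inspects them. The class name `DirectCopy` and the 64 KiB buffer size are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class DirectCopy {
    /** Copies src to dst through a direct buffer and returns the number of bytes written. */
    static long copy(Path src, Path dst) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        long total = 0;
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (in.read(buf) != -1) {
                buf.flip();                  // switch from filling to draining
                while (buf.hasRemaining()) { // write() may not drain the buffer in one call
                    total += out.write(buf);
                }
                buf.clear();                 // ready for the next read
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".bin");
        Path dst = Files.createTempFile("dst", ".bin");
        Files.write(src, new byte[200_000]);
        System.out.println(copy(src, dst) + " bytes copied");
        Files.delete(src);
        Files.delete(dst);
    }
}
```

For plain file-to-file copies, `FileChannel.transferTo` is another option worth measuring, since it lets the platform move the bytes without an intermediate buffer in user code where that is supported.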

  • 2021-01-31 21:14

    I believe you are just doing micro-optimization, which might just not matter (www.codinghorror.com).

    Below is a version with a larger buffer and redundant seek / setPosition calls removed. All timings are in seconds.

    • When I enable "native byte ordering" (which is actually unsafe if the machine uses a different 'endian' convention):
    mmap: 1.358
    bytebuffer: 0.922
    regular i/o: 1.387
    
    • When I comment out the order statement and use the default big-endian ordering:
    mmap: 1.336
    bytebuffer: 1.62
    regular i/o: 1.467
    
    • Your original code:
    mmap: 3.262
    bytebuffer: 106.676
    regular i/o: 90.903
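
The caveat about native byte ordering in the first bullet can be illustrated with a small sketch: a long written in one byte order and read back in the other comes out byte-reversed. The class `ByteOrderDemo` and the sample value are assumptions for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteOrderDemo {
    /** Writes a long using writeOrder, then reads the same 8 bytes back using readOrder. */
    static long writeThenRead(long value, ByteOrder writeOrder, ByteOrder readOrder) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES).order(writeOrder);
        buf.putLong(0, value);
        return buf.order(readOrder).getLong(0);
    }

    public static void main(String[] args) {
        long value = 0x0102030405060708L;
        // Same order on both sides: the value round-trips unchanged.
        System.out.printf("same order:  %016x%n",
                writeThenRead(value, ByteOrder.BIG_ENDIAN, ByteOrder.BIG_ENDIAN));
        // Mismatched order: the bytes come back reversed (0807060504030201).
        System.out.printf("mixed order: %016x%n",
                writeThenRead(value, ByteOrder.BIG_ENDIAN, ByteOrder.LITTLE_ENDIAN));
    }
}
```

So native ordering is only safe when the file was produced with the same byte order as the machine that reads it.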
    

    Here's the code:

    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileChannel.MapMode;
    import java.nio.MappedByteBuffer;
    
    class Testbb2 {
        /** Buffer a whole lot of long values at the same time. */
        static final int BUFFSIZE = 0x800 * 8; // 16384 bytes: 0x800 (2048) longs of 8 bytes each
        static final int DATASIZE = 0x8000 * BUFFSIZE;
    
        static public long byteArrayToLong(byte [] in, int offset) {
            return ((((((((long)(in[offset + 0] & 0xff) << 8) | (long)(in[offset + 1] & 0xff)) << 8 | (long)(in[offset + 2] & 0xff)) << 8 | (long)(in[offset + 3] & 0xff)) << 8 | (long)(in[offset + 4] & 0xff)) << 8 | (long)(in[offset + 5] & 0xff)) << 8 | (long)(in[offset + 6] & 0xff)) << 8 | (long)(in[offset + 7] & 0xff);
        }
    
        public static void main(String [] args) throws IOException {
            long start;
            RandomAccessFile fileHandle;
            FileChannel fileChannel;
    
            // Sanity check - this way the convert-to-long loops don't need extra bookkeeping like BUFFSIZE / 8.
            if ((DATASIZE % BUFFSIZE) > 0 || (DATASIZE % 8) > 0) {
                throw new IllegalStateException("DATASIZE should be a multiple of 8 and BUFFSIZE!");
            }
    
            int pos;
            int nDone;
    
            // create file
            File testFile = new File("file.dat");
            fileHandle = new RandomAccessFile("file.dat", "rw");
    
            if (testFile.exists() && testFile.length() >= DATASIZE) {
                System.out.println("File exists");
            } else {
                fileHandle.setLength(0); // truncate through the open handle; deleting a file that is still open is unreliable
                System.out.println("Preparing file");
                byte [] buffer = new byte[BUFFSIZE];
                pos = 0;
                nDone = 0;
                while (pos < DATASIZE) {
                    fileHandle.write(buffer);
                    pos += buffer.length;
                }
    
                System.out.println("File prepared");
            } 
            fileChannel = fileHandle.getChannel();
    
            // mmap()
            MappedByteBuffer mbb = fileChannel.map(FileChannel.MapMode.READ_WRITE, 0, DATASIZE);
            byte [] buffer1 = new byte[BUFFSIZE];
            mbb.position(0);
            start = System.currentTimeMillis();
            pos = 0;
            while (pos < DATASIZE) {
                mbb.get(buffer1, 0, BUFFSIZE);
                // This assumes BUFFSIZE is a multiple of 8.
                for (int i = 0; i < BUFFSIZE; i += 8) {
                    long dummy = byteArrayToLong(buffer1, i);
                }
                pos += BUFFSIZE;
            }
            System.out.println("mmap: " + (System.currentTimeMillis() - start) / 1000.0);
    
            // bytebuffer
            ByteBuffer buffer2 = ByteBuffer.allocateDirect(BUFFSIZE);
            // Uncomment the next line to use native byte ordering; the default is big-endian.
    //        buffer2.order(ByteOrder.nativeOrder());
            fileChannel.position(0);
            start = System.currentTimeMillis();
            pos = 0;
            nDone = 0;
            while (pos < DATASIZE) {
                buffer2.rewind();
                fileChannel.read(buffer2);
                buffer2.rewind();   // need to rewind it to be able to use it
                // This assumes BUFFSIZE is a multiple of 8.
                for (int i = 0; i < BUFFSIZE; i += 8) {
                    long dummy = buffer2.getLong();
                }
                pos += BUFFSIZE;
            }
            System.out.println("bytebuffer: " + (System.currentTimeMillis() - start) / 1000.0);
    
            // regular i/o
            fileHandle.seek(0);
            byte [] buffer3 = new byte[BUFFSIZE];
            start = System.currentTimeMillis();
            pos = 0;
            while (pos < DATASIZE) {
                // read() may return fewer bytes than requested, so accumulate until the buffer is full.
                nDone = 0;
                while (nDone < BUFFSIZE) {
                    int n = fileHandle.read(buffer3, nDone, BUFFSIZE - nDone);
                    if (n == -1) {
                        throw new IOException("Unexpected end of file.dat");
                    }
                    nDone += n;
                }
                // This assumes BUFFSIZE is a multiple of 8.
                for (int i = 0; i < BUFFSIZE; i += 8) {
                    long dummy = byteArrayToLong(buffer3, i);
                }
                pos += nDone;
            }
            System.out.println("regular i/o: " + (System.currentTimeMillis() - start) / 1000.0);
        }
    }
    