Count the bytes written to a file via a BufferedWriter wrapping a GZIPOutputStream

深忆病人 2021-01-15 21:24

I have a BufferedWriter as shown below:

BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
        new GZIPOutputStream( hdfs.cr         


        
4 Answers
  • 2021-01-15 21:28

    This is similar to Olaseni's answer, but it moves the counting into a BufferedOutputStream rather than the GZIPOutputStream. That is more robust, because def.getBytesRead() in Olaseni's answer is no longer available once the stream has been closed.

    With the implementation below, you can supply your own AtomicLong to the constructor, so you can create the CountingBufferedOutputStream in a try-with-resources block and still read the count after the block has exited (i.e. after the file is closed).

    import java.io.BufferedOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.concurrent.atomic.AtomicLong;

    // Nest this inside your own class (or drop "static" to make it top-level).
    public static class CountingBufferedOutputStream extends BufferedOutputStream {
        // May be supplied by the caller so the total stays readable after close().
        private final AtomicLong bytesWritten;

        public CountingBufferedOutputStream(OutputStream out) {
            super(out);
            this.bytesWritten = new AtomicLong();
        }

        public CountingBufferedOutputStream(OutputStream out, int bufSize) {
            super(out, bufSize);
            this.bytesWritten = new AtomicLong();
        }

        public CountingBufferedOutputStream(OutputStream out, int bufSize, AtomicLong bytesWritten) {
            super(out, bufSize);
            this.bytesWritten = bytesWritten;
        }

        // write(byte[]) is deliberately not overridden: the inherited version
        // calls write(b, 0, b.length), so counting there as well would count twice.

        @Override
        public synchronized void write(byte[] b, int off, int len) throws IOException {
            super.write(b, off, len);
            bytesWritten.addAndGet(len);
        }

        @Override
        public synchronized void write(int b) throws IOException {
            super.write(b);
            bytesWritten.incrementAndGet();
        }

        public long getBytesWritten() {
            return bytesWritten.get();
        }
    }
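
A minimal usage sketch of the pattern described above, not a definitive implementation. It assumes the counting stream wraps the file stream beneath the GZIPOutputStream, so it sees compressed bytes. The names CountingUsageSketch and writeCompressed are hypothetical, a ByteArrayOutputStream stands in for the real file stream, and the class is a condensed copy that overrides only write(int) and the three-argument write.

```java
import java.io.BufferedOutputStream;
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;
import java.util.zip.GZIPOutputStream;

final class CountingUsageSketch {

    /** Condensed counting stream; the inherited write(byte[]) delegates to the three-argument write. */
    static final class CountingBufferedOutputStream extends BufferedOutputStream {
        private final AtomicLong bytesWritten;

        CountingBufferedOutputStream(OutputStream out, int bufSize, AtomicLong bytesWritten) {
            super(out, bufSize);
            this.bytesWritten = bytesWritten;
        }

        @Override
        public synchronized void write(byte[] b, int off, int len) throws IOException {
            super.write(b, off, len);
            bytesWritten.addAndGet(len);
        }

        @Override
        public synchronized void write(int b) throws IOException {
            super.write(b);
            bytesWritten.incrementAndGet();
        }
    }

    /** Writes text through gzip into sink and returns the number of compressed bytes. */
    static long writeCompressed(String text, OutputStream sink) throws IOException {
        // The counter is owned by the caller, so it outlives the streams.
        AtomicLong compressedBytes = new AtomicLong();
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(
                        new CountingBufferedOutputStream(sink, 8192, compressedBytes)),
                StandardCharsets.UTF_8))) {
            writer.write(text);
        }
        // Everything is flushed and closed here; the count is still readable.
        return compressedBytes.get();
    }

    public static void main(String[] args) throws IOException {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            text.append("hello gzip\n");
        }
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the file stream
        long counted = writeCompressed(text.toString(), sink);
        System.out.println("counted=" + counted + ", sink size=" + sink.size());
    }
}
```

Because the counter sits below the GZIPOutputStream, the value it reports should match the size of the compressed file, gzip header and trailer included.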
    
  • 2021-01-15 21:34

    You can use CountingOutputStream from the Apache Commons IO library.

    Place it between the GZIPOutputStream and the file OutputStream (hdfs.create(..)), so that it sees the compressed bytes.

    After writing the content to the file, you can read the number of bytes written from the CountingOutputStream instance, for example via getByteCount().

  • 2021-01-15 21:46

    You can write your own subclass of OutputStream and add up the number of bytes passed to each write call. Counting the invocations alone is only correct if every write goes through write(int).
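
One way this could look, as a sketch only: a FilterOutputStream subclass that tallies bytes and is placed between the GZIPOutputStream and the destination. The names ByteCountingOutputStream and gzipAndCount are hypothetical, and a ByteArrayOutputStream stands in for the real file stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical name; counts the bytes that pass through to the wrapped stream. */
final class ByteCountingOutputStream extends FilterOutputStream {
    private long count;

    ByteCountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Delegate straight to the wrapped stream. FilterOutputStream's default
        // loops over write(int), which is slow and would double count here
        // if we called it and also added len.
        out.write(b, off, len);
        count += len;
    }

    long getCount() {
        return count;
    }

    /** Compresses data into sink and returns how many compressed bytes reached it. */
    static long gzipAndCount(byte[] data, OutputStream sink) throws IOException {
        ByteCountingOutputStream counter = new ByteCountingOutputStream(sink);
        try (GZIPOutputStream gzip = new GZIPOutputStream(counter)) {
            gzip.write(data);
        }
        // The count is a plain field, so it is still readable after close().
        return counter.getCount();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "some text to compress".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println("compressed bytes: " + gzipAndCount(data, sink));
    }
}
```

The inherited write(byte[]) calls write(b, 0, b.length), so only the two overrides above are needed.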

  • 2021-01-15 21:48

    If it isn't too late: if you are on Java 1.7+ and don't want to pull in an entire library such as Guava or Commons IO, you can extend GZIPOutputStream and read the counts from its associated Deflater, like so:

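    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.zip.GZIPOutputStream;

    public class MyGZIPOutputStream extends GZIPOutputStream {

      public MyGZIPOutputStream(OutputStream out) throws IOException {
          super(out);
      }

      // Uncompressed bytes fed into the Deflater so far.
      public long getBytesRead() {
          return def.getBytesRead();
      }

      // Compressed bytes produced by the Deflater so far
      // (the gzip header and trailer are not included).
      public long getBytesWritten() {
          return def.getBytesWritten();
      }

      // Changes the compression level of the underlying Deflater.
      public void setLevel(int level) {
          def.setLevel(level);
      }
    }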
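
A short sketch of how such a subclass might be used, assuming the counters are read after finish() but before close(), since the Deflater is released when the stream is closed. DeflaterCountSketch and measure are hypothetical names, and a ByteArrayOutputStream stands in for the real file stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class DeflaterCountSketch {

    /** Same idea as the answer's subclass: expose the protected Deflater's counters. */
    static final class MyGZIPOutputStream extends GZIPOutputStream {
        MyGZIPOutputStream(OutputStream out) throws IOException {
            super(out);
        }

        long getBytesRead() {
            return def.getBytesRead();
        }

        long getBytesWritten() {
            return def.getBytesWritten();
        }
    }

    /** Returns {uncompressed bytes in, deflated bytes out} for the given input. */
    static long[] measure(byte[] data, OutputStream sink) throws IOException {
        MyGZIPOutputStream gzip = new MyGZIPOutputStream(sink);
        try {
            gzip.write(data);
            // finish() pushes the remaining compressed data out, so the counters are final.
            gzip.finish();
            return new long[] { gzip.getBytesRead(), gzip.getBytesWritten() };
        } finally {
            gzip.close(); // releases the Deflater; the counters cannot be read after this
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "some text to compress".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long[] counts = measure(data, sink);
        System.out.println("in=" + counts[0] + ", deflated=" + counts[1]
                + ", file size=" + sink.size());
    }
}
```

The deflated count is smaller than the file size because GZIPOutputStream writes its own header and trailer around the Deflater's output. If the exact on-disk size is what you need, counting below the GZIPOutputStream is the safer option.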
    