How to check if InputStream is Gzipped?

一个人的身影 2020-12-04 19:06

Is there any way to check if an InputStream has been gzipped? Here's the code:

public static InputStream decompressStream(InputStream input) {
    try {
               


        
10 answers
  • 2020-12-04 19:30

    Wrap the original stream in a BufferedInputStream and call "mark" on it, then try to wrap that in a GZIPInputStream. The GZIPInputStream constructor reads the gzip header and throws an IOException if the data is not in gzip format, so if construction succeeds, the stream is gzipped. Afterwards, call "reset" on the BufferedInputStream to return to the initial position in the stream.
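
    The mark/reset probe described in this answer could look roughly like the sketch below. The names GzipProbe, looksGzipped and MARK_LIMIT are made up for illustration, and the 64 KiB mark limit is an assumption: a gzip header with very long optional fields could exceed it, in which case reset() throws an IOException.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

final class GzipProbe {
    // Assumed upper bound on how many bytes the header check may consume.
    static final int MARK_LIMIT = 64 * 1024;

    /**
     * Returns true if a GZIPInputStream can be constructed on the stream.
     * The stream is reset to its original position before returning.
     */
    static boolean looksGzipped(BufferedInputStream in) throws IOException {
        in.mark(MARK_LIMIT);
        try {
            // The constructor reads and validates the gzip header.
            // The probe stream is deliberately not closed: closing it
            // would also close the underlying stream.
            new GZIPInputStream(in);
            return true;
        } catch (IOException e) {
            // ZipException (bad magic number) or EOFException (too short).
            return false;
        } finally {
            in.reset(); // rewind so the caller sees the stream from the start
        }
    }
}
```

    After the call, the caller can wrap the same BufferedInputStream in a fresh GZIPInputStream if the result was true, or read it directly otherwise. Note that a genuine I/O error during the probe is also reported as "not gzipped" in this sketch.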

  • 2020-12-04 19:36

    I believe this is the simplest way to check whether a byte array is gzip formatted. It does not depend on any HTTP entity or MIME type support.

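    public static boolean isGzipStream(byte[] bytes) {
        if (bytes == null || bytes.length < 2) {
            return false; // too short to hold the two-byte magic number
        }
        // The magic number is stored little-endian: low byte first.
        int head = (bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
        return GZIPInputStream.GZIP_MAGIC == head;
    }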
    
  • 2020-12-04 19:38

    It's not foolproof but it's probably the easiest and doesn't rely on any external data. Like all decent formats, GZip too begins with a magic number which can be quickly checked without reading the entire stream.

    public static InputStream decompressStream(InputStream input) throws IOException {
        // A PushbackInputStream lets us look ahead and then put the bytes back.
        PushbackInputStream pb = new PushbackInputStream(input, 2);
        byte[] signature = new byte[2];
        int len = pb.read(signature); // read the signature
        if (len == -1) {
            return pb; // empty stream: nothing to push back, cannot be gzip
        }
        pb.unread(signature, 0, len); // push the signature back onto the stream
        // Check whether it matches the standard gzip magic number.
        if (signature[0] == (byte) 0x1f && signature[1] == (byte) 0x8b) {
            return new GZIPInputStream(pb);
        } else {
            return pb;
        }
    }
    

    (Source for the magic number: GZip file format specification)

    Update: I've just discovered that there is also a constant called GZIP_MAGIC in GZIPInputStream which contains this value, so if you really want to, you can use the lower two bytes of it.
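
    As a rough illustration of that update, the two signature bytes can be derived from the constant instead of being hard-coded. GZIP_MAGIC is the int 0x8b1f and the bytes appear in the stream low byte first. The class and member names below are hypothetical, not part of any library.

```java
import java.util.zip.GZIPInputStream;

final class GzipMagicBytes {
    // Low-order byte of GZIP_MAGIC: the first byte of a gzip stream (0x1f).
    static final byte FIRST = (byte) (GZIPInputStream.GZIP_MAGIC & 0xff);
    // Next byte of GZIP_MAGIC: the second byte of a gzip stream (0x8b).
    static final byte SECOND = (byte) ((GZIPInputStream.GZIP_MAGIC >> 8) & 0xff);

    /** Checks the first len bytes of signature against the gzip magic number. */
    static boolean hasGzipSignature(byte[] signature, int len) {
        return len >= 2 && signature[0] == FIRST && signature[1] == SECOND;
    }
}
```

    A check like hasGzipSignature(signature, len) could then stand in for the hard-coded 0x1f / 0x8b comparison, and it also rejects reads that returned fewer than two bytes.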

  • 2020-12-04 19:39

    This function works perfectly well in Java:

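    public static boolean isGZipped(File f) throws IOException {
        // try-with-resources closes the file even if a read fails
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            return GZIPInputStream.GZIP_MAGIC == ((raf.read() & 0xff) | ((raf.read() << 8) & 0xff00));
        }
    }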
    

    In scala:

    def isGZip(file: File): Boolean = {
      val raf = new RandomAccessFile(file, "r")
      try {
        // The magic number is stored little-endian: low byte first.
        val gzip = (raf.read() & 0xff) | ((raf.read() << 8) & 0xff00)
        gzip == GZIPInputStream.GZIP_MAGIC
      } finally {
        raf.close()
      }
    }
    
  • 2020-12-04 19:41

    Building on the answer by @biziclop - this version uses the GZIP_MAGIC header and additionally is safe for empty or single byte data streams.

    public static InputStream maybeDecompress(InputStream input) throws IOException {
        final PushbackInputStream pb = new PushbackInputStream(input, 2);
    
        int header = pb.read();
        if(header == -1) {
            return pb;
        }
    
        int b = pb.read();
        if(b == -1) {
            pb.unread(header);
            return pb;
        }
    
        pb.unread(new byte[]{(byte)header, (byte)b});
    
        header = (b << 8) | header;
    
        if(header == GZIPInputStream.GZIP_MAGIC) {
            return new GZIPInputStream(pb);
        } else {
            return pb;
        }
    }
    
  • 2020-12-04 19:47

    SimpleMagic is a Java library for resolving content types:

    <!-- pom.xml -->
        <dependency>
            <groupId>com.j256.simplemagic</groupId>
            <artifactId>simplemagic</artifactId>
            <version>1.8</version>
        </dependency>
    

    import com.j256.simplemagic.ContentInfo;
    import com.j256.simplemagic.ContentInfoUtil;
    import com.j256.simplemagic.ContentType;
    // ...
    
    public class SimpleMagicSmokeTest {
    
        private final static Logger log = LoggerFactory.getLogger(SimpleMagicSmokeTest.class);
    
        @Test
        public void smokeTestSimpleMagic() throws IOException {
            ContentInfoUtil util = new ContentInfoUtil();
            InputStream possibleGzipInputStream = getGzipInputStream();
            ContentInfo info = util.findMatch(possibleGzipInputStream);
    
            log.info( info.toString() );
            assertEquals( ContentType.GZIP, info.getContentType() );
        }
    }
    