Poor Performance of Java's unzip utilities

Happy的楠姐 2021-02-09 02:07

I have noticed that the unzip facility in Java is extremely slow compared to using a native tool such as WinZip.

Is there a third party library available for Java that i

3 Answers
  • 2021-02-09 02:46

    The problem is not the unzipping, it's the inefficient way you write the unzipped data back to disk. My benchmarks show that using

        InputStream is = zip.getInputStream(entry); // get the input stream
        OutputStream os = new java.io.FileOutputStream(f);
        byte[] buf = new byte[4096];
        int r;
        while ((r = is.read(buf)) != -1) {
          os.write(buf, 0, r);
        }
        os.close();
        is.close();
    

    instead reduces the method's execution time by a factor of 5 (from 5 to 1 second for a 6 MB zip file).

    The likely culprit is your use of bis.available(). Aside from being incorrect (available() returns an estimate of the number of bytes that can be read without blocking, not the number remaining until the end of the stream), this bypasses the buffering provided by BufferedInputStream, requiring a native system call for every byte copied into the output file.

    Note that wrapping in a BufferedInputStream or BufferedOutputStream is not necessary if you use the bulk read and write methods as I do above, and that the code to close the resources is not exception safe (if reading or writing fails for any reason, neither is nor os would be closed). Finally, if you have Apache Commons IO on the class path, I recommend using its well-tested IOUtils.copy instead of rolling your own.
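
    As a rough sketch of an exception-safe variant of the copy loop above: try-with-resources (Java 7+) closes both streams even if a read or write throws. The names SafeCopy and extractEntry are made up for illustration and do not come from the original code.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class; only the copy loop mirrors the answer above.
final class SafeCopy {
    /** Copies one zip entry to the target file using bulk reads and writes. */
    static void extractEntry(ZipFile zip, ZipEntry entry, File target) throws IOException {
        // Both streams are closed automatically, in reverse order, even on failure.
        try (InputStream is = zip.getInputStream(entry);
             OutputStream os = new FileOutputStream(target)) {
            byte[] buf = new byte[4096];
            int r;
            while ((r = is.read(buf)) != -1) {
                os.write(buf, 0, r);
            }
        }
    }
}
```

    The caller still owns the ZipFile and is responsible for closing it, for example with its own try-with-resources block.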

  • 2021-02-09 02:52

    I have found an 'inelegant' solution. There is an open source utility, 7-Zip (www.7-zip.org), that is free to use. You can download the command line version (http://www.7-zip.org/download.html). 7-Zip is only supported on Windows, but it looks like it has been ported to other platforms (p7zip).

    Obviously this solution is not ideal since it is platform specific and relies on an executable. However, the speed compared to doing the unzip in Java is incredible.

    Here is the code for the utility function that I created to interface with this utility. There is room for improvement as the code below is Windows specific.

    /** Unpacks the zipfile to the output directory.  Note: this code relies on 7-zip 
       (specifically the cmd line version, 7za.exe).  The exeDir specifies the location of the 7za.exe utility. */
    public static void unpack(File zipFile, File outputDir, File exeDir) throws IOException, InterruptedException
    {
      if (!zipFile.exists()) throw new FileNotFoundException(zipFile.getAbsolutePath());
      if (!exeDir.exists()) throw new FileNotFoundException(exeDir.getAbsolutePath());
      if (!outputDir.exists()) outputDir.mkdirs();
    
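      // Pass each argument separately so paths containing spaces survive; no cmd.exe wrapper is needed.
      ProcessBuilder builder = new ProcessBuilder(
          new File(exeDir, "7za.exe").getAbsolutePath(), "-y", "e", zipFile.getAbsolutePath());
      builder.inheritIO(); // forward 7za output (Java 7+) so a full pipe buffer cannot stall the process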
      builder.directory(outputDir);
      Process p = builder.start();
      int rc = p.waitFor();
      if (rc != 0) {
        log.severe("Util::unpack() 7za process did not complete normally.  rc: " + rc);
      }
    }      
    
  • 2021-02-09 02:58

    Make sure you are feeding the unzip method a BufferedInputStream in your Java application. If you have made the mistake of using an unbuffered input stream your IO performance is guaranteed to suck.
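
    A minimal sketch of what that can look like, assuming the archive is read through java.util.zip.ZipInputStream (the question's code is not shown, so this is a guess at the setup). BufferedUnzip and unzip are hypothetical names. How much the buffer helps depends on the JVM and the disk, so measure before relying on it.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper class, for illustration only.
final class BufferedUnzip {
    /** Extracts every entry of zipFile below outputDir. */
    static void unzip(File zipFile, File outputDir) throws IOException {
        String root = outputDir.getCanonicalPath() + File.separator;
        // The BufferedInputStream sits between the file and the zip decoder.
        try (ZipInputStream zis = new ZipInputStream(
                new BufferedInputStream(new FileInputStream(zipFile)))) {
            byte[] buf = new byte[8192];
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                File target = new File(outputDir, entry.getName());
                // Reject entries such as "../x" that would land outside outputDir.
                if (!target.getCanonicalPath().startsWith(root)) {
                    throw new IOException("Entry outside target dir: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    target.mkdirs();
                    continue;
                }
                File parent = target.getParentFile();
                if (parent != null) {
                    parent.mkdirs();
                }
                // Bulk copy of the current entry; read() returns -1 at the entry's end.
                try (OutputStream os = new FileOutputStream(target)) {
                    int r;
                    while ((r = zis.read(buf)) != -1) {
                        os.write(buf, 0, r);
                    }
                }
            }
        }
    }
}
```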
