What is the fastest way to extract one file from a zip file that contains a lot of files?

忘掉有多难 2021-02-02 01:08

I tried the java.util.zip package, but it is too slow.

Then I found the LZMA SDK and 7-Zip-JBinding, but they are also lacking something.

The LZMA SDK does not provide a ki

3 Answers
  • 2021-02-02 01:31

    I have not benchmarked it, but with Java 7 or later I extract a file through the zip file system provider, as follows.
    I would imagine that it is faster than the ZipFile API.

    A short example extracting META-INF/MANIFEST.MF from a zip file test.zip:

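    import java.io.File;
    import java.nio.file.FileSystem;
    import java.nio.file.FileSystems;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    // file to extract from zip file
    String file = "MANIFEST.MF";
    // location to extract the file to
    File outputLocation = new File("D:/temp/", file);
    // path to the zip file
    Path zipFile = Paths.get("D:/temp/test.zip");

    // load zip file as filesystem; the (ClassLoader) null argument selects the
    // overload available since Java 7 (the one-argument form needs Java 13+)
    try (FileSystem fileSystem = FileSystems.newFileSystem(zipFile, (ClassLoader) null)) {
        // copy file from zip file to output location
        Path source = fileSystem.getPath("META-INF/" + file);
        // throws FileAlreadyExistsException if the output file already exists
        Files.copy(source, outputLocation.toPath());
    }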
    
  • 2021-02-02 01:43

    What does your code with java.util.zip look like and how big of a zip file are you dealing with?

    I'm able to extract a 4MB entry out of a 200MB zip file with 1,800 entries in roughly a second with this:

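    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;

    // try-with-resources closes the zip stream even if reading fails
    try (ZipInputStream zin = new ZipInputStream(
            new BufferedInputStream(new FileInputStream("your.zip")))) {
        ZipEntry ze;
        // entries are read sequentially until the wanted one is found
        while ((ze = zin.getNextEntry()) != null) {
            if (ze.getName().equals("your.file")) {
                // open the output only once the entry has been found
                try (OutputStream out = new FileOutputStream("your.file")) {
                    byte[] buffer = new byte[8192];
                    int len;
                    while ((len = zin.read(buffer)) != -1) {
                        out.write(buffer, 0, len);
                    }
                }
                break;
            }
        }
    }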
    
  • 2021-02-02 01:47

    Use a ZipFile rather than a ZipInputStream.

    Although the ZipFile documentation does not say so (it is mentioned in the docs for JarFile), it should use random-access file operations to read the archive. Since a ZIP file contains a central directory at a known location (the end of the archive), far less I/O has to happen to find a particular entry.

    Some caveats: to the best of my knowledge, the Sun implementation uses a memory-mapped file. This means that your virtual address space has to be large enough to hold the file as well as everything else in your JVM, which may be a problem for a 32-bit server. On the other hand, it may be smart enough to avoid memory-mapping on 32-bit, or to memory-map just the directory; I haven't tried.

    Also, if you're working with multiple zip files, be sure to use try/finally to ensure that each file is closed after use.
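
    A minimal sketch of the ZipFile approach described above, assuming Java 7 or later. The class name SingleEntryExtractor and the method extractEntry are hypothetical, chosen only for illustration. It looks the entry up by name, so earlier entries do not have to be read through first, and uses try-with-resources in place of an explicit try/finally. It has not been benchmarked here.

    ```java
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipFile;

    public class SingleEntryExtractor {

        // Hypothetical helper: copies one named entry out of a zip archive.
        // Returns false if the archive has no entry with that name.
        public static boolean extractEntry(Path zipPath, String entryName, Path target)
                throws IOException {
            // try-with-resources closes the ZipFile even if the copy fails
            try (ZipFile zip = new ZipFile(zipPath.toFile())) {
                // lookup by name; returns null when the entry does not exist
                ZipEntry entry = zip.getEntry(entryName);
                if (entry == null) {
                    return false;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    // overwrite the target if it already exists
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                return true;
            }
        }

        public static void main(String[] args) throws IOException {
            if (args.length != 3) {
                System.err.println("usage: SingleEntryExtractor <zip> <entry> <target>");
                return;
            }
            boolean found = extractEntry(Paths.get(args[0]), args[1], Paths.get(args[2]));
            System.out.println(found ? "extracted " + args[1] : "no such entry: " + args[1]);
        }
    }
    ```

    For example, extractEntry(Paths.get("D:/temp/test.zip"), "META-INF/MANIFEST.MF", Paths.get("D:/temp/MANIFEST.MF")) would copy the manifest used in the first answer, with those paths standing in for your own.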
