I tried the java.util.zip package, it is too slow.
Then I found LZMA SDK and 7z jbinding but they are also lacking something.
The LZMA SDK does not provide a ki
I have not benchmarked the speed but with java 7 or greater, I extract a file as follows.
I would imagine that it's faster than the ZipFile API:
A short example extracting META-INF/MANIFEST.MF
from a zip file test.zip
:
// file to extract from zip file
String file = "MANIFEST.MF";
// location to extract the file to
File outputLocation = new File("D:/temp/", file);
// path to the zip file
Path zipFile = Paths.get("D:/temp/test.zip");
// load zip file as filesystem
try (FileSystem fileSystem = FileSystems.newFileSystem(zipFile)) {
// copy file from zip file to output location
Path source = fileSystem.getPath("META-INF/" + file);
Files.copy(source, outputLocation.toPath());
}
What does your code with java.util.zip
look like and how big of a zip file are you dealing with?
I'm able to extract a 4MB entry out of a 200MB zip file with 1,800 entries in roughly a second with this:
OutputStream out = new FileOutputStream("your.file");
FileInputStream fin = new FileInputStream("your.zip");
BufferedInputStream bin = new BufferedInputStream(fin);
ZipInputStream zin = new ZipInputStream(bin);
ZipEntry ze = null;
while ((ze = zin.getNextEntry()) != null) {
if (ze.getName().equals("your.file")) {
byte[] buffer = new byte[8192];
int len;
while ((len = zin.read(buffer)) != -1) {
out.write(buffer, 0, len);
}
out.close();
break;
}
}
Use a ZipFile rather than a ZipInputStream.
Although the documentation does not indicate this (it's in the docs for JarFile
), it should use random-access file operations to read the file. Since a ZIPfile contains a directory at a known location, this means a LOT less IO has to happen to find a particular file.
Some caveats: to the best of my knowledge, the Sun implementation uses a memory-mapped file. This means that your virtual address space has to be large enough to hold the file as well as everything else in your JVM. Which may be a problem for a 32-bit server. On the other hand, it may be smart enough to avoid memory-mapping on 32-bit, or memory-map just the directory; I haven't tried.
Also, if you're using multiple files, be sure to use a try
/finally
to ensure that the file is closed after use.