java.util.zip - ZipInputStream vs. ZipFile

情书的邮戳 2021-01-05 05:42

I have some general questions regarding the java.util.zip library. What we basically do is import and export many small components. Previously these c

3 Answers
  • 2021-01-05 06:23

    I measured that just listing the files with ZipInputStream is 8 times slower than with ZipFile.

    // Time how long it takes to list matching entries via ZipFile,
    // which reads the central directory at the end of the archive.
    long t = System.nanoTime();
    ZipFile zip = new ZipFile(jarFile);
    Enumeration<? extends ZipEntry> entries = zip.entries();
    while (entries.hasMoreElements())
    {
        ZipEntry entry = entries.nextElement();

        String filename = entry.getName();
        if (!filename.startsWith(JAR_TEXTURE_PATH))
            continue;

        textureFiles.add(filename);
    }
    zip.close();
    System.out.println((System.nanoTime() - t) / 1e9); // elapsed seconds
    

    and

    // Same listing via ZipInputStream, which scans the local entry headers sequentially.
    long t = System.nanoTime();
    ZipInputStream zip = new ZipInputStream(new FileInputStream(jarFile));
    ZipEntry entry;
    while ((entry = zip.getNextEntry()) != null)
    {
        String filename = entry.getName();
        if (!filename.startsWith(JAR_TEXTURE_PATH))
            continue;

        textureFiles.add(filename);
    }
    zip.close();
    System.out.println((System.nanoTime() - t) / 1e9); // elapsed seconds
    

    (Don't run them in the same class; make two separate classes and run them separately.)

  • 2021-01-05 06:35

    Regarding Q3, experience in JENKINS-14362 suggests that zlib is not thread-safe even when operating on unrelated streams, i.e. that it has some improperly shared static state. Not proven, just a warning.
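
    If you want to play it safe given that warning, a minimal defensive sketch is to funnel all zip reads through a single shared lock, so zlib never runs concurrently even for unrelated archives. The lock object and helper name here are purely illustrative, not part of any library API:

        // Hypothetical helper: serialize all zip decompression behind one shared lock.
        private static final Object ZIP_LOCK = new Object();

        static byte[] readEntry(java.util.zip.ZipFile zip, java.util.zip.ZipEntry entry)
                throws java.io.IOException {
            synchronized (ZIP_LOCK) {
                // Only one thread at a time touches zlib via the entry's input stream.
                try (java.io.InputStream in = zip.getInputStream(entry)) {
                    return in.readAllBytes(); // Java 9+; copy through a buffer on older JDKs
                }
            }
        }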

  • 2021-01-05 06:46

    Q1: yes, the order will be the same as the order in which the entries were added.
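
    A quick way to see this is to write a small archive and read it back; the entries come out in insertion order. This is just a self-contained demo with made-up entry names:

        import java.io.*;
        import java.util.zip.*;

        public class EntryOrderDemo {
            public static void main(String[] args) throws IOException {
                File f = File.createTempFile("order-demo", ".zip");
                // Write three entries in a fixed order.
                try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(f))) {
                    for (String name : new String[] { "a.txt", "b.txt", "c.txt" }) {
                        out.putNextEntry(new ZipEntry(name));
                        out.closeEntry();
                    }
                }
                // Read them back; they are listed in the same order: a.txt, b.txt, c.txt.
                try (ZipInputStream in = new ZipInputStream(new FileInputStream(f))) {
                    ZipEntry e;
                    while ((e = in.getNextEntry()) != null)
                        System.out.println(e.getName());
                }
                f.delete();
            }
        }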

    Q2: note that, due to the structure of zip archives and to compression, neither solution is truly streaming; both do some level of buffering, and if you check the JDK sources, the implementations share most of their code. There is no real random access within the compressed content, although the central directory index does allow locating the chunks that correspond to individual entries. So I would not expect meaningful performance differences, especially since the OS caches disk blocks anyway. You may want to run a simple test case to verify this yourself.

    Q3: I would not count on it; most likely they are not thread-safe. If you really think concurrent access would help (it might, since decompression is CPU-bound), I'd try reading the whole file into memory, exposing it via a ByteArrayInputStream, and constructing multiple independent readers, as sketched below.
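
    A rough sketch of that idea, assuming the archive fits comfortably in memory (the file name and thread count below are placeholders):

        import java.io.*;
        import java.nio.file.*;
        import java.util.zip.*;

        public class InMemoryZipReaders {
            public static void main(String[] args) throws Exception {
                // Load the whole archive once; each reader then works on its own stream.
                byte[] bytes = Files.readAllBytes(Paths.get("components.zip"));

                Thread[] readers = new Thread[4];
                for (int i = 0; i < readers.length; i++) {
                    readers[i] = new Thread(() -> {
                        // Independent ZipInputStream per thread: no shared inflater state.
                        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(bytes))) {
                            ZipEntry entry;
                            while ((entry = zip.getNextEntry()) != null)
                                System.out.println(Thread.currentThread().getName() + ": " + entry.getName());
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    });
                    readers[i].start();
                }
                for (Thread t : readers) t.join();
            }
        }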
