How to estimate zip file size in Java before creating it

时光取名叫无心 2021-01-08 00:27

I have a requirement to create a zip file from a list of available files. The files are of different types, such as txt, pdf, xml, etc. I am using java util clas

7 answers
  • 2021-01-08 01:03

    I did this once on a project with known input types. We knew that, generally speaking, our data compressed around 5:1 (it was all text). So, I'd check the file size and divide by 5...

    In this case, the purpose for doing so was to check that files would likely be below a certain size. We only needed a rough estimate.
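
    A minimal sketch of that rule of thumb, assuming the input sizes are already known. The class and method names are hypothetical, and the 5:1 ratio is only the figure quoted above for text-heavy data, so it should be measured for your own files.

    ```java
    import java.util.List;

    final class RatioEstimate {
        /** Rough archive size: total input bytes divided by an assumed compression ratio. */
        static long estimateZippedBytes(List<Long> fileSizes, double assumedRatio) {
            long total = 0;
            for (long size : fileSizes) {
                total += size;
            }
            return Math.round(total / assumedRatio);
        }

        /** True when the rough estimate stays at or below the limit. */
        static boolean likelyFits(List<Long> fileSizes, double assumedRatio, long limitBytes) {
            return estimateZippedBytes(fileSizes, assumedRatio) <= limitBytes;
        }
    }
    ```

    With sizes taken from File.length(), likelyFits(sizes, 5.0, limit) gives the same kind of rough check: good enough to flag inputs that are clearly too large, not a guarantee.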

    All that said, I have noticed zip applications like 7zip will create a zip file of a certain size (like a CD) and then split the zip off to a new file once it reaches the limit. You could look at that source code. I have actually used the command line version of that app in code before. They have a library you can use as well. Not sure how well that will integrate with Java though.

    For what it is worth, I've also used a library called SharpZipLib. It was very good. I wonder if there is a Java port of it.

  • 2021-01-08 01:06

    Maybe you could add a file each time, until you reach the 5MB limit, and then discard the last file. Like @Gopi, I don't think there is any way to estimate it without actually compressing the file.

    Of course, zipping will not make the files larger (or maybe a little, because of the zip headers?), so the total uncompressed size at least gives you a "worst case" estimate.
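
    To put a number on that "worst case", here is a rough sketch. The record sizes are the fixed header lengths of the ZIP format as I understand them (30-byte local header, 46-byte central directory header, 22-byte end record, 16-byte data descriptor for streamed entries); the class name is hypothetical. Deflate can still add a few bytes per block to incompressible data, so treat the result as an approximate ceiling and leave some margin.

    ```java
    import java.nio.charset.StandardCharsets;
    import java.util.Map;

    final class ZipUpperBound {
        // Fixed record sizes, before file names, extra fields and comments.
        static final int LOCAL_HEADER = 30;
        static final int CENTRAL_HEADER = 46;
        static final int END_RECORD = 22;
        static final int DATA_DESCRIPTOR = 16; // follows the data of a streamed entry

        /** Sum of the uncompressed sizes plus per-entry bookkeeping. */
        static long worstCaseBytes(Map<String, Long> sizesByName) {
            long total = END_RECORD;
            for (Map.Entry<String, Long> e : sizesByName.entrySet()) {
                // The entry name is stored twice: in the local and in the central header.
                int nameLen = e.getKey().getBytes(StandardCharsets.UTF_8).length;
                total += LOCAL_HEADER + nameLen + e.getValue() + DATA_DESCRIPTOR
                        + CENTRAL_HEADER + nameLen;
            }
            return total;
        }
    }
    ```

    If this figure is already under the limit, no compression run is needed to know the archive will fit.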

  • 2021-01-08 01:19

    +1 for Colin Herbert: add files one by one, and if the archive gets too big, either go back to the previous step or remove the last file. I just want to add some details:

    Prediction is way too unreliable. E.g. a PDF can contain uncompressed text and compress down to 30% of the original, or it can contain already-compressed text and images and only compress to 80%. You would need to inspect the entire PDF for compressibility, which basically means compressing it.

    You could try a statistical prediction, which could reduce the number of failed attempts, but you would still have to implement the recommendation above. Go with the simpler implementation first, and see if it's enough.

    Alternatively, compress files individually, then pick the files that won't exceed 5 MB if bound together. If unpacking is automated, too, you could bind the zip files into a single uncompressed zip file.
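
    The last idea can be sketched as follows, with everything kept in memory for brevity. The class and method names are hypothetical, and the greedy pick is just one simple way to choose the files. Note that the limit is checked against the inner archives only; the outer wrapper adds a small header per entry on top.

    ```java
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.zip.CRC32;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    final class PackIndividually {
        /** Zips one file's bytes into its own small in-memory archive. */
        static byte[] zipSingle(String name, byte[] data) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(out)) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write(data);
                zos.closeEntry();
            }
            return out.toByteArray();
        }

        /** Greedily keeps the per-file archives whose combined size stays within limitBytes. */
        static Map<String, byte[]> pickWithinLimit(Map<String, byte[]> files, long limitBytes)
                throws IOException {
            Map<String, byte[]> picked = new LinkedHashMap<>();
            long used = 0;
            for (Map.Entry<String, byte[]> e : files.entrySet()) {
                byte[] zipped = zipSingle(e.getKey(), e.getValue());
                if (used + zipped.length <= limitBytes) {
                    picked.put(e.getKey() + ".zip", zipped);
                    used += zipped.length;
                }
            }
            return picked;
        }

        /** Binds the picked archives into one outer archive without recompressing them. */
        static byte[] bindStored(Map<String, byte[]> zippedFiles) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(out)) {
                for (Map.Entry<String, byte[]> e : zippedFiles.entrySet()) {
                    byte[] bytes = e.getValue();
                    CRC32 crc = new CRC32();
                    crc.update(bytes);
                    ZipEntry entry = new ZipEntry(e.getKey());
                    entry.setMethod(ZipEntry.STORED); // STORED needs size and CRC up front
                    entry.setSize(bytes.length);
                    entry.setCompressedSize(bytes.length);
                    entry.setCrc(crc.getValue());
                    zos.putNextEntry(entry);
                    zos.write(bytes);
                    zos.closeEntry();
                }
            }
            return out.toByteArray();
        }
    }
    ```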

  • 2021-01-08 01:21

    Just wanted to share how we implemented this manually:

        long totalFileSize = 0;
        boolean toBeZipped = false;
        int maxSizeForAllFiles = 70000; // read from a property
        int sizePerFile = 22000; // read from a property

        // Check every attachment to decide whether zipping is required
        // (inputAttachmentList and logger are defined elsewhere).
        for (String attachFile : inputAttachmentList) {
            File file = new File(attachFile);
            totalFileSize += file.length();
            // A single file at or above the per-file limit forces zipping.
            if (file.length() >= sizePerFile) {
                toBeZipped = true;
                logger.info("File: " + attachFile
                        + " Size: " + file.length()
                        + " File required to be zipped, MAX allowed per file: "
                        + sizePerFile);
                break;
            }
        }
        // Zip as well if all attachments together reach maxSizeForAllFiles.
        if (totalFileSize >= maxSizeForAllFiles) {
            toBeZipped = true;
        }
        if (toBeZipped) {
            // Zip here, iterating over all attachments
        }
    
  • 2021-01-08 01:24

    Wrap your ZipOutputStream in a custom OutputStream, named here YourOutputStream.

    • The constructor of YourOutputStream takes the real ZipOutputStream (zos1) and creates a second ZipOutputStream (zos2) which wraps a new ByteArrayOutputStream (baos)
      public YourOutputStream(ZipOutputStream zos1, int maxSizeInBytes)
    • When you want to write a file with YourOutputStream, it will first write it to zos2
      public void writeFile(File file) throws ZipFileFullException
      public void writeFile(String path) throws ZipFileFullException
      etc...
    • If baos.size() is under maxSizeInBytes
      • write the file to zos1
    • else
      • close zos1, baos and zos2 and throw an exception. I can't think of an existing exception that fits; if there is one, use it, otherwise create your own ZipFileFullException extending IOException.

    You need two ZipOutputStreams: one that is written to your drive, and one to check whether your contents exceed 5MB.

    EDIT : In fact I checked, you can't remove a ZipEntry easily.

    http://download.oracle.com/javase/6/docs/api/java/io/ByteArrayOutputStream.html#size()
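
    A compact sketch of the two-stream idea above, under a few assumptions: SizeLimitedZip is a hypothetical stand-in for YourOutputStream, it does not extend OutputStream, and it takes a name and a byte array instead of a File so the example stays self-contained. The check also ignores the central directory, which is only written when the archive is closed, so leave some headroom below the real limit.

    ```java
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    final class SizeLimitedZip implements AutoCloseable {
        /** Thrown when the next entry would push the archive past the limit. */
        static final class ZipFileFullException extends IOException {
            ZipFileFullException(String message) {
                super(message);
            }
        }

        private final ZipOutputStream zos1; // the real archive
        private final ByteArrayOutputStream baos = new ByteArrayOutputStream();
        private final ZipOutputStream zos2 = new ZipOutputStream(baos); // in-memory trial copy
        private final int maxSizeInBytes;

        SizeLimitedZip(ZipOutputStream zos1, int maxSizeInBytes) {
            this.zos1 = zos1;
            this.maxSizeInBytes = maxSizeInBytes;
        }

        /** Writes to the trial copy first and only then to the real archive. */
        void writeEntry(String name, byte[] data) throws IOException {
            writeTo(zos2, name, data);
            if (baos.size() > maxSizeInBytes) {
                close(); // the real archive stays valid, without the rejected entry
                throw new ZipFileFullException(name + " does not fit in " + maxSizeInBytes + " bytes");
            }
            writeTo(zos1, name, data);
        }

        private static void writeTo(ZipOutputStream zos, String name, byte[] data) throws IOException {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(data);
            zos.closeEntry();
        }

        @Override
        public void close() throws IOException {
            zos2.close();
            zos1.close();
        }
    }
    ```

    Because a ZipEntry cannot easily be removed once written, the trial copy is what makes it possible to reject an entry before it reaches the real archive. The price is that every file is compressed twice and the trial copy holds the whole archive in memory.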

  • 2021-01-08 01:27

    I don't think there is any way to estimate the size of the zip that will be created, because zips are processed as streams. It is also not technically possible to predict the size of the compressed output unless you actually compress the data.
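
    If compressing is the only reliable measurement, one way to do it without creating a file is to compress into a stream that throws the bytes away and only counts them. This is a sketch under that assumption; ZipSizeProbe and CountingSink are hypothetical names, and the inputs are byte arrays to keep the example self-contained.

    ```java
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.Map;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    final class ZipSizeProbe {
        /** Discards everything written to it and only remembers how many bytes went by. */
        static final class CountingSink extends OutputStream {
            private long count;

            @Override
            public void write(int b) {
                count++;
            }

            @Override
            public void write(byte[] b, int off, int len) {
                count += len;
            }

            long count() {
                return count;
            }
        }

        /** Runs the real compression and returns the size the archive would have. */
        static long zippedSize(Map<String, byte[]> files) throws IOException {
            CountingSink sink = new CountingSink();
            try (ZipOutputStream zos = new ZipOutputStream(sink)) {
                for (Map.Entry<String, byte[]> e : files.entrySet()) {
                    zos.putNextEntry(new ZipEntry(e.getKey()));
                    zos.write(e.getValue());
                    zos.closeEntry();
                }
            }
            return sink.count(); // includes the central directory written on close
        }
    }
    ```

    This costs a full compression pass, so it measures rather than predicts, but it needs no disk space and almost no memory for the output.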
