Question
Below is some code which extracts a file from a zip file containing only a single file. However, the extracted file does not match the same file extracted via WinZip or another zip utility. I would expect it to be off by one byte if the file contains an odd number of bytes (because my buffer is size 2 and I just abort once the read fails). However, when comparing (using WinMerge or Diff) the file extracted with the code below against the file extracted via WinZip, there are several areas where bytes are missing from the Java extraction. Does anyone know why this happens or how I can fix it?
    package zipinputtest;

    import java.io.BufferedOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.zip.ZipInputStream;

    public class test2 {
        public static void main(String[] args) {
            try {
                ZipInputStream zis = new ZipInputStream(new FileInputStream("C:\\temp\\sample3.zip"));
                File outputfile = new File("C:\\temp\\sample3.bin");
                OutputStream os = new BufferedOutputStream(new FileOutputStream(outputfile));
                byte[] buffer2 = new byte[2];
                zis.getNextEntry();
                while(true) {
                    if(zis.read(buffer2) != -1) {
                        os.write(buffer2);
                    }
                    else break;
                }
                os.flush();
                os.close();
                zis.close();
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
I was able to produce the error using this image (save it and zip as sample3.zip and run the code on it), but any binary file of sufficient size should show the discrepancies.
Answer 1:
    while (true) {
        if(zis.read(buffer2) != -1) {
            os.write(buffer2);
        }
        else break;
    }
The usual problem: you're ignoring the count returned by read(), so write() emits the whole buffer even when only part of it was filled. It should be:
    int count;
    while ((count = zis.read(buffer2)) != -1)
    {
        os.write(buffer2, 0, count);
    }
NB:
- A buffer size of 2 is ridiculous. Use 8192 or more.
- flush() before close() is redundant.
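Putting these points together, a corrected version of the extraction might look like the sketch below. The class name ZipExtractSketch and the helpers copy and extractFirstEntry are hypothetical names chosen for illustration, and the sketch assumes, as in the question, that only the first entry of the archive is of interest.

```java
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipExtractSketch {

    // Copies everything from in to out, writing only the bytes each read() returned.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int count;
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
            total += count;
        }
        return total;
    }

    // Extracts the first entry of the archive at zipPath into the file at outPath.
    public static long extractFirstEntry(String zipPath, String outPath) throws IOException {
        // try-with-resources closes both streams; closing the buffered stream also flushes it.
        try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zipPath));
             OutputStream os = new BufferedOutputStream(new FileOutputStream(outPath))) {
            ZipEntry entry = zis.getNextEntry();
            if (entry == null) {
                throw new IOException("No entries in " + zipPath);
            }
            return copy(zis, os);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: ZipExtractSketch <archive.zip> <output file>");
            return;
        }
        long n = extractFirstEntry(args[0], args[1]);
        System.out.println("Extracted " + n + " bytes");
    }
}
```

For the files in the question, it would be invoked with C:\temp\sample3.zip and C:\temp\sample3.bin as the two arguments.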
Answer 2:
You can use a more explicit way to check whether all bytes are read and written, e.g. a method like this one, which returns the total number of bytes copied:
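    // BUFFER_SIZE is not defined in the original answer; 8192 is an assumed value.
    private static final int BUFFER_SIZE = 8192;

    public int extract(ZipInputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        int total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
            out.write(buffer, 0, read);
        }
        return total;
    }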
If the read count is not passed to write(), then write() emits the entire contents of the buffer, which is incorrect whenever the buffer was not completely filled.
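To make the effect concrete, here is a small self-contained sketch. OneByteInputStream is a hypothetical wrapper written for this illustration: it deliberately returns at most one byte per read() call, which the InputStream contract permits. The actual pattern of short reads from ZipInputStream depends on the compressed data and is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ShortReadDemo {

    // Hypothetical stream that hands back at most one byte per bulk read.
    public static class OneByteInputStream extends FilterInputStream {
        public OneByteInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 1));
        }
    }

    // The loop from the question: ignores the count and writes the whole buffer.
    public static byte[] copyIgnoringCount(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[2];
        while (in.read(buffer) != -1) {
            out.write(buffer);
        }
        return out.toByteArray();
    }

    // The corrected loop: writes only the bytes that read() reported.
    public static byte[] copyUsingCount(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[2];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] src = "abcdef".getBytes(StandardCharsets.US_ASCII);
        byte[] wrong = copyIgnoringCount(new OneByteInputStream(new ByteArrayInputStream(src)));
        byte[] right = copyUsingCount(new OneByteInputStream(new ByteArrayInputStream(src)));
        System.out.println("input: " + src.length + " bytes");
        System.out.println("ignoring count: " + wrong.length + " bytes");
        System.out.println("using count: " + right.length + " bytes");
    }
}
```

With the six-byte input, the loop that ignores the count emits 12 bytes, because every one-byte read is followed by a two-byte write; the loop that uses the count reproduces the input exactly.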
The OutputStream can be flushed and closed inside or outside the extract() method. Calling close() should be enough, since it also calls flush().
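As a quick check of that last point, the sketch below writes through a BufferedOutputStream into an in-memory sink and never calls flush() explicitly. The class and method names are hypothetical, and the sink is a ByteArrayOutputStream only so that the result is easy to inspect.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CloseFlushesDemo {

    // Writes data through a buffered stream, closes it without an explicit flush(),
    // and reports how many bytes reached the underlying sink.
    public static int bytesVisibleAfterClose(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream os = new BufferedOutputStream(sink)) {
            os.write(data);
        } // close() runs here and flushes the buffered bytes into sink
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100];
        System.out.println(bytesVisibleAfterClose(data) + " of " + data.length + " bytes reached the sink");
    }
}
```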
In any case, the "standard" I/O code of Java, like the java.util.zip package, has been tested and used extensively, so it is highly unlikely that it has a bug so fundamental as to cause bytes to be missed so easily.
Source: https://stackoverflow.com/questions/43508690/java-zipinputstream-extraction-errors