I have retrieved a zip entry from a zip file like so.
InputStream input = params[0];
ZipInputStream zis = new ZipInputStream(input);
ZipEntry entry;
try {
w
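Here is an approach that does not break Unicode characters, because the bytes are decoded through a Reader instead of chunk by chunk:
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipInputStream;
final ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(content));
zis.getNextEntry(); // position the stream at the first entry; without this, read() returns -1 immediately
final InputStreamReader isr = new InputStreamReader(zis, StandardCharsets.UTF_8); // assumes UTF-8 content
final StringBuilder sb = new StringBuilder();
final char[] buffer = new char[1024];
int read;
while ((read = isr.read(buffer, 0, buffer.length)) != -1) {
    sb.append(buffer, 0, read); // append only the chars actually read, not the whole buffer
}
System.out.println(sb.toString());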
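To see why decoding through a Reader matters, here is a minimal sketch. The class and method names are made up for this example, and UTF-8 is assumed. It decodes the same bytes two ways: chunk by chunk with new String(...), and through an InputStreamReader. The chunk size of 3 is chosen on purpose so that a two-byte character straddles a chunk boundary.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class ChunkDecodingDemo {

    // Decodes every byte chunk on its own, which can cut a multibyte character in half.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            sb.append(new String(data, off, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Decodes through a Reader, whose decoder keeps its state between reads.
    static String decodeWithReader(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[chunkSize];
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int read;
            while ((read = reader.read(buffer, 0, buffer.length)) != -1) {
                sb.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "na\u00efve"; // the i with diaeresis takes two bytes in UTF-8
        byte[] data = original.getBytes(StandardCharsets.UTF_8);
        System.out.println(original.equals(decodePerChunk(data, 3)));   // false
        System.out.println(original.equals(decodeWithReader(data, 3))); // true
    }
}
```

The per-chunk version replaces each half of the split character with U+FFFD, while the Reader version returns the original text.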
I would use IOUtils from Apache Commons IO:
ZipEntry entry;
InputStream input = params[0];
ZipInputStream zis = new ZipInputStream(input);
try {
    while ((entry = zis.getNextEntry()) != null) {
        // Reads up to the end of the current entry only
        String entryAsString = IOUtils.toString(zis, StandardCharsets.UTF_8);
    }
} catch (IOException e) {
    e.printStackTrace();
}
IOUtils.closeQuietly(zis);
You have to read from the ZipInputStream:
StringBuilder s = new StringBuilder();
byte[] buffer = new byte[1024];
int read = 0;
ZipEntry entry;
while ((entry = zis.getNextEntry()) != null) {
    while ((read = zis.read(buffer, 0, 1024)) >= 0) {
        // Assumes UTF-8 text; a multibyte character split across two reads will be garbled
        s.append(new String(buffer, 0, read, StandardCharsets.UTF_8));
    }
}
When you exit the inner while loop, save the StringBuilder content and reset it (for example with s.setLength(0)) before moving on to the next entry.
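The "save, then reset" step could look roughly like the sketch below. Note that it swaps the StringBuilder for a ByteArrayOutputStream, so that each entry is decoded once as a whole and no multibyte character gets split. The class name, the method names, and the assumption that every entry is UTF-8 text are mine, not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class for illustration.
class ZipEntryTexts {

    // Returns entry name -> entry text, assuming every entry holds UTF-8 text.
    static Map<String, String> readAll(InputStream input) {
        Map<String, String> result = new LinkedHashMap<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try (ZipInputStream zis = new ZipInputStream(input)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content
                }
                int read;
                while ((read = zis.read(buffer, 0, buffer.length)) != -1) {
                    current.write(buffer, 0, read);
                }
                // Inner loop is done: save this entry's content, then reset the buffer.
                result.put(entry.getName(),
                        new String(current.toByteArray(), StandardCharsets.UTF_8));
                current.reset();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    // Builds a small two-entry archive in memory so the sketch runs without a file.
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("a.txt"));
            zos.write("first".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("b.txt"));
            zos.write("zweite \u00fc".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, String> texts = readAll(new ByteArrayInputStream(sampleZip()));
        texts.forEach((name, text) -> System.out.println(name + ": " + text));
    }
}
```

A LinkedHashMap is used so the entries come back in archive order.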
With an explicit encoding (UTF-8) and without creating intermediate strings:
import java.util.zip.ZipInputStream;
import java.util.zip.ZipEntry;
import java.io.ByteArrayOutputStream;
import static java.nio.charset.StandardCharsets.UTF_8;
String charset = "UTF-8";
try (
    // The charset given to ZipInputStream decodes entry names, not entry contents
    ZipInputStream zis = new ZipInputStream(input, UTF_8);
    ByteArrayOutputStream baos = new ByteArrayOutputStream()
) {
    byte[] buffer = new byte[1024];
    int read = 0;
    ZipEntry entry;
    // The bytes of all entries accumulate in baos and are decoded once at the end
    while ((entry = zis.getNextEntry()) != null)
        while ((read = zis.read(buffer, 0, buffer.length)) > 0)
            baos.write(buffer, 0, read);
    String content = baos.toString(charset);
}
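If you are on Java 9 or later and only need one entry, the manual buffer loop can probably be dropped in favour of InputStream.readAllBytes(), which on a ZipInputStream reads up to the end of the current entry. A minimal sketch, assuming UTF-8 content; the class and method names are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class for illustration; requires Java 9+ for readAllBytes().
class FirstEntryReader {

    // Returns the first entry's content as text, or null if the archive has no entries.
    static String readFirstEntry(InputStream input) {
        try (ZipInputStream zis = new ZipInputStream(input)) {
            if (zis.getNextEntry() == null) {
                return null;
            }
            // readAllBytes() stops at the end of the current entry, not of the archive.
            return new String(zis.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a one-entry archive in memory so the sketch runs without a file.
    static byte[] zipOf(String name, String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(text.getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = zipOf("hello.txt", "gr\u00fc\u00df dich");
        System.out.println(readFirstEntry(new ByteArrayInputStream(zip)));
    }
}
```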