Question
new BufferedReader(new InputStreamReader(
new GZIPInputStream(s3Service.getObject(bucket, objectKey).getDataInputStream())))
creates a Reader whose readLine() starts returning null
after ~100 lines if the file is larger than several MB.
Not reproducible on gzip files smaller than 1 MB.
Does anybody know how to handle this?
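The pipeline from the question can be reproduced in isolation. In this sketch an in-memory gzip payload stands in for the S3 object (the real `s3Service` is a JetS3t API and is not available here; `fakeS3Stream` and `countLines` are hypothetical helper names):

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.*;

public class GzipReadLines {
    // Gzip `lines` numbered lines in memory, standing in for
    // s3Service.getObject(bucket, objectKey).getDataInputStream().
    static InputStream fakeS3Stream(int lines) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(
                new GZIPOutputStream(raw), StandardCharsets.UTF_8)) {
            for (int i = 0; i < lines; i++) {
                w.write("line " + i + "\n");
            }
        }
        return new ByteArrayInputStream(raw.toByteArray());
    }

    // The reader pipeline from the question: count lines until
    // readLine() returns null (i.e. end of stream).
    static int countLines(InputStream in) throws IOException {
        int count = 0;
        try (BufferedReader r = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(in), StandardCharsets.UTF_8))) {
            while (r.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // A single-member gzip stream reads back completely.
        System.out.println(countLines(fakeS3Stream(200))); // 200
    }
}
```

With a local single-member stream all lines come back; the premature null shows up only against the real S3 stream, which is what the answer below addresses.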
Answer 1:
From the documentation of BufferedReader#readLine()
:
Returns:
A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached
I would say it is pretty clear what this means: the end of the file/stream has been encountered, and no more data is available.
A notable quirk of the GZIP format: multiple gzipped objects can simply be appended to one another to create a larger file with multiple members. It seems that GZIPInputStream
only reads the first of those.
That also explains why it works for "small files": those contain only one zipped object, so the whole file is read.
Note: If the GZIPInputStream
could determine non-destructively that one gzip member is over, you could just open another GZIPInputStream
on the same InputStream
and read multiple objects.
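That idea can be sketched most simply against a fully buffered byte array, where the end of each member can be found without destroying the stream. This is only a sketch, not the answer's exact mechanism: `gunzipMembers` is a hypothetical helper, and it assumes each member has the plain 10-byte header (FLG == 0) that GZIPOutputStream writes, with no optional name/extra fields:

```java
import java.io.*;
import java.util.*;
import java.util.zip.*;

public class MultiMemberGzip {
    // Decompress every gzip member in a concatenated buffer. Assumes each
    // member starts with a fixed 10-byte header (FLG == 0), as written by
    // GZIPOutputStream; members with optional header fields would need
    // real header parsing.
    static List<byte[]> gunzipMembers(byte[] data) throws IOException {
        List<byte[]> members = new ArrayList<>();
        int pos = 0;
        while (pos < data.length) {
            pos += 10; // skip header: magic, CM, FLG, MTIME, XFL, OS
            Inflater inf = new Inflater(true); // raw deflate, no zlib wrapper
            inf.setInput(data, pos, data.length - pos);
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            byte[] tmp = new byte[8192];
            try {
                while (!inf.finished()) {
                    body.write(tmp, 0, inf.inflate(tmp));
                }
            } catch (DataFormatException e) {
                throw new IOException("corrupt gzip member", e);
            }
            // getRemaining() tells us how far the deflate data reached.
            pos = data.length - inf.getRemaining();
            inf.end();
            pos += 8; // skip the CRC32 + ISIZE trailer
            members.add(body.toByteArray());
        }
        return members;
    }

    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream concat = new ByteArrayOutputStream();
        concat.write(gzip("hello".getBytes("UTF-8")));
        concat.write(gzip("world".getBytes("UTF-8")));
        for (byte[] m : gunzipMembers(concat.toByteArray())) {
            System.out.println(new String(m, "UTF-8")); // hello, then world
        }
    }
}
```

For a live network stream the same member-by-member loop needs buffering (e.g. a PushbackInputStream) because GZIPInputStream reads ahead past the member boundary; libraries such as Apache Commons Compress offer concatenated-member decompression out of the box.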
Source: https://stackoverflow.com/questions/31275728/gzipinputstream-is-prematurely-closed-when-reading-from-s3