GZIPInputStream reading line by line

百般思念 提交于 2019-11-27 06:08:27
erickson

The basic setup of decorators is like this:

InputStream fileStream = new FileInputStream(filename);
InputStream gzipStream = new GZIPInputStream(fileStream);
Reader decoder = new InputStreamReader(gzipStream, encoding);
BufferedReader buffered = new BufferedReader(decoder);

The key issue in this snippet is the value of encoding. This is the character encoding of the text in the file. Is it "US-ASCII", "UTF-8", "SHIFT-JIS", "ISO-8859-9", …? there are hundreds of possibilities, and the correct choice usually cannot be determined from the file itself. It must be specified through some out-of-band channel.

For example, maybe it's the platform default. In a networked environment, however, this is extremely fragile. The machine that wrote the file might sit in the neighboring cubicle, but have a different default file encoding.

Most network protocols use a header or other metadata to explicitly note the character encoding.

In this case, it appears from the file extension that the content is XML. XML includes the "encoding" attribute in the XML declaration for this purpose. Furthermore, XML should really be processed with an XML parser, not as text. Reading XML line-by-line seems like a fragile, special case.

Failing to explicitly specify the encoding is against the second commandment. Use the default encoding at your peril!

ChssPly76
GZIPInputStream gzip = new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"));
BufferedReader br = new BufferedReader(new InputStreamReader(gzip));
br.readLine();

Arumugam Mathiazhagan
BufferedReader in = new BufferedReader(new InputStreamReader(
        new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"))));

String content;

while ((content = in.readLine()) != null)

   System.out.println(content);

You can use the following method in a util class, and use it whenever necessary...

public static List<String> readLinesFromGZ(String filePath) {
    List<String> lines = new ArrayList<>();
    File file = new File(filePath);

    try (GZIPInputStream gzip = new GZIPInputStream(new FileInputStream(file));
            BufferedReader br = new BufferedReader(new InputStreamReader(gzip));) {
        String line = null;
        while ((line = br.readLine()) != null) {
            lines.add(line);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace(System.err);
    } catch (IOException e) {
        e.printStackTrace(System.err);
    }
    return lines;
}
Tamer

here is with one line

try (BufferedReader br = new BufferedReader(
        new InputStreamReader(
           new GZIPInputStream(
              new FileInputStream(
                 "F:/gawiki-20090614-stub-meta-history.xml.gz"))))) 
     {br.readLine();}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!