Java: How can I get the encoding from an InputStream?

执笔经年 2020-12-31 10:41

I want to detect the character encoding of a stream.

First method: use InputStreamReader.

But it always returns the OS default encoding.

I         


        
2 answers
  • 2020-12-31 10:58
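    // Requires juniversalchardet:
    // import org.mozilla.universalchardet.UniversalDetector;
    // Note: this consumes and closes the stream, so reopen it to read the content.
    public String getDecoder(InputStream inputStream) {

        String encoding = null;

        try {
            byte[] buf = new byte[4096];
            UniversalDetector detector = new UniversalDetector(null);
            int nread;

            // Feed bytes to the detector until it is confident or the stream ends.
            while ((nread = inputStream.read(buf)) > 0 && !detector.isDone()) {
                detector.handleData(buf, 0, nread);
            }

            detector.dataEnd();
            // Stays null if no charset could be detected.
            encoding = detector.getDetectedCharset();
            detector.reset();

            inputStream.close();

        } catch (IOException e) {
            // Reading failed: fall through and return null.
        }

        return encoding;
    }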
    
  • 2020-12-31 11:05

    Let's summarize the situation:

    • An InputStream delivers bytes
    • *Reader classes deliver chars, decoded from bytes using some encoding
    • new InputStreamReader(inputStream) uses the platform default encoding (typically taken from the operating system)
    • new InputStreamReader(inputStream, "UTF-8") uses the given encoding (here UTF-8)
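To illustrate the list above, here is a minimal sketch showing that the same bytes produce different text depending on the charset handed to InputStreamReader. The class name `CharsetDemo` and the helper `decode` are hypothetical names chosen for this example; only standard JDK classes are used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {

    // Hypothetical helper: decodes bytes through an InputStreamReader
    // that was told explicitly which charset to use.
    static String decode(byte[] bytes, Charset charset) throws IOException {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[256];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
        byte[] bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded as UTF-8, the two bytes form a single char.
        System.out.println(decode(bytes, StandardCharsets.UTF_8).length());      // prints 1
        // Decoded as ISO-8859-1, each byte becomes its own char.
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1).length()); // prints 2
    }
}
```

The reader never inspects the bytes to guess a charset; it simply applies the one it was constructed with, which is why the no-argument form reports the default encoding.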

    So one needs to know the encoding before reading. Using a charset-detecting class first, as you did, is the right approach.
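One practical consequence is that detection consumes the stream, so the bytes have to be buffered (or the source reopened) before decoding. The sketch below buffers the whole stream, picks a charset, then decodes. The detector here is a deliberately simple stand-in that only looks for a byte order mark; it is not juniversalchardet's algorithm, and the names `DetectThenDecode`, `detectByBom`, and `readText` are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DetectThenDecode {

    // Stand-in detector (assumption): recognizes only UTF-8 and UTF-16 byte order marks.
    // A UTF-32LE BOM (FF FE 00 00) would be misread as UTF-16LE here.
    static Charset detectByBom(byte[] b, Charset fallback) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return fallback;
    }

    // Buffers the stream, detects the charset, then decodes the buffered bytes.
    static String readText(InputStream in, Charset fallback) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        byte[] bytes = out.toByteArray();

        String text = new String(bytes, detectByBom(bytes, fallback));
        // These charsets decode the BOM itself as U+FEFF, so drop it if present.
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }
}
```

With a real detector, `detectByBom` would be replaced by a call that returns the detected charset name, with the fallback used when detection yields nothing. Buffering everything in memory is only reasonable for small inputs; for large files, reopening the source after detection avoids holding the full content.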

    According to http://code.google.com/p/juniversalchardet/, the library should handle UTF-8 and UTF-16. You could open the file in the jEdit editor to verify the encoding and see whether there is a problem.
