I am developing a new feature for my Android app to enable data backup and restore. I am using XML files to back up data. This is a piece of code that sets encoding for an ou
FileReader and other readers don't detect encoding. They just use the platform default encoding, which can be UTF-8 by coincidence. It has no relation to the actual encoding of the file.
You cannot detect an XML file's encoding until you have read enough of it to see the encoding attribute.
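To illustrate why the encoding is only known once the XML declaration has been read, here is a minimal plain-Java sketch that peeks at the first bytes of a stream and pulls out the declared encoding. The class and method names (XmlDeclarationSniffer, sniffDeclaredEncoding) are made up for this example, and it assumes an ASCII-compatible encoding such as UTF-8 or ISO-8859-1. It does not handle UTF-16 or byte order marks the way a real parser would, so treat it as an illustration rather than a replacement for the parser's own detection.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any Android or XmlPull API.
public class XmlDeclarationSniffer {
    // Matches encoding="..." or encoding='...' inside an XML declaration.
    private static final Pattern ENCODING = Pattern.compile(
            "<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the declared encoding, or null if none appears in the first 256 bytes. */
    public static String sniffDeclaredEncoding(InputStream in) throws IOException {
        byte[] head = new byte[256];
        int total = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (total < head.length) {
            int n = in.read(head, total, head.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        // ISO-8859-1 maps every byte to exactly one char, so this decode cannot fail.
        // Assumption: the declaration itself is written in ASCII-compatible bytes.
        String text = new String(head, 0, total, "ISO-8859-1");
        Matcher m = ENCODING.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><backup/>"
                .getBytes("ISO-8859-1");
        System.out.println(sniffDeclaredEncoding(new ByteArrayInputStream(xml)));
    }
}
```

Note that the answer comes from the bytes of the file itself, not from whichever charset a Reader happened to be constructed with.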
From the getInputEncoding() documentation:
if inputEncoding is null and the parser supports the encoding detection feature, it must return the detected encoding
And:
If setInput(Reader) was called, null is returned.
So it appears that pre-API 11 doesn't support detection, which is enabled by using setInput(is, null). I don't know how you are getting "UTF-8" when using setInput(reader), as the documentation says it should return null.
Then:
After first call to next if XML declaration was present this method will return encoding declared.
So on pre-API 11, you could try calling .next() first, before calling .getInputEncoding().
After some more trial and error, I've finally managed to figure out what's going on. So despite the fact that the documentation says:
Historically Android has had two implementations of this interface: KXmlParser via XmlPullParserFactory.newPullParser(). ExpatPullParser, via Xml.newPullParser().
Either choice is fine. The example in this section uses ExpatPullParser, via Xml.newPullParser().
The reality is that on older APIs, such as 2.3.3, Xml.newPullParser() returns an ExpatPullParser object, while on Ice Cream Sandwich and up it returns a KXmlParser object. And as we can see from this blog post, the Android developers have known about this since December 2011:
In Ice Cream Sandwich we changed Xml.newPullParser() to return a KxmlParser and deleted our ExpatPullParser class.
...but never bothered to update the official documentation.
So how do you retrieve a KXmlParser object on APIs before Ice Cream Sandwich? Simple:
XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
XmlPullParser parser = factory.newPullParser();
...in fact this works on all versions of Android, new and old. Then you supply a FileInputStream to your parser's setInput() method, leaving the default encoding null:
FileInputStream stream = new FileInputStream(file);
parser.setInput(stream, null);
After this, on API 11 and higher you can call parser.getInputEncoding() right away and it will return the correct encoding. But on pre-API 11 versions it will return null unless you call parser.next() first, as @Esailija correctly pointed out in his answer. Interestingly enough, on API 11+ calling next() has no negative effect whatsoever, so you may safely use this code on all versions:
parser.next();
String encoding = parser.getInputEncoding();
And this will correctly return "UTF-8".