I'm facing a problem.
A file can be written in some encoding such as UTF-8, UTF-16, UTF-32, etc.
When I read a UTF-
There is no good way to do that. The question you're asking is like determining the radix of a number by looking at it. For example, what is the radix of 101?
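The best solution would be to read the data into a byte array. Then you can use the String(byte[] bytes, Charset charset) constructor to test it with multiple encodings, from most likely to least likely. Keep in mind that this constructor does not throw on malformed input; it substitutes the charset's replacement string (typically U+FFFD), so a wrong guess shows up as replacement characters in the result rather than as an exception.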
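A minimal sketch of that trial-and-error approach, assuming the file has already been read into a byte array. It uses a CharsetDecoder configured with CodingErrorAction.REPORT so that a wrong guess fails with an exception instead of silently producing replacement characters. The class name EncodingGuesser and the methods tryDecode and firstThatDecodes are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of any library.
class EncodingGuesser {

    // Returns the decoded text if the bytes are well-formed in the given
    // charset, or an empty Optional if the strict decoder rejects them.
    static Optional<String> tryDecode(byte[] bytes, Charset charset) {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return Optional.of(decoder.decode(ByteBuffer.wrap(bytes)).toString());
        } catch (CharacterCodingException e) {
            return Optional.empty();
        }
    }

    // Tries the candidates in the order given (most likely first) and
    // returns the first charset that decodes the bytes without error.
    static Optional<Charset> firstThatDecodes(byte[] bytes, List<Charset> candidates) {
        for (Charset candidate : candidates) {
            if (tryDecode(bytes, candidate).isPresent()) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

The order of the candidates matters. ISO-8859-1 accepts every possible byte sequence, so it only makes sense as the last entry, and a successful decode still only means the bytes are valid in that charset, not that the text is what the author wrote.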
You cannot. Which transformation format applies is usually indicated by the first two to four bytes of the file (assuming a BOM is present). You cannot see those just from the outside.
You can read the first few bytes and try to guess the encoding.
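As a rough sketch of guessing from the first few bytes, the following checks for the standard Unicode byte order marks. The class name BomSniffer and its methods are hypothetical, and the result is returned as a charset name rather than a Charset object so the sketch does not assume that the running JVM supports UTF-32.

```java
import java.util.Optional;

// Hypothetical helper, not part of any library.
class BomSniffer {

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the charset name implied by a leading BOM, or empty if
    // the data does not start with one.
    static Optional<String> detect(byte[] head) {
        // The UTF-32 checks come first because the UTF-32LE BOM
        // (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return Optional.of("UTF-32BE");
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return Optional.of("UTF-32LE");
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return Optional.of("UTF-8");
        if (startsWith(head, 0xFE, 0xFF)) return Optional.of("UTF-16BE");
        if (startsWith(head, 0xFF, 0xFE)) return Optional.of("UTF-16LE");
        return Optional.empty();
    }
}
```

An empty result is common, since many files (UTF-8 files in particular) are written without a BOM. The bytes FF FE 00 00 are also ambiguous in principle: they could be a UTF-16LE BOM followed by a NUL character, so even a match is a guess.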
If all else fails, try reading with different encodings until one works (no exception when decoding and it 'looks' OK).