Question
How can I open a file in UTF-8 and write to another file in UTF-16?
I need an example because I'm having issues with some characters like 'é' and 'a'.
When I write "médic", the output file contains "m@#dic" instead.
Answer 1:
You can create a reader as follows:
InputStream is = new FileInputStream(inputFile);
InputStreamReader in = new InputStreamReader(is, "UTF-8");
and a writer as follows:
OutputStream os = new FileOutputStream(outputFile);
OutputStreamWriter out = new OutputStreamWriter(os, "UTF-16");
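Putting the reader and the writer together, a minimal sketch of the whole copy could look like the following. The class name Utf8ToUtf16 and the helper convert are made up for illustration, and it assumes the input really is UTF-8. Note that Java's "UTF-16" charset writes big-endian bytes preceded by a byte order mark.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Utf8ToUtf16 {
    // Hypothetical helper: decodes src as UTF-8 and re-encodes it into dst as UTF-16.
    static void convert(Path src, Path dst) throws IOException {
        // try-with-resources closes (and therefore flushes) both streams.
        try (Reader in = new InputStreamReader(Files.newInputStream(src), StandardCharsets.UTF_8);
             Writer out = new OutputStreamWriter(Files.newOutputStream(dst), StandardCharsets.UTF_16)) {
            char[] buf = new char[4096];
            int n;
            // Copy decoded chars, not raw bytes, so the writer can re-encode them.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: java Utf8ToUtf16 <input file> <output file>
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Passing the charset explicitly on both sides is the point here: if either side falls back to the platform default, characters such as 'é' are likely to come out wrong.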
Answer 2:
Do this:
try (
    final BufferedReader reader = Files.newBufferedReader(srcpath,
        StandardCharsets.UTF_8);
    final BufferedWriter writer = Files.newBufferedWriter(dstpath,
        StandardCharsets.UTF_16BE);
) {
    final char[] buf = new char[4096];
    int nrChars;
    while ((nrChars = reader.read(buf)) != -1)
        writer.write(buf, 0, nrChars);
    writer.flush();
}
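NOTE: I chose big-endian UTF-16, since you didn't say which byte order you wanted. If you want little-endian, use UTF_16LE instead. Neither UTF_16BE nor UTF_16LE writes a byte order mark; StandardCharsets.UTF_16 writes big-endian with one.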
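Also, if the input starts with a BOM and you want to skip it, just call:
reader.read();
before the loop that writes the chars. The BOM is a single code point which happens to be in the BMP, so it is exactly one char and a single read() consumes it. If the file has no BOM, this call drops the first real character instead.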
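If you can't be sure the input starts with a BOM, a guarded variant may be safer. This is only a sketch: BomSkipper and skipBom are hypothetical names, and it relies on BufferedReader supporting mark() and reset().

```java
import java.io.BufferedReader;
import java.io.IOException;

class BomSkipper {
    // Hypothetical helper: consumes a leading U+FEFF if there is one,
    // otherwise rewinds so that no real character is lost.
    static void skipBom(BufferedReader reader) throws IOException {
        reader.mark(1);            // remember the position; we read at most one char
        int first = reader.read(); // -1 on an empty stream
        if (first != 0xFEFF) {
            reader.reset();        // not a BOM: put the character back
        }
    }
}
```

Call skipBom(reader) once, right after opening the reader and before the copy loop.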
Answer 3:
Adding to what fge said in his comment, I don't think changing the encoding when you write the file out is your problem. My guess is that the file you're reading isn't in UTF-8. Open that file with an editor like PsPad in hex mode and look at the first two or three bytes for a byte order mark (BOM). If it has the UTF-8 BOM (EF BB BF), then I'm wrong. If it has no BOM at all, the file may be in the OS's default encoding rather than UTF-8. In that case you can usually work out the encoding by finding a character outside the ASCII range and looking at what its bytes actually are.
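If you don't have a hex editor at hand, the same check can be sketched in Java. BomSniffer, describeBom and toHex are hypothetical names, and the program only reports what the first bytes look like; it does not detect the encoding for you.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

class BomSniffer {
    // Hypothetical helper: names the BOM at the start of head, if any.
    static String describeBom(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8 BOM";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE BOM";
        }
        // FF FE followed by 00 00 would be UTF-32LE; this sketch does not distinguish that.
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE BOM";
        }
        return "no BOM";
    }

    // Hypothetical helper: formats bytes as space-separated upper-case hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: java BomSniffer <file>; reads at most the first 16 bytes.
        byte[] head = new byte[16];
        int len = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int n;
            while (len < head.length && (n = in.read(head, len, head.length - len)) != -1) {
                len += n;
            }
        }
        head = Arrays.copyOf(head, len);
        System.out.println(toHex(head));
        System.out.println(describeBom(head));
    }
}
```

For the non-ASCII check, 'é' is a handy probe: in UTF-8 it is the two bytes C3 A9, while in ISO-8859-1 or windows-1252 it is the single byte E9.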
Source: https://stackoverflow.com/questions/27584861/how-to-open-file-in-utf-8-and-write-in-another-file-in-utf-16