Question
When I try to write some UTF-8 data to a file, I end up with garbage in the file. The code is as follows:
public static boolean saveToFile(StringBuffer buffer,
                                 String fileName,
                                 ArrayList exceptionList,
                                 String className)
{
    log.debug("In saveToFile for file [" + fileName + "]");
    RandomAccessFile raf = null;
    File file = new File(fileName);
    File backupFile = new File(fileName+"_bck");
    try
    {
        if (file.exists())
        {
            if (backupFile.exists())
            {
                backupFile.delete();
            }
            file.renameTo(backupFile);
        }
        raf = new RandomAccessFile(file, "rw");
        raf.writeBytes(buffer.toString());
        raf.close();
The output of buffer.toString() is:
<?xml version="1.0" encoding="UTF-8"?>
<ivr>
<version>1.1</version>
<templateName>αβγδεζη
The data in the file, however, is:
<?xml version="1.0" encoding="UTF-8"?>
<ivr>
<version>1.1</version>
<templateName>▒▒▒▒▒▒▒</templateName>
How can I make sure that the data in the file itself is UTF-8?
Answer 1:
I'm not surprised you get garbage:
raf.writeBytes(buffer.toString())
The documentation for RandomAccessFile.writeBytes(String) says (emphasis added):
Writes the string to the file as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits.
That operation yields a correct UTF-8 file only when every character is plain ASCII (U+0000 to U+007F). For anything else, such as the Greek letters in your sample, the high bits are lost and the output is garbage. That writeBytes() method is a foolish design by the Java developers. You need to correctly encode your text as bytes in UTF-8, and then write those bytes.
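To make the difference concrete, here is a minimal sketch that writes the same one-character string both ways and dumps the bytes that land in the file. The class and helper names (WriteBytesDemo, viaWriteBytes, viaUtf8, hex) are made up for illustration, and the temporary file stands in for the real target file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class WriteBytesDemo {

    // Writes text with writeBytes(), which keeps only the low 8 bits of each char,
    // and returns the bytes that actually ended up in the file.
    static byte[] viaWriteBytes(String text) {
        return writeAndReadBack(text, false);
    }

    // Encodes the text as UTF-8 first, then writes the encoded bytes.
    static byte[] viaUtf8(String text) {
        return writeAndReadBack(text, true);
    }

    // Hypothetical helper: writes to a temporary file and reads the raw bytes back.
    private static byte[] writeAndReadBack(String text, boolean encodeAsUtf8) {
        try {
            Path tmp = Files.createTempFile("raf-demo", ".txt");
            try {
                try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "rw")) {
                    if (encodeAsUtf8) {
                        raf.write(text.getBytes(StandardCharsets.UTF_8));
                    } else {
                        raf.writeBytes(text);
                    }
                }
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String alpha = "\u03b1"; // Greek small letter alpha, as in the question's sample
        System.out.println(hex(viaWriteBytes(alpha))); // B1 (high byte 0x03 discarded)
        System.out.println(hex(viaUtf8(alpha)));       // CE B1 (valid UTF-8)
    }
}
```

The single byte B1 is not a valid UTF-8 sequence on its own, which is why a viewer shows a placeholder glyph for each Greek letter.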
Do you really need to operate on the file as a random access file? If not, just manipulate it with a Writer wrapping an OutputStream (for example, an OutputStreamWriter constructed with UTF-8 as its charset).
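If random access is not needed, the Writer approach could look roughly like the sketch below. The names Utf8WriterDemo, saveUtf8 and roundTrip are hypothetical, and the backup and error-reporting logic from the question is deliberately left out.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class Utf8WriterDemo {

    // Hypothetical helper: writes the content as UTF-8, replacing any existing file content.
    static void saveUtf8(CharSequence content, File file) {
        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.append(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: saves to a temporary file and decodes it again as UTF-8.
    static String roundTrip(CharSequence content) {
        try {
            File tmp = File.createTempFile("utf8-writer", ".xml");
            try {
                saveUtf8(content, tmp);
                return new String(Files.readAllBytes(tmp.toPath()), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp.toPath());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuffer buffer = new StringBuffer("<templateName>\u03b1\u03b2\u03b3</templateName>");
        System.out.println(roundTrip(buffer).contentEquals(buffer)); // true
    }
}
```

Passing the charset explicitly matters here: an OutputStreamWriter created without one falls back to the platform default, which may not be UTF-8.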
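You could use Charset.encode(String) to produce a ByteBuffer holding the encoded bytes, then write those bytes to the file. Write only the buffer's valid region, since the backing array returned by array() can be longer than the encoded data:
ByteBuffer bytes = StandardCharsets.UTF_8.encode(buffer.toString());
raf.write(bytes.array(), bytes.arrayOffset() + bytes.position(), bytes.remaining());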
Answer 2:
The Javadoc for RandomAccessFile says of writeBytes():
Writes the string to the file as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits. The write starts at the current position of the file pointer.
Assuming that discarding parts of your String isn't what you want, you should be using writeUTF():
Writes a string to the file using modified UTF-8 encoding in a machine-independent manner.
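One caveat is worth checking before adopting writeUTF() for an XML file. According to its Javadoc it first writes a two-byte length, as if by writeShort, and then the string in modified UTF-8, so the result is meant to be read back with readUTF() rather than opened as plain text. The sketch below dumps the bytes to show the prefix; WriteUtfDemo, viaWriteUTF and hex are names invented for this example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class WriteUtfDemo {

    // Writes the text with writeUTF() to a temporary file and returns the raw file bytes.
    static byte[] viaWriteUTF(String text) {
        try {
            Path tmp = Files.createTempFile("writeUTF", ".bin");
            try {
                try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "rw")) {
                    raf.writeUTF(text);
                }
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Two Greek letters: alpha (U+03B1) and beta (U+03B2).
        System.out.println(hex(viaWriteUTF("\u03b1\u03b2"))); // 00 04 CE B1 CE B2
    }
}
```

The leading 00 04 is the byte count, not part of the text. An XML parser would see those two bytes before the `<?xml` declaration, so for a plain UTF-8 file it is probably safer to encode with getBytes(StandardCharsets.UTF_8) and call write(byte[]).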
Source: https://stackoverflow.com/questions/24932750/how-to-write-utf8-data-to-xml-file-using-randomaccessfile