My input file(f) has some Unicode (Swedish) that isn\'t being read correctly.
Neither of these approaches works, although they give different results:
I assume that you mean 'UTF-8' when you say 'Unicode'.
If you know that the file is UTF-8, then do
LoadFromFile(f, TEncoding.UTF8).
To save:
SaveToFile(f, TEncoding.UTF8);
(The GetOEMCP
WinAPI function is for old 255-character character sets.)
In order to load a Unicode text file you need to know its encoding. If the file has a Byte Order Mark (BOM), then you can simply call LoadFromFile(FileName)
and the RTL will use the BOM to determine the encoding.
If the file does not have a BOM then you need to explicitly specify the encoding, e.g.
LoadFromFile(FileName, TEncoding.UTF8);
LoadFromFile(FileName, TEncoding.Unicode);//UTF-16 LE
LoadFromFile(FileName, TEncoding.BigEndianUnicode);//UTF-16 BE
For some reason, unknown to me, there is no built in support for UTF-32, but if you had such a file then it would be easy enough to add a TEncoding
instance to handle that.