I need to create a mapping between file names generated on Windows and OS X. I know that OS X \"converts all file names to decomposed Unicode\" however, \"most volume forma
You're probably looking for -[NSString fileSystemRepresentation]
method.
Note that there is no general solution for this task. What is a valid file name depends on filesystem of the volume you're saving on. Not every file name valid for HFS+ is valid for FAT32, for example.
For Mac's “standard” filesystem (currently HFS+), fileSystemRepresentation
should give what you need; for other file systems, there is no general way. Think about ones that don't exist but will be introduced in the future, for example :)
According to your link, filesystem drivers appear to (mostly) follow one of two behaviours: * Return all names in NFD, and convert names as appropriate. * Don't perform any conversions.
In both these cases, if you create a file on OSX in NFD, reading it back on OSX should give you the name in NFD.
OTOH, if your filename goes from Windows → NFS → Mac and you want to do some sort of sync, you're out of luck. This is not an easy thing to do, since the underlying problem is a little philosophical: Should filenames be byte strings or Unicode strings? I believe Unix traditionally does the former, and at least in Linux, UTF-8 NFC names are merely a convention.
(It gets worse, since IIRC HFS+ is defined to use Unicode 3.something, so a naïve conversion to NFD might be wrong for characters added/changed since then unless the API you use can guarantee a specific Unicode version.)
I think the answer is this from TechNote 1150 HFS Plus Volume Format:
Note: The Mac OS Text Encoding Converter provides several constants that let you convert to and from the canonical, decomposed form stored on HFS Plus volumes. When using CreateTextEncoding to create a text encoding, you should set the TextEncodingBase to kTextEncodingUnicodeV2_0, set the TextEncodingVariant to kUnicodeCanonicalDecompVariant, and set the TextEncodingFormat to kUnicode16BitFormat. Using these values ensures that the Unicode will be in the same form as on an HFS Plus volume, even as the Unicode standard evolves.