I have a question about the Windows invariant culture.
Succinctly, my question is:
Does there exist any pair of characters c1 and c2 such
By looking through the answers to this question:
win32-file-name-comparison
which I asked a while back,
I found an indirect link to the following page:
http://msdn.microsoft.com/en-us/library/ms973919.aspx
It suggests that an ordinal comparison after an invariant-culture uppercase conversion is the best way to mimic what the file system does.
So I think that if I use a "case sensitive, accent sensitive" collation in the database, and uppercase the file names using the invariant culture before storing them, I should be OK.
Does anyone know if there are any problems with that?
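To make the idea concrete, here is a minimal sketch in Python rather than .NET. The names `storage_key` and `same_name` are hypothetical, and `str.upper()` is only a stand-in: it applies Unicode's full case mapping, which is not guaranteed to match .NET's invariant-culture uppercasing or the file system's own case table, so treat it as an illustration of "uppercase first, then compare ordinally" and not as a faithful port.

```python
def storage_key(filename: str) -> str:
    """Return the key that would be stored in the database.

    Assumption: str.upper() stands in for an invariant-culture
    uppercase conversion. The two are not identical for every
    character, so this is an approximation only.
    """
    return filename.upper()


def same_name(a: str, b: str) -> bool:
    """Compare two file names the way the stored keys would compare.

    Python compares str values code point by code point, which plays
    the role of an ordinal (culture-independent) comparison here.
    """
    return storage_key(a) == storage_key(b)


if __name__ == "__main__":
    print(same_name("Report.TXT", "report.txt"))  # names differ only in case
    print(same_name("turkish", "TURKİSH"))        # dotless I vs. dotted İ
```

With this scheme the database collation never has to do any case folding itself; it only ever sees keys that were already uppercased, so a binary or case-sensitive comparison is enough.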
Why not URL-encode the UTF-8 byte representation of the filename to get an ASCII version that can be converted back to Unicode easily, without any possible loss?
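A rough sketch of that suggestion, assuming Python's standard `urllib.parse` module; the helper names `encode_filename` and `decode_filename` are made up for illustration. Note that percent-encoding preserves case, so it avoids lossy storage but does not by itself make two names that differ only in case compare equal.

```python
from urllib.parse import quote, unquote


def encode_filename(name: str) -> str:
    """Percent-encode the UTF-8 bytes of a file name.

    safe="" means every character outside the unreserved set
    (ASCII letters, digits, and "_.-~") is escaped, so the result
    is pure ASCII.
    """
    return quote(name, safe="", encoding="utf-8")


def decode_filename(encoded: str) -> str:
    """Reverse encode_filename, failing loudly on invalid UTF-8."""
    return unquote(encoded, encoding="utf-8", errors="strict")


if __name__ == "__main__":
    original = "TURKİSH.txt"
    encoded = encode_filename(original)
    print(encoded)                              # ASCII-only form
    print(decode_filename(encoded) == original)  # round trip is lossless
```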
"However, I don't really have any idea what kind of mappings the invariant culture does, other than the fact that its what windows uses for comparing file names."
I didn't think Windows used the invariant culture when comparing file names. For example if my culture is English then I can name two separate files turkish and TURKİSH, but if someone's culture is Turkish then I hope Windows won't let them do that.
Why don't you convert filenames to ASCII? In your situation, can filenames contain non-ASCII characters?