If I wanted to create a string which is guaranteed not to represent a filename, I could put one of the following characters in it on Windows:
\ / : * ? | <
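As a rough sketch of that idea, a check like the following tells you whether a string contains at least one character that Windows rejects in file names. The names `WINDOWS_RESERVED_CHARS` and `cannot_be_windows_filename` are made up for this example, and the set used is the commonly documented reserved set (`\ / : * ? " < > |`), which is my assumption about the full list rather than something taken verbatim from the text above.

```python
# Characters commonly documented as reserved in Windows file names.
# Assumption: this is the full printable set; control characters are ignored here.
WINDOWS_RESERVED_CHARS = set('\\/:*?"<>|')


def cannot_be_windows_filename(s: str) -> bool:
    """Return True if `s` contains a character Windows does not allow in a file name."""
    return any(ch in WINDOWS_RESERVED_CHARS for ch in s)
```

Any string for which this returns True should be safe to use as a "not a filename" marker on Windows; a False result does not prove the string is a valid name.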
An empty string is the only truly invalid path name on Linux, which may work for you if you need only one invalid name. You could also use a string like "///foo", which would not be a canonical path name, although it could refer to a file ("/foo"). Another possibility would be something like "/dev/null/foo", since /dev/null has a POSIX-defined non-directory meaning. If you only need strings that could not refer to a regular file, you could use "/" or ".", since those are always directories.
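If you want to see how these candidates behave, a small Python sketch along these lines can help. The helper `stat_error` is a hypothetical name for this example; it just reports which error `os.stat` raises for a path. It assumes a Linux system where /dev/null exists.

```python
import errno
import os
import posixpath
from typing import Optional


def stat_error(path: str) -> Optional[str]:
    """Return the symbolic errno name raised by os.stat(path), or None if it succeeds."""
    try:
        os.stat(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno)
    return None


# "///foo" is not canonical: normalising it collapses the leading slashes.
canonical = posixpath.normpath("///foo")

# /dev/null is not a directory, so nothing can live underneath it.
under_dev_null = stat_error("/dev/null/foo")

# The empty string cannot be resolved at all.
empty = stat_error("")
```

On a typical Linux box I would expect `canonical` to be "/foo", `under_dev_null` to be "ENOTDIR" and `empty` to be "ENOENT", but treat that as something to verify on your own system rather than a guarantee.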
There are almost no restrictions: apart from '/' and '\0', you're allowed to use anything. However, some people think it's not a good idea to allow this much flexibility.
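Put as code, the rule is about this small. This is only a sketch of the two restrictions mentioned here plus the empty-name case; `is_valid_linux_filename` is a name I made up, and it deliberately ignores length limits and the special entries "." and "..".

```python
def is_valid_linux_filename(name: str) -> bool:
    """Return True if `name` could be a single file name component on Linux."""
    # An empty name is never valid, '/' separates path components,
    # and the NUL character terminates the string at the system-call level.
    return name != "" and "/" not in name and "\0" not in name
```

Note that names like `a:b*c?` or one containing a newline pass this check, which is exactly the flexibility some people object to.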
I personally find that a lot of the time the problem is not Linux but the applications one is using on Linux.
Take Amarok, for example. Recently I noticed that certain artists I had copied from my Windows machine were not appearing in the library. I checked and confirmed that the files were there, and then noticed that certain characters in the folder names (named for the artist) were displayed as a weird-looking square rather than an actual character.
In a shell terminal the filenames look even stranger: /Music/Albums/Einst$'\374'rzende\ Neubauten is one example. (The \374 is the single byte 0xFC, which is ü in Latin-1, so the name was presumably not stored as UTF-8.)
While these files were definitely there, Amarok could not see them for some reason. I was able to use some shell trickery to rename them to sane versions which I could then re-name with ASCII-only characters using Musicbrainz Picard. Unfortunately, Picard was also unable to open the files until I renamed them, hence the need for a shell script.
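For anyone in the same situation, here is a rough Python sketch of that kind of rename. It is not the script I used, and it assumes the broken names are Latin-1 bytes (as the \374 above suggests) that should become UTF-8. The function names `to_utf8_name` and `rename_tree` are invented for this example, and it defaults to a dry run so nothing is touched until you have checked the plan.

```python
import os


def to_utf8_name(raw: bytes, assumed_encoding: str = "latin-1") -> bytes:
    """Return `raw` unchanged if it is valid UTF-8, else re-encode it from `assumed_encoding`."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: names that are not valid UTF-8 were written as Latin-1.
        return raw.decode(assumed_encoding).encode("utf-8")


def rename_tree(root: bytes, dry_run: bool = True) -> list:
    """Collect (old, new) byte paths under `root`; perform the renames unless dry_run."""
    planned = []
    # Walk bottom-up so entries inside a directory are handled before the directory itself.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            fixed = to_utf8_name(name)
            if fixed == name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, fixed)
            if os.path.lexists(dst):
                continue  # never overwrite an existing entry
            planned.append((src, dst))
            if not dry_run:
                os.rename(src, dst)
    return planned
```

Passing the root as bytes (for example `rename_tree(b"/Music/Albums")`) makes `os.walk` return raw byte names, which avoids Python guessing an encoding for you.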
Overall this is a tricky area, and it seems to get very thorny if you are trying to synchronise a music collection between Windows and Linux when certain folder or file names contain funky characters.
The safest thing to do is stick to ASCII-only filenames.