I've got a routine that converts a file into a different format and saves it. The original data files were numbered, but my routine gives the output a filename based on an
Well, the easy thing is to use a regex and your favourite language's version of gsub to replace anything that's not a "word character." This character class would be "\w" in most languages with Perl-like regexes, or "[A-Za-z0-9]" as a simple option otherwise.
In particular, in contrast to some of the examples in other answers, you don't want to look for invalid characters to remove; you want to look for valid characters to keep. If you look for invalid characters, you're always vulnerable to the introduction of new ones. If you look only for valid ones, you might be slightly less efficient (in that you replace a character you didn't really need to), but at least you'll never be wrong.
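As a rough illustration of the keep-only-valid-characters idea, here is a minimal Python sketch. The function name `keep_valid_chars` and the exact allow-list (ASCII letters, digits and underscore) are assumptions made for the example, not something from the question.

```python
import re

# Assumption: only ASCII letters, digits and underscore count as valid.
# Everything outside this allow-list is matched by the negated class.
_NOT_ALLOWED = re.compile(r"[^A-Za-z0-9_]")


def keep_valid_chars(title: str) -> str:
    """Delete every character that is not on the allow-list."""
    return _NOT_ALLOWED.sub("", title)
```

Note that the dot is not on this allow-list, so an extension should be appended after cleaning rather than passed through the function.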
Now, if you want to make the new version as much like the old as possible, you might consider replacement. Instead of deleting, you can substitute a character or characters you know to be ok. But doing that is an interesting enough problem that it's probably a good topic for another question.
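One possible shape for the substitution variant, again only a sketch: each run of characters outside an allow-list becomes a single known-good character. The name `replace_invalid_chars`, the allow-list, the default `_` and the `"untitled"` fallback are all assumptions chosen for illustration.

```python
import re

# Assumption: letters, digits, underscore, dot and hyphen are the
# "known to be OK" set; one or more other characters form a run.
_DISALLOWED_RUN = re.compile(r"[^A-Za-z0-9_.-]+")


def replace_invalid_chars(title: str, substitute: str = "_") -> str:
    """Replace each run of disallowed characters with one substitute."""
    cleaned = _DISALLOWED_RUN.sub(substitute, title)
    # An empty input would give an empty name, so fall back to a placeholder.
    return cleaned or "untitled"
```

Collapsing a whole run into one substitute keeps the result closer to the original title than a one-for-one replacement would, at the cost of making different inputs more likely to collide.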
I did this:
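// Initialized elsewhere...
string folder;
string name;
var prepl = System.IO.Path.GetInvalidPathChars();
var frepl = System.IO.Path.GetInvalidFileNameChars();
// Replace characters that are invalid anywhere in a path.
foreach (var c in prepl)
{
    folder = folder.Replace(c, '_');
    name = name.Replace(c, '_');
}
// Replace characters that are invalid in a file name. On Windows this set
// includes '\', '/' and ':', so folder must be a single directory name here,
// not a full path.
foreach (var c in frepl)
{
    folder = folder.Replace(c, '_');
    name = name.Replace(c, '_');
}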
You can use the PathGetCharType function, the PathCleanupSpec function, or the following trick:
function IsValidFilePath(const FileName: String): Boolean;
var
  S: String;
  I: Integer;
begin
  Result := False;
  S := FileName;
  repeat
    I := LastDelimiter('\/', S);
    MoveFile(nil, PChar(S));
    if (GetLastError = ERROR_ALREADY_EXISTS) or
       (
         (GetFileAttributes(PChar(Copy(S, I + 1, MaxInt))) = INVALID_FILE_ATTRIBUTES)
         and
         (GetLastError = ERROR_INVALID_NAME)
       ) then
      Exit;
    if I > 0 then
      S := Copy(S, 1, I - 1);
  until I = 0;
  Result := True;
end;
This code splits the string into its path components and uses MoveFile to verify each one. MoveFile fails for invalid characters or reserved file names (such as 'COM1'), and returns success or ERROR_ALREADY_EXISTS for a valid file name.
PathCleanupSpec is in the Jedi Windows API under Win32API/JwaShlObj.pas
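For comparison, the same split-and-check-each-component idea can be sketched without touching the file system. This is not the MoveFile probe described above: it is a static check in Python against the commonly documented Windows naming rules, so it will not see anything that depends on the actual volume. The names `is_valid_component` and `is_valid_file_path` are made up for the example.

```python
import re

# Characters Windows rejects in a file or directory name, plus control chars.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')
# Reserved device names; they stay reserved with an extension ("NUL.txt").
_RESERVED = {"CON", "PRN", "AUX", "NUL",
             *(f"COM{i}" for i in range(1, 10)),
             *(f"LPT{i}" for i in range(1, 10))}


def is_valid_component(part: str) -> bool:
    """Check a single path component against the static rules above."""
    if part in (".", ".."):
        return True  # relative-path markers are legal components
    if not part or part.endswith((" ", ".")):
        return False  # empty names and trailing space/dot are rejected
    if _INVALID_CHARS.search(part):
        return False
    stem = part.split(".", 1)[0].upper()
    return stem not in _RESERVED


def is_valid_file_path(path: str) -> bool:
    """Split on '\\' and '/' and validate every component."""
    # Drop a leading drive specifier such as "C:" before splitting.
    rest = re.sub(r"^[A-Za-z]:", "", path)
    parts = [p for p in re.split(r"[\\/]", rest) if p]
    return bool(parts) and all(is_valid_component(p) for p in parts)
```

Because it never calls into Windows, this version runs on any platform, but it should be treated as a pre-check rather than a guarantee that the file can be created.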