How can I sanitize a string for use as a filename?

说谎 2020-12-23 22:56

I've got a routine that converts a file into a different format and saves it. The original datafiles were numbered, but my routine gives the output a filename based on an

9 Answers
  • 2020-12-23 23:20

    Try this on a modern Delphi:

     uses System.IOUtils;
     ...
     result := TPath.HasValidFileNameChars(FileName, False)
    

    It also allows German umlauts and other characters such as -, _, etc. in a filename. Note that this only tests the name; it does not change it.

  • 2020-12-23 23:21

    For anyone else reading this and wanting to use PathCleanupSpec, I wrote this test routine, which seems to work; there is a definite lack of examples on the net. You need to include ShlObj.pas (I am not sure when PathCleanupSpec was added, but I tested this in Delphi 2010). You will also need to check for Windows XP SP2 or higher.

    procedure TMainForm.btnTestClick(Sender: TObject);
    var
      Path: array [0..MAX_PATH - 1] of WideChar;
      Filename: array[0..MAX_PATH - 1] of WideChar;
      ReturnValue: integer;
      DebugString: string;
    
    begin
      StringToWideChar('a*dodgy%\filename.$&^abc',FileName, MAX_PATH);
      StringToWideChar('C:\',Path, MAX_PATH);
      ReturnValue:= PathCleanupSpec(Path,Filename);
      DebugString:= ('Cleaned up filename:'+Filename+#13+#10);
      if (ReturnValue and $80000000)=$80000000 then
        DebugString:= DebugString+'Fatal result. The cleaned path is not a valid file name'+#13+#10;
      if (ReturnValue and $00000001)=$00000001 then
        DebugString:= DebugString+'Replaced one or more invalid characters'+#13+#10;
      if (ReturnValue and $00000002)=$00000002 then
        DebugString:= DebugString+'Removed one or more invalid characters'+#13+#10;
      if (ReturnValue and $00000004)=$00000004 then
        DebugString:= DebugString+'The returned path is truncated'+#13+#10;
      if (ReturnValue and $00000008)=$00000008 then
        DebugString:= DebugString+'The input path specified at pszDir is too long to allow the formation of a valid file name from pszSpec'+#13+#10;
      ShowMessage(DebugString);
    end;
    
  • 2020-12-23 23:23

    Check if string has invalid chars; solution from here:

    //test if a "fileName" is a valid Windows file name
    //Delphi >= 2005 version
    
    function IsValidFileName(const fileName : string) : boolean;
    const 
      InvalidCharacters : set of char = ['\', '/', ':', '*', '?', '"', '<', '>', '|'];
    var
      c : char;
    begin
      result := fileName <> '';
    
      if result then
      begin
        for c in fileName do
        begin
          result := NOT (c in InvalidCharacters) ;
          if NOT result then break;
        end;
      end;
    end; (* IsValidFileName *)
    

    And, for strings returning False, you could do something simple like this for each invalid character:

    var
      before, after : string;
    
    begin
      before := 'i am a rogue file/name';
    
      after  := StringReplace(before, '/', '',
                          [rfReplaceAll, rfIgnoreCase]);
      ShowMessage('Before = '+before);
      ShowMessage('After  = '+after);
    end;
    
    // Before = i am a rogue file/name
    // After  = i am a rogue filename
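
    To apply that replacement to every invalid character in one go, something along these lines might do. This is only a sketch: RemoveInvalidChars is a name made up here, and it covers just the nine characters from IsValidFileName above, not control characters or reserved names.

    ```pascal
    // Requires SysUtils (System.SysUtils in newer Delphi versions) for StringReplace.
    // RemoveInvalidChars is a hypothetical helper, not an RTL function.
    function RemoveInvalidChars(const FileName: string): string;
    const
      // Same nine characters as the InvalidCharacters set in IsValidFileName.
      InvalidCharacters = '\/:*?"<>|';
    var
      i: Integer;
    begin
      Result := FileName;
      // Delete every occurrence of each invalid character in turn.
      for i := 1 to Length(InvalidCharacters) do
        Result := StringReplace(Result, InvalidCharacters[i], '', [rfReplaceAll]);
    end;
    ```

    Calling RemoveInvalidChars('i am a rogue file/name') gives the same 'i am a rogue filename' as the single replacement above.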
    
  • 2020-12-23 23:26
    {
      CleanFileName
      ---------------------------------------------------------------------------
    
      Given an input string strip any chars that would result
      in an invalid file name.  This should just be passed the
      filename not the entire path because the slashes will be
      stripped.  The function ensures that the resulting string
      does not have multiple spaces together and does not start
      or end with a space.  If the entire string is removed the
      result would not be a valid file name so an error is raised.
    
    }
    
    function CleanFileName(const InputString: string): string;
    var
      i: integer;
      ResultWithSpaces: string;
    begin
    
      ResultWithSpaces := InputString;
    
      for i := 1 to Length(ResultWithSpaces) do
      begin
        // These chars are invalid in file names.
        case ResultWithSpaces[i] of 
          '/', '\', ':', '*', '?', '"', '<', '>', '|', ' ', #$D, #$A, #9:
            // Use a * to indicate a duplicate space so we can remove
            // them at the end.
            {$WARNINGS OFF} // W1047 Unsafe code 'String index to var param'
            if (i > 1) and
              ((ResultWithSpaces[i - 1] = ' ') or (ResultWithSpaces[i - 1] = '*')) then
              ResultWithSpaces[i] := '*'
            else
              ResultWithSpaces[i] := ' ';
    
            {$WARNINGS ON}
        end;
      end;
    
      // A * indicates duplicate spaces.  Remove them.
      result := ReplaceStr(ResultWithSpaces, '*', '');
    
      // Also trim any leading or trailing spaces
      result := Trim(Result);
    
      if result = '' then
      begin
        raise(Exception.Create('Resulting FileName was empty. Input string was: '
          + InputString));
      end;
    end;
    
  • 2020-12-23 23:28

    Regarding the question whether there is any API function to sanitize a file name (or even check its validity): there seems to be none. Quoting from the comment on the PathSearchAndQualify() function:

    There does not appear to be any Windows API that will validate a path entered by the user; this is left as an ad hoc exercise for each application.

    So you can only consult the rules for file name validity from File Names, Paths, and Namespaces (Windows):

    • Use almost any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following:

      • The following reserved characters are not allowed:
        < > : " / \ | ? *
      • Characters whose integer representations are in the range from zero through 31 are not allowed.
      • Any other character that the target file system does not allow.
    • Do not use the following reserved device names for the name of a file: CON, PRN, AUX, NUL, COM1..COM9, LPT1..LPT9.
      Also avoid these names followed immediately by an extension; for example, NUL.txt is not recommended.
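
    The control-character and device-name rules above could be checked roughly as follows. This is a minimal sketch, not a complete validator: HasControlChars and IsReservedDeviceName are names invented for this example, and the device-name check simply compares everything before the first dot, ignoring case.

    ```pascal
    // Requires SysUtils (System.SysUtils in newer Delphi versions) for SameText.
    // Both functions are hypothetical helpers, not part of the RTL.

    // True if FileName contains a character in the range 0..31.
    function HasControlChars(const FileName: string): Boolean;
    var
      i: Integer;
    begin
      Result := False;
      for i := 1 to Length(FileName) do
        if Ord(FileName[i]) < 32 then
        begin
          Result := True;
          Exit;
        end;
    end;

    // True if the part of FileName before the first dot is a reserved
    // device name, so both 'NUL' and 'NUL.txt' are reported.
    function IsReservedDeviceName(const FileName: string): Boolean;
    const
      Reserved: array[0..21] of string = (
        'CON', 'PRN', 'AUX', 'NUL',
        'COM1', 'COM2', 'COM3', 'COM4', 'COM5', 'COM6', 'COM7', 'COM8', 'COM9',
        'LPT1', 'LPT2', 'LPT3', 'LPT4', 'LPT5', 'LPT6', 'LPT7', 'LPT8', 'LPT9');
    var
      BaseName: string;
      DotPos, i: Integer;
    begin
      BaseName := FileName;
      DotPos := Pos('.', BaseName);
      if DotPos > 0 then
        BaseName := Copy(BaseName, 1, DotPos - 1);
      Result := False;
      for i := Low(Reserved) to High(Reserved) do
        if SameText(BaseName, Reserved[i]) then
        begin
          Result := True;
          Exit;
        end;
    end;
    ```

    These only report a problem; what to do about it (prefix an underscore, reject the name, and so on) is left to the caller.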

    If you know that your program will only ever write to NTFS file systems you can probably be sure that there are no other characters that the file system does not allow, so you would only have to check that the file name is not too long (use the MAX_PATH constant) after all invalid chars have been removed (or replaced by underscores, for example).

    A program should also make sure that sanitizing has not led to file name conflicts, so that it does not silently overwrite other files that ended up with the same name.
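
    One way to avoid such overwrites is to append a counter until the name is free. A rough sketch, assuming a numeric suffix is acceptable; MakeUniqueFileName is a made-up name, and another process could still create the file between this check and the actual write.

    ```pascal
    // Requires SysUtils (System.SysUtils in newer Delphi versions) for
    // FileExists, ExtractFileExt, ChangeFileExt and IntToStr.
    // MakeUniqueFileName is a hypothetical helper, not an RTL function.
    function MakeUniqueFileName(const FullName: string): string;
    var
      Base, Ext: string;
      Counter: Integer;
    begin
      Result := FullName;
      Ext := ExtractFileExt(FullName);       // extension including the dot
      Base := ChangeFileExt(FullName, '');   // path and name without extension
      Counter := 1;
      // Try name_1.ext, name_2.ext, ... until no such file exists.
      while FileExists(Result) do
      begin
        Result := Base + '_' + IntToStr(Counter) + Ext;
        Inc(Counter);
      end;
    end;
    ```

    If the file does not exist yet, the name is returned unchanged.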

  • 2020-12-23 23:31
    // For all platforms (Windows/Unix); requires System.IOUtils in the uses clause.
    function ReplaceInvalidFileNameChars(const aFileName: string; const aReplaceWith: Char = '_'): string;
    var
      i: integer;
    begin
      Result := aFileName;
      // Low/High on a string give the index of its first and last character.
      for i := Low(Result) to High(Result) do
        if not TPath.IsValidFileNameChar(Result[i]) then
          Result[i] := aReplaceWith;
    end;
    