What characters should be restricted from a Unix file name?

闹比i 2020-12-02 14:56

Consider a Save As dialog with a free text entry where the user enters a file name as free text, then clicks a Save button. The software t

7 answers
  • 2020-12-02 15:42

    Do not forget that you can add a dot (.) at the beginning to hide files and folders... Otherwise, I'd follow a *NIX name convention (from Wikipedia):

    Most UNIX file systems

    • Case handling: case-sensitive, case-preserving.
    • Allowed character set: any.
    • Reserved characters: / and the null byte.
    • Max length: 255.
    • Notes: A leading . means that ls and file managers will not show the file by default.

    Link to Wikipedia article about file names
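
    As a minimal sketch of the rules listed above (the function name is hypothetical, and the check assumes names are stored as UTF-8, so the 255 limit is counted in bytes rather than characters):

```python
def is_valid_unix_name(name: str, max_bytes: int = 255) -> bool:
    """Return True if `name` can be a single path component on most Unix file systems."""
    # Assumption: names are encoded as UTF-8, and the length limit applies to the bytes.
    encoded = name.encode("utf-8", "surrogateescape")
    if not encoded or len(encoded) > max_bytes:
        return False
    # The only reserved characters: the path separator and the null byte.
    if "/" in name or "\0" in name:
        return False
    # Not in the list above, but "." and ".." already name the directory and its parent.
    return name not in (".", "..")
```

    Everything else, including a leading dot, spaces, colons and question marks, passes this check.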

  • 2020-12-02 15:42

    Encode FTW

    As Bombe points out in their answer, restricting user input is frustrating at best and downright annoying at worst. Still, as developers we should assume that every interaction with our code is malicious and treat it as such.

    To solve both problems in a practical application, rather than whitelisting or blacklisting certain characters, we should simply not use the user input as the file name.

    Instead, use a safe name (hex chars [a-f0-9] only for ultimate safety) of our own devising, either encoded from the user input (e.g. PHP's bin2hex), or a generated unique ID (e.g. PHP's uniqid, which is time-based rather than truly random) which is then mapped by some method (take your pick) to the user input.
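
    A rough Python sketch of the generated-ID variant, for illustration only: the NameRegistry class is hypothetical, and the in-memory dict stands in for whatever mapping method you pick (a database table, a sidecar file, and so on).

```python
import secrets


class NameRegistry:
    """Maps generated on-disk IDs to the names users asked for (in memory only)."""

    def __init__(self):
        self._display_names = {}

    def register(self, display_name):
        """Return a fresh hex ID to use as the real file name."""
        file_id = secrets.token_hex(16)  # 32 characters, all from [0-9a-f]
        self._display_names[file_id] = display_name
        return file_id

    def display_name(self, file_id):
        """Look up the name the user originally typed."""
        return self._display_names[file_id]
```

    The user-supplied string never touches the file system here; only the generated ID does.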

    Encoding/decoding can be done on the fly with no reliance on a mapping, so it is practically ideal. The user never needs to know what the file is really called; as long as they can get/set the file, and it appears to be called what they wanted, everyone's a winner.
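
    The on-the-fly encoding could look roughly like this in Python, a stand-in for the PHP bin2hex idea. The function names are hypothetical and UTF-8 is an assumed encoding:

```python
def encode_name(user_name: str) -> str:
    """Turn any user-supplied name into a string of [0-9a-f] characters."""
    return user_name.encode("utf-8").hex()


def decode_name(stored_name: str) -> str:
    """Recover the name the user typed from the on-disk hex name."""
    return bytes.fromhex(stored_name).decode("utf-8")


print(encode_name("a/b"))               # 612f62
print(decode_name(encode_name("a/b")))  # a/b
```

    One trade-off to keep in mind: hex doubles the length, so with a 255-character limit on the stored name the user's input can be at most 127 bytes.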

    By this methodology, the user can call their file whatever they like, hackers will be the only people frustrated, and your file system will love you :-)

  • 2020-12-02 15:43

    Firstly, what you're describing is blacklisting. The better option is to whitelist characters, as it is easier (from a user perspective) to have allowed characters added later than to have them taken away.

    In terms of what would be good in a Unix environment:

    • a-z
    • A-Z
    • 0-9
    • underscore (_)
    • dash (-)
    • period (.)

    These should cover the basics. Spaces can be okay, but they make things difficult. Windows users love them; Unix/Linux users don't. So choose according to your target audience.
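
    A small Python sketch of that whitelist, assuming names must be non-empty (the pattern and function names are made up for illustration):

```python
import re

# Letters, digits, underscore, dash and period only; at least one character.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_whitelisted(name: str) -> bool:
    """Return True if every character of `name` is on the whitelist."""
    return ALLOWED_NAME.fullmatch(name) is not None
```

    Note that this still accepts "." and "..", so those need a separate check if they matter to you.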

  • 2020-12-02 15:51

    The minimum is slash ('/') and NUL ('\0').

  • 2020-12-02 15:54

    Let users enter whatever name they want. Artificially restricting the range of characters will only annoy them and serve no real purpose.

  • 2020-12-02 16:00

    Often forgotten: the colon (:) is not a good idea, since it's commonly used as the separator in variables like $PATH, i.e. the list of directories where executables are found "automatically". It can also cause confusion with DOS/Windows path names, where of course the colon is used in drive names.
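
    To illustrate the $PATH problem, here is a short Python sketch. The directory names are invented, and the helper simply mimics splitting a PATH-style value on the colon, which as far as I know has no escape mechanism:

```python
def split_search_path(value, sep=":"):
    """Split a PATH-style string on its separator; a colon cannot be escaped."""
    return value.split(sep)


# Hypothetical directories; the second one contains a colon in its name.
dirs = ["/usr/bin", "/opt/my:tools/bin"]
joined = ":".join(dirs)

print(split_search_path(joined))  # ['/usr/bin', '/opt/my', 'tools/bin']
```

    Two directories go in and three come out, none of which is the one with the colon in its name.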
