How to check the content of an uploaded file without relying on its extension?

天命终不由人 2021-01-06 07:08

How do you go about reliably verifying the type of an uploaded file without using the extension? I'm guessing that you have to examine the header or read some of the bytes.

5 answers
  • 2021-01-06 07:45

    That is indeed what the Unix file program does, with greater or lesser degrees of reliability. In part, it depends on whether the programs whose files you are trying to detect emit a file header; the program tar is notorious for not doing so. It also depends on how many file types you plan to recognize, but it may well be simplest to use an implementation of file: it recognizes many file types, and modern versions are extensible via a file of extra type definitions.

  • 2021-01-06 07:45

    The first few bytes of a file will often tell you the file type. See, for example,
    http://www.garykessler.net/library/file_sigs.html
    http://www.astro.keele.ac.uk/oldusers/rno/Computing/File_magic.html

    Use System.IO to read the bytes as binary after the upload.
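
    As a rough sketch of that idea, the class below reads the first few bytes of a stream and compares them against a short table of well-known signatures. The class name FileSignature, the method names, and the returned labels are made up for illustration, and the table is deliberately tiny; a real list (see the links above) is much longer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of any library.
public static class FileSignature
{
    // Illustrative subset of magic numbers; real signature lists are far longer.
    private static readonly KeyValuePair<string, byte[]>[] Known =
    {
        new KeyValuePair<string, byte[]>("jpeg", new byte[] { 0xFF, 0xD8, 0xFF }),
        new KeyValuePair<string, byte[]>("png",  new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }),
        new KeyValuePair<string, byte[]>("gif",  new byte[] { 0x47, 0x49, 0x46, 0x38 }), // "GIF8"
        new KeyValuePair<string, byte[]>("pdf",  new byte[] { 0x25, 0x50, 0x44, 0x46 }), // "%PDF"
        new KeyValuePair<string, byte[]>("zip",  new byte[] { 0x50, 0x4B, 0x03, 0x04 }), // "PK" 03 04
    };

    // Reads up to 8 bytes from the stream and tries to identify them.
    public static string Detect(Stream stream)
    {
        byte[] header = new byte[8];
        int read = 0;
        // Stream.Read may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends.
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0) break;
            read += n;
        }
        return Detect(header, read);
    }

    // Returns the label of the first matching signature, or null if none match.
    public static string Detect(byte[] header, int length)
    {
        foreach (KeyValuePair<string, byte[]> entry in Known)
        {
            if (StartsWith(header, length, entry.Value)) return entry.Key;
        }
        return null;
    }

    private static bool StartsWith(byte[] data, int length, byte[] magic)
    {
        if (length < magic.Length) return false;
        for (int i = 0; i < magic.Length; i++)
        {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }
}
```

    In an upload handler you would pass the uploaded file's input stream to FileSignature.Detect and reject the file when the result is null or not one of the types you accept. A matching signature only shows that the file starts like that type, not that the rest of it is well-formed.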

    I'm curious, though, why you can't rely on the ContentType header?

  • 2021-01-06 07:51

    Wotsit is a good resource for finding out the magic numbers for various file types.

  • Here's a quick-and-dirty response to the followup question you posted:

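    // b holds the leading bytes of the uploaded file (read elsewhere).
    // FF D8 FF E0 is the JFIF flavour of JPEG; Exif files start FF D8 FF E1.
    byte[] jpg = new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 };
    // A file shorter than the signature cannot match.
    bool match = b.Length >= jpg.Length;
    for (int i = 0; match && i < jpg.Length; i++)
    {
        if (jpg[i] != b[i])
        {
            match = false;
        }
    }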
    
  • 2021-01-06 08:04

    Reading the contents of the file is the foolproof way. Since you are building it in .NET, you could probably check the MIME type of the uploaded file.

    You can DllImport urlmon.dll to help. See the post at: http://coding-passion.blogspot.com/2008/11/validating-file-type.html
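
    For reference, a minimal sketch of what that P/Invoke usually looks like, assuming the FindMimeFromData export of urlmon.dll is the function meant here (the linked post is not reproduced, so this is not necessarily its code). The class name MimeSniffer and method name FromBytes are made up. It only works on Windows, so the sketch returns null elsewhere, and the MIME strings it reports follow Internet Explorer conventions and can differ from the standard names.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper; Windows only.
public static class MimeSniffer
{
    // Managed declaration of FindMimeFromData from urlmon.dll.
    [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    private static extern int FindMimeFromData(
        IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        byte[] pBuffer,
        int cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        int dwMimeFlags,
        out IntPtr ppwzMimeOut,
        int dwReserved);

    // Returns the sniffed MIME type, or null if it cannot be determined
    // (including when not running on Windows).
    public static string FromBytes(byte[] data)
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return null;

        // Only the start of the file matters for sniffing; 256 bytes is the
        // amount usually passed.
        int size = Math.Min(data.Length, 256);
        IntPtr mimePtr;
        int hr = FindMimeFromData(IntPtr.Zero, null, data, size, null, 0, out mimePtr, 0);
        if (hr != 0 || mimePtr == IntPtr.Zero) return null;

        try
        {
            return Marshal.PtrToStringUni(mimePtr);
        }
        finally
        {
            // Assumption: the returned string is released with CoTaskMemFree,
            // which is the commonly used pattern.
            Marshal.FreeCoTaskMem(mimePtr);
        }
    }
}
```

    Pass it the first bytes of the upload (not the file name) and compare the result against the MIME types you are willing to accept.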

    And to clarify regarding Content-Type: it is invariably driven by the extension of the file. So even if a .zip file has its extension renamed to .txt, the content type will still say plain text.
