How do you reliably verify the type of an uploaded file without relying on its extension? I'm guessing that you have to examine the header or read some of the bytes.
That is indeed what the Unix file program does, with greater or lesser degrees of reliability. In part, it depends on whether the programs whose files you are trying to detect emit a file header; tar is notorious for not doing so. It also depends on how many file types you plan to recognize, but it might well be simplest to use an implementation of file; it recognizes many file types, and modern versions are extensible via a file of extra type definitions that can handle a multitude of scenarios.
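To illustrate why header checks vary by format: archives written in the POSIX ustar format do carry a marker, the ASCII string ustar, but it sits at byte offset 257 rather than at the start of the file, and older tar variants have no marker at all. Here is a minimal sketch, assuming the first 512 bytes of the upload are already in memory. The TarSniffer class and LooksLikeUstar method are names made up for this example.

```csharp
using System;

public static class TarSniffer
{
    // In the POSIX ustar header the magic field starts at byte offset 257.
    private const int MagicOffset = 257;
    private static readonly byte[] Magic = { 0x75, 0x73, 0x74, 0x61, 0x72 }; // "ustar"

    // Returns true when the block carries the ustar marker.
    // A false result does not prove the file is not a tar archive:
    // pre-POSIX tar formats have no magic value at all.
    public static bool LooksLikeUstar(byte[] block)
    {
        if (block == null || block.Length < MagicOffset + Magic.Length)
        {
            return false;
        }
        for (int i = 0; i < Magic.Length; i++)
        {
            if (block[MagicOffset + i] != Magic[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

The point is only that a signature is not always in the first few bytes, which is one reason a ready-made implementation of file can be less work than a hand-rolled table.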
The first few bytes of a file will often tell you the file type. See, for example,
http://www.garykessler.net/library/file_sigs.html
http://www.astro.keele.ac.uk/oldusers/rno/Computing/File_magic.html
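A lookup against a small table of leading bytes could be sketched roughly like this. The FileSignatures class, the Detect method and the short labels are hypothetical names for illustration, and the table lists only a handful of well-known signatures; the pages linked above list many more.

```csharp
using System;
using System.Collections.Generic;

public static class FileSignatures
{
    // A few well-known leading byte sequences; real lists are far longer.
    private static readonly KeyValuePair<string, byte[]>[] Known =
    {
        new KeyValuePair<string, byte[]>("png", new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }),
        new KeyValuePair<string, byte[]>("jpeg", new byte[] { 0xFF, 0xD8, 0xFF }),
        new KeyValuePair<string, byte[]>("gif", new byte[] { 0x47, 0x49, 0x46, 0x38 }), // "GIF8"
        new KeyValuePair<string, byte[]>("pdf", new byte[] { 0x25, 0x50, 0x44, 0x46 }), // "%PDF"
        new KeyValuePair<string, byte[]>("zip", new byte[] { 0x50, 0x4B, 0x03, 0x04 }), // "PK", 0x03, 0x04
    };

    // Returns a short label for the first matching signature, or null if none match.
    public static string Detect(byte[] header)
    {
        if (header == null) return null;
        foreach (var entry in Known)
        {
            if (StartsWith(header, entry.Value)) return entry.Key;
        }
        return null;
    }

    // True when data begins with every byte of prefix, in order.
    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Keep in mind that a match only says what the file starts with. Several formats are ZIP containers underneath (.docx and .jar, for instance), so a "zip" result may need a closer look.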
Use System.IO to read the bytes as binary after the upload.
I'm curious, though, why you can't rely on the ContentType header?
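Reading the leading bytes with System.IO, as suggested above, might look something like the following. This is a sketch only: HeaderReader and ReadHeader are made-up names, and where the stream comes from is an assumption about your setup.

```csharp
using System;
using System.IO;

public static class HeaderReader
{
    // Reads up to maxBytes from the current position of the stream and
    // returns exactly the bytes that were available (fewer if the stream is shorter).
    public static byte[] ReadHeader(Stream input, int maxBytes)
    {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        // Stream.Read may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends (Read returns 0).
        while (total < maxBytes)
        {
            int read = input.Read(buffer, total, maxBytes - total);
            if (read == 0) break;
            total += read;
        }
        byte[] header = new byte[total];
        Array.Copy(buffer, header, total);
        return header;
    }
}
```

In classic ASP.NET the stream would typically be the posted file's InputStream. If you go on to save the file from the same stream, remember that its position has moved past the bytes you read.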
Wotsit is a good resource for finding out the magic numbers for various file types.
Here's a quick-and-dirty response to the follow-up question you posted:
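    // b is assumed to hold the first bytes read from the uploaded file.
    // FF D8 FF E0 is the JFIF flavour of JPEG; Exif files start FF D8 FF E1 instead.
    byte[] jpg = new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 };
    // A buffer shorter than the signature cannot match (and would overrun below).
    bool match = b.Length >= jpg.Length;
    for (int i = 0; match && i < jpg.Length; i++)
    {
        if (jpg[i] != b[i])
        {
            match = false;
        }
    }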
Reading the contents of the file is the foolproof way. Since you are building it in .NET, you could probably check the MIME type of the uploaded file.
You can DllImport urlmon.dll to help. Please refer to the post at: http://coding-passion.blogspot.com/2008/11/validating-file-type.html
And to clarify regarding Content-Type: it is invariably driven by the extension of the file. So even if a .zip file has its extension renamed to .txt, the content type will still say text.