Question
What are the security considerations when a server fetches a file from an untrusted domain?
What are the security considerations when resizing an image that you don't trust with PHP's GD2 library?
The file will be stored on the server machine, and will be offered for download. I know I can't trust the MIME-Type header. Is there anything else I should be aware of?
I have a webservice that looks like this:
input
An http-URL (or a String that is expected to be a URL)
output
A meta description of the file, or an error if there was one.
The meta description has one of two forms:
- It's an image + a URL to the image on my domain + a thumbnail of the image (generated on and hosted by my server)
- It's not an image + a URL to the file on my domain
update
Concerns that I can come up with:
The remote server is a malicious server that will send tiny bits of information, enough to keep the socket open, but doesn't do anything useful - like slowloris. I don't know how real of a threat this is. I suppose it could be easily avoided with timeout + progress check.
The remote server serves something that looks like an image (headers, mime-type) but causes PHP to crash when I load it with GD2.
The server sends a useless or bad MIME-type header, like text/plain for binary files.
The remote server serves an image with a virus in it. I assume that resizing the image will get rid of the virus, but I will serve the original image if there is no reason to scale.
The remote server serves a file with a virus in it. The file will not be treated as an image so my server will do nothing with it. Nothing will happen until the user downloads, and runs it.
Also, I assume I can trust the users of my service. This is a private application in a situation where users can be held accountable for bad behavior. I assume they won't intentionally try to break it.
Answer 1:
What are the security considerations when a server fetches a file from an untrusted domain?
Neither the domain (host) nor the file can be trusted. This breaks down into two areas:
- Transport
- Data
To transport the data safely, use a timeout and a size limit. Modern HTTP client libraries offer both. If the file cannot be fetched in time, drop the connection. If the file is too large, discard the data. Tell the user that there was a problem getting the file. Alternatively, let the user handle the transport to that server: use the user's browser and JavaScript to obtain the file, then POST it to your server. Enforce a POST size limit on your side.
As long as the data is untrusted, you need to handle it with caution. That means implementing a process that runs several security checks on the file before you mark it as "safe".
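A minimal sketch of the timeout and size limit using PHP's cURL extension. The function name `fetchWithLimits` and the concrete limits (5 MB, 15 seconds, a 1 KB/s floor) are assumptions chosen for illustration, not values from the question.

```php
<?php
/**
 * Download $url into $destPath with a time budget and a size cap.
 * Returns the number of bytes received; throws on any violation.
 * Hypothetical helper: name, defaults and limits are illustrative.
 */
function fetchWithLimits(string $url, string $destPath, int $maxBytes = 5000000, int $timeoutSec = 15): int
{
    // Accept only http(s) input, as the service description requires.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (!in_array($scheme, ['http', 'https'], true)) {
        throw new InvalidArgumentException('Only http(s) URLs are accepted');
    }

    $out = fopen($destPath, 'wb');
    if ($out === false) {
        throw new RuntimeException('Cannot open destination file');
    }

    $received = 0;
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_PROTOCOLS       => CURLPROTO_HTTP | CURLPROTO_HTTPS,
        CURLOPT_FOLLOWLOCATION  => false,       // do not follow redirects blindly
        CURLOPT_CONNECTTIMEOUT  => 5,           // seconds to establish the connection
        CURLOPT_TIMEOUT         => $timeoutSec, // hard cap on the whole transfer
        CURLOPT_LOW_SPEED_LIMIT => 1024,        // abort if slower than 1 KB/s ...
        CURLOPT_LOW_SPEED_TIME  => 5,           // ... for 5 consecutive seconds
        CURLOPT_WRITEFUNCTION   => function ($ch, string $chunk) use (&$received, $out, $maxBytes): int {
            $received += strlen($chunk);
            if ($received > $maxBytes) {
                return -1; // a return value != strlen($chunk) makes cURL abort
            }
            return (int) fwrite($out, $chunk);
        },
    ]);

    $ok     = curl_exec($ch);
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    fclose($out);

    if ($ok === false || $status !== 200) {
        unlink($destPath); // never keep a partial or rejected download
        throw new RuntimeException('Download failed: ' . ($error !== '' ? $error : 'HTTP ' . $status));
    }
    return $received;
}
```

The low-speed options are one way to do the "progress check" against a server that trickles bytes; the overall timeout still applies if the server stays just above that floor.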
What are the security considerations when resizing an image that you don't trust with PHP's GD2 library?
Then do not pass untrusted data to the image library. See the step above: bring it into a safe state first.
The file will be stored on the server machine, and will be offered for download. I know I can't trust the MIME-Type header. Is there anything else I should be aware of?
I think you're still at the point above: how to get from untrusted to safe. Sure, you can't trust the Content-Type header, but it's still worth understanding what it tells you.
You want to protect against the Unrestricted File Upload vulnerability (OWASP).
- Check the filename. If you store the data on your server, give it a safe temporary name that can not be guessed upfront and that is not accessible via the web.
- Check the data associated with the filename, e.g. the URL information of the source of that file. Properly handle encoding.
- Drop anything that does not meet your expectations, so check the pre-conditions you formulate strictly.
- Validate the file data before you continue, for example by using a virus checker.
- Validate the image data before you continue. This includes the file headers (magic numbers) as well as checking that the file size and file content are valid. You should use a library that specializes in the job, e.g. an image file format malformation checker. This is specialized software, so if it is part of your business, get into that business properly. Plenty of free image-file code exists; I mention this only for information, since you can't trust any recommendation anyway and need to get into the topic yourself.
- If you plan to resize the image yourself, you need to make everything doubly safe, because in addition to hosting the data you plan to process it. So first work out exactly what you do with the data, in order to locate potential problem areas.
- Do logging and monitoring.
- Have a plan for the case that everything goes wrong.
- Consider repeating the process for already existing files, so that if you change your procedure, you can automatically apply the new checks to uploads made in the past as well.
- Create a separate system for each type of work, one that can be cleaned up after the work has been done: one system to do the download, one system to obtain the metadata, etc. After each action, restore the system from an image. If a single component fails, it won't be left behind in an exploited state. Additionally, if you detect a failure, you can take your whole system out of service until you have found the flaw.
All this depends a bit on how much you want to do, but I think you get the idea. Create a process that works for you, knowing where improvements can be added, but first create an infrastructure that is modular enough to deal with error cases and that encapsulates the process enough to deal with any outcome.
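The filename and image-validation items in the list above might be sketched as follows. The helper names `randomStorageName` and `inspectImage`, the set of accepted formats, and the 25-megapixel cap are all assumptions for illustration. `getimagesize()` only reads the file header, so this is a cheap pre-check before any decoding, not a full malformation check.

```php
<?php
/**
 * Build an unguessable storage name; the caller decides the (trusted) extension.
 * Hypothetical helper, not part of any library.
 */
function randomStorageName(string $ext): string
{
    return bin2hex(random_bytes(16)) . '.' . $ext;
}

/**
 * Check magic number and declared dimensions without decoding pixel data.
 * Returns width, height and a trusted extension, or null if the file is rejected.
 */
function inspectImage(string $path, int $maxPixels = 25000000): ?array
{
    // Assumed allowlist of formats the service is willing to resize.
    $allowed = [
        IMAGETYPE_JPEG => 'jpg',
        IMAGETYPE_PNG  => 'png',
        IMAGETYPE_GIF  => 'gif',
    ];

    $info = @getimagesize($path); // false when the header is not a known image
    if ($info === false) {
        return null;
    }

    [$width, $height, $type] = $info;
    if (!isset($allowed[$type])) {
        return null;
    }
    // Reject absurd dimensions before the image library allocates memory for them.
    if ($width < 1 || $height < 1 || $width * $height > $maxPixels) {
        return null;
    }

    return ['width' => $width, 'height' => $height, 'ext' => $allowed[$type]];
}
```

The extension comes from the detected type rather than from the remote URL, so a file named `photo.php` that really is a PNG would be stored with a `.png` suffix.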
You could delegate critical parts to a system that you don't need to care about, e.g. to separate processing from hosting. Additionally, when you host the images the webserver must not be clever. The more stupid a system is, the less exploitable it is (normally).
If hosting is not part of your business, why not hand it over to Amazon S3 or a similar store? Your domain can be preserved via DNS settings.
Keep the libraries you use to verify images up to date (which implies you know which libraries are used and their versions; e.g. the PHP exif extension makes use of mbstring, and so on: track the whole dependency tree down). Make sure you're in a position to report flaws to the library maintainers in a useful way, e.g. with logging, stored upload data to reproduce problems, etc.
Learn which image exploits have existed in the past and which systems/components/libraries (example, see disclaimer there) were affected.
Also read up on the common ways to exploit something, to get the basics together (I'm sure you are aware, however it's always good to re-read some stuff):
- Secure file upload in PHP web applications (Alla Bezroutchko; June 13, 2007; PDF)
Some related questions, assorted:
- Is it important to verify that the uploaded file is an actual image file?
- PHP Upload file enhance security
Answer 2:
What you're describing basically comes down to an input validation problem; you don't trust what your application is reading in as input and processing.
To address this, download the resource in question and then attempt to determine its true file type. There are multiple ways to attempt this, but basically you will want to use either custom code or a library to parse the file and look for the telltale signs of a certain type. There is a good SO discussion on how to do this in PHP here - How can I determine a file's true extension/type programatically? - I would check the second answer, which lists some PHP-specific functions for this. When your application receives a file, it should perform true file typing like this and then compare the result to the MIME type specified by the remote server; if they match, accept the file, and if they do not, drop it.
I would also suggest using a whitelist of allowable filetypes (a list of everything your service will support and then ONLY accept files of those types). If you have a very general-purpose service, then you should at least do a blacklist of disallowed filetypes (a list of everything your service absolutely will not support and drop those immediately based on the outcome of your MIME type compares). Again, the use of these is entirely dependent on your use-cases.
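A rough sketch of the type check and whitelist described above, assuming PHP's fileinfo extension is available. The function names and the default allowlist are hypothetical; adjust the list to whatever your service actually supports.

```php
<?php
/** Reduce a Content-Type header value to a bare, lowercase MIME type. */
function normalizeMime(string $headerValue): string
{
    // "Image/PNG; charset=binary" becomes "image/png"
    return strtolower(trim(explode(';', $headerValue)[0]));
}

/** Detect the type from the file's content; null if detection fails. */
function sniffMime(string $path): ?string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $type  = $finfo->file($path);
    return $type === false ? null : $type;
}

/**
 * Accept a file only if its detected type is on the allowlist AND
 * agrees with what the remote server declared.
 * The default allowlist is an assumption for illustration.
 */
function acceptFile(
    string $path,
    string $declaredHeader,
    array $allowed = ['image/jpeg', 'image/png', 'image/gif', 'application/pdf']
): bool {
    $detected = sniffMime($path);
    return $detected !== null
        && in_array($detected, $allowed, true)
        && $detected === normalizeMime($declaredHeader);
}
```

Strict equality means a server that declares a non-standard alias such as `image/jpg` would be rejected; whether to map such aliases before comparing depends on how tolerant the service needs to be.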
Once you've got a type, the concern becomes whether what the remote server has sent you is a bad file that targets your server (contains malicious code, a buffer overflow designed to make the GD2 library blow up and run arbitrary code, etc). Basically, you are relying on the GD2 library to not contain bugs that would lead to such a successful exploit. There's not much you can do here, short of running a security audit on the library yourself, and I'm going to assume that's out of scope. Basically, keep up on any reported security bugs with the library and patch as soon as you can; as a consumer of the library, you are really relying on the maintainers to find and remedy security vulnerabilities like this.
Next, the concern is that the remote server has sent you a bad file that targets your users/clients (contains malicious code, buffer overflows, viruses, etc). Here, if there is corrupted data that is really malware in the image, it will most likely either (1) break or exploit GD2 when it is read (see above for that scenario) or (2) be eliminated when the resize operation is performed by the library, if GD2 can successfully process it. There is still a chance it will remain despite the processing, but there's not much you can do there either. If you're really concerned about this, you can apply a virus scan using an external product designed for that; if you do, I would suggest scanning both (1) after the download and before GD2 processing and (2) on the manipulated file before you serve it out. Personally, I don't think you get much by doing this, but if you want to provide an additional check / warm fuzzies to your users, it cannot hurt.
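To illustrate the resize step, here is a small sketch that decodes the file with GD and writes a freshly encoded PNG thumbnail. The function name `makeThumbnail` and the 200-pixel default are assumptions. Because the output is encoded from the decoded pixels only, extra bytes attached to the original file should not be carried over, but this does nothing for the case where the original is served unscaled.

```php
<?php
/**
 * Decode an image with GD and write a re-encoded PNG thumbnail.
 * Returns false if GD cannot decode the input or cannot write the output.
 * Hypothetical helper; requires the GD extension.
 */
function makeThumbnail(string $srcPath, string $destPath, int $maxSide = 200): bool
{
    $data = file_get_contents($srcPath);
    if ($data === false || $data === '') {
        return false;
    }

    $src = @imagecreatefromstring($data); // false when the data is not a decodable image
    if ($src === false) {
        return false;
    }

    $width  = imagesx($src);
    $height = imagesy($src);

    // Shrink so the longer side fits $maxSide; never enlarge.
    $scale = min(1.0, $maxSide / max($width, $height));
    $newW  = max(1, (int) round($width * $scale));
    $newH  = max(1, (int) round($height * $scale));

    $thumb = imagescale($src, $newW, $newH);
    if ($thumb === false) {
        return false;
    }

    // Only pixel data is written; the original bytes are not copied.
    return imagepng($thumb, $destPath);
}
```

Running a header check and a dimension cap before this call keeps obviously bogus input away from the decoder in the first place.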
To address the slow-feeding of data to keep a connection open, put a timeout on any connection to deal with this problem; unless you are dealing with a specific threat to your use-case here, I do not think this is a huge concern.
Answer 3:
1) My primary concern with blindly fetching a file from an untrusted domain would be how to verify that the file is, in fact, what you expected to get. Could the untrusted server trick your script into downloading a harmful file (like a virus), or possibly a script that would allow a backdoor into your system?
2) I haven't read any security issues with resizing an image with the GD2 library. If it's not an image to begin with, the GD2 functions would throw an error. I don't think you have much to worry about with this part.
3) I (personally) would not ever do this without reviewing every single file that my script downloaded first. If you want to partially automate this, you might consider running magic number tests on all the files as a pre-filter. But a human look is the safest way to serve random files. When you finish this project - before you make it live - try to break / trick / hack it as hard as you can. Get some knowledgeable friends involved to help.
Answer 4:
When it is not an image, you store the file anyway, regardless of what kind of file it is? So they can upload a PHP file and browse to it to execute PHP code on your server?
Source: https://stackoverflow.com/questions/8606951/fetching-a-file-on-a-server-resizing-with-php-gd2-security-considerations
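One common way to avoid that outcome is to keep stored files outside the web root and hand them out through a download script that always forces a download. A minimal sketch of the headers such a script could send; the function name `downloadHeaders` is hypothetical, and the script that reads the file and calls `header()` for each line is left out.

```php
<?php
/**
 * Build response headers that make the browser download the file
 * instead of rendering or executing it.
 * Hypothetical helper for illustration.
 */
function downloadHeaders(string $originalName, int $size): array
{
    // Strip any path and replace everything outside a conservative character set.
    $safe = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($originalName));
    if ($safe === null || $safe === '') {
        $safe = 'download';
    }

    return [
        'Content-Type: application/octet-stream',            // never echo the remote MIME type
        'Content-Disposition: attachment; filename="' . $safe . '"',
        'X-Content-Type-Options: nosniff',                    // ask browsers not to guess a type
        'Content-Length: ' . $size,
    ];
}
```

Since the web server never maps a URL directly onto the stored file, a fetched `.php` file is only ever sent as bytes and is not run by the PHP handler.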