Given these two images from twitter.
http://a3.twimg.com/profile_images/130500759/lowres_profilepic.jpg
http://a1.twimg.com/profile_images/58079916/lowres_pr
The git content management system is based on SHA1 because it has very minimal chance for collision.
If it good for git it will be good to you so.
It appears that the numerical part of twimg.com URLs are already a unique value for each image. My research indicates that the number is sequential (i.e. the example url below is for the 433,484,366th profile image ever uploaded - which just happens to be mine). Thus, this number is unique. My solution would be to simply use the numerical part of the filename as the "hash value", with no fear of ever finding a non-unique value.
I already use this system for a Python script that displays notifications for new tweets, and as part of its operation it caches profile image thumbnails to reduce unneccessary downloads.
P.S. It makes no difference what subdomain the image is downloaded from, all images are available from all subdomains.
The nature of a hash is that it may result in collisions. How about one of these alternatives:
A very simple approach:
f( "http://a3.twimg.com/profile_images/130500759/" ) = a3_130500759.jpg
f( "http://a1.twimg.com/profile_images/58079916/" ) = a1_58079916.jpg
As the other parts of this URL are constant, you can use the subdomain, the last part of the query path as a unique filename.
Don't know what could be a problem with this solution
It seems what you really want is to have a legal filename that won't collide with others.
filename = base64(url)
I'm playing with thumbalizr using a modified version of their caching script, and it has a few good solutions I think. The code is on github.com/mptre/thumbalizr but the short version is that is uses md5 to build the file names, and it takes the first two characters from the filename and uses it to create a folder which is named the exact same thing. This means that it is easy to break the folders up, and fast to find the corresponding folder without a database. Kind of blew my mind with it's simplicity.
It generates file names like this http://pappmaskin.no/opensource/delicious_snapcasa/mptre-thumbalizr/cache/fc/fcc3a328e0f4c1b51bf5e13747614e7a_1280_1024_8_90_250.png
the last part, _1280_1024_8_90_250, matches the different settings that the script uses when talking to the thumbalizr api, but I guess fcc3a328e0f4c1b51bf5e13747614e7a is a straight md5 of the url, in this case for thumbalizr.com
I tried changing the config to generate images 200px wide, and that images goes in the same folder, but instead of _250.png it is called _200.png
I haven't had time to dig that much in the code, but I'm sure it could be pulled apart from the thumbalizr logic and made more generic.