Question
Is it possible to add a key to S3 with a UTF-8 encoded name like "åøæ.jpg"?
I'm getting the following error when uploading with boto:
<Error><Code>InvalidURI</Code><Message>Couldn't parse the specified URI.</Message>
Answer 1:
@2083: This is a fairly old question, but in case you haven't found a solution yet, and for everyone else who lands here looking for an answer like I did:
From the official documentation (http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html):
Although you can use any UTF-8 characters in an object key name, the following key naming best practices help ensure maximum compatibility with other applications. Each application may parse special characters differently. The following guidelines help you maximize compliance with DNS, web safe characters, XML parsers, and other APIs.
Safe Characters
The following character sets are generally safe for use in key names:
Alphanumeric characters [0-9a-zA-Z]
Special characters !, -, _, ., *, ', (, and )
The following are examples of valid object key names:
4my-organization
my.great_photos-2014/jan/myvacation.jpg
videos/2014/birthday/video1.wmv
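As a rough illustration of the guideline quoted above, a key can be checked against the safe character set before it is uploaded. This is only a minimal sketch: the class name `SafeKeyCheck` and its method are made up for illustration and are not part of any AWS SDK, and `/` is allowed here only on the assumption that it is acceptable as a delimiter, since the documentation's own example keys use it.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the AWS SDK. */
final class SafeKeyCheck {
    // Alphanumerics plus the special characters ! - _ . * ' ( ) from the
    // quoted guideline. '/' is an added assumption, because the example
    // keys in the documentation use it as a delimiter.
    private static final Pattern SAFE = Pattern.compile("[0-9a-zA-Z!\\-_.*'()/]+");

    private SafeKeyCheck() {}

    /** Returns true if the key is non-empty and uses only "safe" characters. */
    static boolean isSafe(String key) {
        return SAFE.matcher(key).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSafe("my.great_photos-2014/jan/myvacation.jpg")); // true
        System.out.println(isSafe("\u00e5\u00f8\u00e6.jpg"));                   // false ("åøæ.jpg")
    }
}
```

A key that fails this check is not necessarily rejected by S3; the check only flags names that fall outside the "generally safe" set.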
However, if what you really want, like me, is a filename that allows UTF-8 characters (note that this can be different from the key name), there is a way to do it.
Following http://www.bennadel.com/blog/2591-embedding-foreign-characters-in-your-content-disposition-filename-header.htm and http://www.bennadel.com/blog/2696-overriding-content-type-and-content-disposition-headers-in-amazon-s3-pre-signed-urls.htm (kudos to Ben Nadel), you can do that by making sure that, when the file is downloaded, S3 returns a Content-Disposition header carrying the filename you want.
I did it in Java, so here is that code; I'm sure you'll be able to translate it to Python easily :)
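    // Imports for the AWS SDK for Java (v1) and the JDK classes used below.
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.ObjectMetadata;
    import com.amazonaws.services.s3.model.PutObjectRequest;
    import com.amazonaws.services.s3.model.StorageClass;
    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;

    // S3Controller and LOG are helpers from my own project.
    AmazonS3 s3 = S3Controller.getS3Client();
    // Name shown to the user: everything from the first "-" on
    // (the whole name if there is no "-", instead of throwing).
    String displayName = fileName.substring(Math.max(fileName.indexOf("-"), 0));
    // Key restricted to safe characters,
    // as per http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html
    String key = displayName.replaceAll("[^a-zA-Z0-9._]", "");
    PutObjectRequest putObjectRequest = new PutObjectRequest(
            S3Controller.bucketNameForBucket(S3Controller.Bucket.EXPORT_BUCKET),
            key,
            file);
    // We can always regenerate these files, so reduced redundancy storage
    // would also be an option; this code uses the standard storage class.
    putObjectRequest.setStorageClass(StorageClass.Standard);
    String urlEncodedUTF8Filename = key;
    try {
        //http://www.bennadel.com/blog/2696-overriding-content-type-and-content-disposition-headers-in-amazon-s3-pre-signed-urls.htm
        //http://www.bennadel.com/blog/2591-embedding-foreign-characters-in-your-content-disposition-filename-header.htm
        //Issue#179
        // URLEncoder writes spaces as "+", but filename* expects "%20".
        urlEncodedUTF8Filename = URLEncoder.encode(displayName, "UTF-8").replace("+", "%20");
    } catch (UnsupportedEncodingException e) {
        LOG.warn("Could not URLEncode a filename. Original Filename: " + fileName, e);
    }
    // ASCII fallback in filename, UTF-8 name in filename*.
    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentDisposition("attachment; filename=\"" + key + "\"; filename*=UTF-8''" + urlEncodedUTF8Filename);
    putObjectRequest.setMetadata(metadata);
    s3.putObject(putObjectRequest);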
It should help :)
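For readers who only need the header value, the Content-Disposition construction from the code above can be pulled into a small stand-alone helper that uses nothing but the JDK. This is a sketch under assumptions: the class and method names are made up, the ASCII fallback is assumed to be sanitized already (no quotes or backslashes), and the extra escaping of `+` and `*` follows the RFC 5987 `filename*` syntax rather than anything in the original code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for illustration; uses only JDK classes. */
final class ContentDispositionSketch {
    private ContentDispositionSketch() {}

    /** Percent-encodes a name for use in a filename* parameter. */
    static String encodeExtValue(String name) {
        try {
            // URLEncoder is a form encoder: it writes spaces as "+" and leaves
            // "*" alone, so both are patched up afterwards. A literal "+" in
            // the input has already become "%2B" at this point.
            return URLEncoder.encode(name, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds an attachment header with an ASCII fallback name and a UTF-8 name.
     * asciiFallback is assumed to contain no quotes or backslashes.
     */
    static String attachment(String asciiFallback, String utf8Name) {
        return "attachment; filename=\"" + asciiFallback + "\"; filename*=UTF-8''"
                + encodeExtValue(utf8Name);
    }

    public static void main(String[] args) {
        // "\u00e5\u00f8\u00e6" is "åøæ", written with escapes so the source stays ASCII.
        System.out.println(attachment("report.jpg", "\u00e5\u00f8\u00e6 report.jpg"));
        // attachment; filename="report.jpg"; filename*=UTF-8''%C3%A5%C3%B8%C3%A6%20report.jpg
    }
}
```

Clients that understand `filename*` should show the UTF-8 name, while older ones fall back to the plain `filename` value.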
Answer 2:
From AWS FAQ: A key is a sequence of Unicode characters whose UTF-8 encoding is at most 1024 bytes long.
In my experience, it is safest to stick to ASCII.
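The 1024-byte limit from the FAQ applies to the UTF-8 encoding, not to the number of characters, so non-ASCII names use it up faster. A minimal sketch of the check, with a made-up class name and constant (nothing here comes from the AWS SDK):

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of the AWS SDK. */
final class KeyLengthCheck {
    // Limit taken from the FAQ sentence quoted above.
    static final int MAX_KEY_BYTES = 1024;

    private KeyLengthCheck() {}

    /** Number of bytes the key occupies once encoded as UTF-8. */
    static int utf8Length(String key) {
        return key.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Returns true if the key's UTF-8 encoding is within the limit. */
    static boolean fits(String key) {
        return utf8Length(key) <= MAX_KEY_BYTES;
    }

    public static void main(String[] args) {
        // "åøæ.jpg" written with escapes: 7 characters, but each of the
        // three letters takes 2 bytes in UTF-8.
        String key = "\u00e5\u00f8\u00e6.jpg";
        System.out.println(key.length());    // 7
        System.out.println(utf8Length(key)); // 10
        System.out.println(fits(key));       // true
    }
}
```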
Source: https://stackoverflow.com/questions/21074800/utf-8-filename-in-s3-bucket