Question
Versioning of Amazon S3 buckets is nice, but I don't see any easy way to compare versions of a file - either through the console or through any other app I found.
S3Browser seems to have the best versioning support, but no comparison.
Is there a way to compare versions of a file on S3 without downloading both versions and comparing them manually?
--
EDIT: I just started thinking that some basic automation should not be too hard, see snippet below. Question remains though: is there any tool that supports this properly? This script may be fine for me, but not for non-dev users.
#!/bin/bash
# s3-compare-last-versions.sh
# Requires the AWS CLI and the `json` command-line tool.
if [[ $# -ne 2 ]]; then
    echo "Usage: $(basename "$0") <bucketName> <fileKey>"
    exit 1
fi
bucketName=$1
fileKey=$2
# List the two most recent versions once and reuse the response.
versions=$(aws s3api list-object-versions --bucket "$bucketName" --prefix "$fileKey" --max-items 2)
latestVersionId=$(json 'Versions[0].VersionId' <<< "$versions")
previousVersionId=$(json 'Versions[1].VersionId' <<< "$versions")
# Download both versions, then diff them.
aws s3api get-object --bucket "$bucketName" --key "$fileKey" --version-id "$latestVersionId" "$latestVersionId.js"
aws s3api get-object --bucket "$bucketName" --key "$fileKey" --version-id "$previousVersionId" "$previousVersionId.js"
diff "$latestVersionId.js" "$previousVersionId.js"
Answer 1:
You can't view file contents at all via S3, so you definitely can't compare the contents of files via S3. You would have to download the different versions and then use a tool like diff to compare them.
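As a rough sketch of that download-then-diff approach, the function below fetches two specific versions into a temporary directory and diffs them. The function name diff_s3_versions and its argument order are made up for illustration, and it assumes the AWS CLI is installed and configured.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from any tool): diff two versions of one S3 object.
# Usage: diff_s3_versions <bucket> <key> <olderVersionId> <newerVersionId>
# Returns 0 if the versions are identical, 1 if they differ, 2 on error.
diff_s3_versions() {
  if [ "$#" -ne 4 ]; then
    echo "Usage: diff_s3_versions <bucket> <key> <olderVersionId> <newerVersionId>" >&2
    return 2
  fi
  local bucket="$1" key="$2" older="$3" newer="$4"
  local tmpdir status
  tmpdir=$(mktemp -d) || return 2

  # get-object prints the object's metadata as JSON on stdout; discard it
  # so that only the diff itself is shown.
  if aws s3api get-object --bucket "$bucket" --key "$key" \
       --version-id "$older" "$tmpdir/older" > /dev/null &&
     aws s3api get-object --bucket "$bucket" --key "$key" \
       --version-id "$newer" "$tmpdir/newer" > /dev/null; then
    status=0
    diff -u "$tmpdir/older" "$tmpdir/newer" || status=$?
  else
    status=2
  fi

  # Clean up the downloaded copies whatever the outcome.
  rm -rf "$tmpdir"
  return "$status"
}
```

After sourcing the file, a call would look like diff_s3_versions my-bucket folder/filename.extension <olderVersionId> <newerVersionId>, with the version IDs taken from aws s3api list-object-versions.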
Answer 2:
I wrote a bash script that downloads the last two versions of an object and compares them using colordiff. I stumbled across this question after writing it and thought I would share it here in case anyone wants to use it.
#!/bin/bash
#This script needs awscli, jq and colordiff. Please install them for your environment
#This script also needs the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION.
#Please set them using the export command as follows or set them using envrc
#export AWS_ACCESS_KEY_ID=<Your AWS Access Key ID>
#export AWS_SECRET_ACCESS_KEY=<Your AWS Secret Access Key>
#export AWS_DEFAULT_REGION=<Your AWS Default Region>
set -e
if [ -z "$1" ] || [ -z "$2" ]; then
echo "Usage:"
echo "version_compare.sh *bucket_name* *file_name*"
echo
echo "Example"
echo "version_compare.sh bucket_name folder/filename.extension"
echo
exit 1;
fi
aws_bucket=$1
file_key=$2
echo Getting the last 2 versions of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api list-object-versions --bucket ${aws_bucket} --prefix ${file_key} --max-items 2
EOF
echo
versions=$(aws s3api list-object-versions --bucket "${aws_bucket}" --prefix "${file_key}" --max-items 2)
version_1=$( jq -r '.["Versions"][0]["VersionId"]' <<< "${versions}" )
version_2=$( jq -r '.["Versions"][1]["VersionId"]' <<< "${versions}" )
mkdir -p state_comparison_files
echo Getting the latest version ${version_1} of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_1} state_comparison_files/${version_1}
EOF
aws s3api get-object --bucket "${aws_bucket}" --key "${file_key}" --version-id "${version_1}" "state_comparison_files/${version_1}" > /dev/null
echo
echo Getting older version ${version_2} of the file at ${file_key}..
echo
echo Executing:
cat << EOF
aws s3api get-object --bucket ${aws_bucket} --key ${file_key} --version-id ${version_2} state_comparison_files/${version_2}
EOF
aws s3api get-object --bucket "${aws_bucket}" --key "${file_key}" --version-id "${version_2}" "state_comparison_files/${version_2}" > /dev/null
echo
echo Comparing the different versions.
echo If no differences are found, nothing will be shown
colordiff --unified "state_comparison_files/${version_2}" "state_comparison_files/${version_1}"
Here's the link to it
https://gist.github.com/mohamednajiullah/3edc88d314291be40f2dd3cf13ea0d7f
Note: It's pretty much the same as the script the question asker created, except that it uses jq for JSON parsing and colordiff to show the differences in color, like git diff does.
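One caveat that applies to both scripts: --prefix matches every key that starts with the given string, so the listing can pick up versions of other keys that share the prefix (for example app.js.map when asking for app.js). A possible refinement, sketched below and not part of the gist, filters on the exact key with the CLI's --query option, which takes a JMESPath expression. The function name list_version_ids is made up for illustration, and the sketch assumes the key contains no single quote.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gist): print the version IDs of exactly
# one key, one per line, in the order S3 returns them (most recent first).
# Usage: list_version_ids <bucket> <key>
list_version_ids() {
  local bucket="$1" key="$2"
  # --prefix narrows the listing on the server side; the JMESPath filter then
  # keeps only the entries whose Key equals the requested key exactly.
  # With --output text the IDs come back tab-separated, so split them
  # into one ID per line.
  aws s3api list-object-versions \
    --bucket "$bucket" --prefix "$key" \
    --query "Versions[?Key=='${key}'].VersionId" \
    --output text | tr '\t' '\n'
}
```

Piping the result through head -n 2 gives the latest and the previous version ID. This sketch drops --max-items, so every version under the prefix is listed; if nothing matches, the CLI may print None or nothing at all, so check the result before passing it to get-object.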
I'm also building an Electron-based desktop app to do exactly this. It's still in development, but it can already be used. Contributions are welcome:
https://github.com/mohamednajiullah/s3_object_version_comparator
Answer 3:
You can use MegaSparkDiff, an open source tool that compares multiple types of data sources, including S3:
https://github.com/FINRAOS/MegaSparkDiff
The code below returns a pair of DataFrames, inLeftButNotInRight and inRightButNotInLeft, which you can save as files or examine in code.
SparkFactory.initializeSparkContext();
AppleTable leftAppleTable = SparkFactory.parallelizeTextSource("S3://file1","table1");
AppleTable rightAppleTable = SparkFactory.parallelizeTextSource("S3://file2","table2");
Pair<Dataset<Row>, Dataset<Row>> resultPair = SparkCompare.compareAppleTables(leftAppleTable, rightAppleTable);
resultPair.getLeft().show(100);
SparkFactory.stopSparkContext();
Source: https://stackoverflow.com/questions/40138780/how-to-compare-versions-of-an-amazon-s3-object