How to list files inside tar in AWS S3 without downloading it?

旧巷少年郎 2021-01-19 04:01

While looking around for ideas I found https://stackoverflow.com/a/54222447/264822 for zip files which I think is a very clever solution. But it relies on zip files having a

1 answer
  • 2021-01-19 04:19

    My mistake. I'm actually dealing with tar.gz files, and I assumed that zip and tar.gz are similar. They're not: a tar.gz is a tar archive that is then compressed as a whole with gzip, so to read the tar you have to decompress it first. My idea of pulling bits out of the tar file won't work.

    What does work is:

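    import io
    import tarfile
    import boto3

    s3client = boto3.client("s3")
    # bucket_name and file_name are assumed to be set by the caller.
    s3_object = s3client.get_object(Bucket=bucket_name, Key=file_name)
    # Note: this reads the entire object into memory.
    wholefile = s3_object["Body"].read()
    fileobj = io.BytesIO(wholefile)
    # The default mode detects gzip compression automatically.
    with tarfile.open(fileobj=fileobj) as tarf:
        for name in tarf.getnames():
            print(name)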
    

    I suspect the original code will work for an uncompressed tar file, but I don't have any to try it on.
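
The snippet above holds the whole object in memory before opening it. As a possible variation, here is a minimal sketch that reads the archive sequentially using `tarfile`'s stream mode (`"r|*"`), which only needs a file-like object with a `read()` method. The helper names `list_tar_names` and `make_sample_tar_gz` are hypothetical and not part of the original answer; the sample archive is built in memory only so the sketch runs without S3.

```python
import io
import tarfile


def list_tar_names(stream):
    """Return member names from a tar archive read sequentially from `stream`.

    `stream` is any file-like object with a read() method (hypothetical helper).
    """
    # "r|*" opens the archive as a non-seekable stream and detects
    # compression (gzip, bz2, xz, or none) transparently.
    with tarfile.open(fileobj=stream, mode="r|*") as tarf:
        # Iterating yields one TarInfo header at a time, in archive order.
        return [member.name for member in tarf]


def make_sample_tar_gz():
    """Build a small tar.gz in memory, standing in for the S3 object body."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tarf:
        for name, data in [("a.txt", b"hello"), ("dir/b.txt", b"world")]:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tarf.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name in list_tar_names(make_sample_tar_gz()):
        print(name)
```

With boto3, the `Body` returned by `get_object` is a readable stream, so passing `s3_object['Body']` to `list_tar_names` should work in the same way. This is an assumption I have not tested against a real bucket. The data is still transferred from S3 and decompressed; the only saving is that the whole file is not buffered in memory at once.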
