While looking around for ideas I found https://stackoverflow.com/a/54222447/264822 for zip files which I think is a very clever solution. But it relies on zip files having a
My mistake. I'm actually dealing with tar.gz files, and I assumed that zip and tar.gz are similar. They're not: tar is an archive format, and the resulting archive is then compressed as a whole with gzip, so you have to decompress it before you can read the tar. My idea of pulling bits out of the tar file won't work.
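To illustrate the layering, here is a minimal sketch that builds a small tar.gz in memory and checks that the tar header only becomes visible once the gzip layer has been decompressed. The helper `make_tar_gz` and the file name `hello.txt` are made up for the example.

```python
import gzip
import io
import tarfile


def make_tar_gz(files):
    """Build a tar.gz archive in memory from a {name: bytes} mapping (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


archive = make_tar_gz({"hello.txt": b"hello world\n"})

# The outer layer is gzip: it starts with the gzip magic bytes 1f 8b.
is_gzip = archive[:2] == b"\x1f\x8b"

# Only after decompressing do the tar headers appear. The first 512-byte
# header block starts with the member name and carries "ustar" at offset 257.
raw_tar = gzip.decompress(archive)
first_name = raw_tar[:100].rstrip(b"\0").decode()
has_ustar_magic = raw_tar[257:262] == b"ustar"

print(is_gzip, first_name, has_ustar_magic)
```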
What does work is:
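import io
import tarfile

# s3client, bucket_name and file_name are defined as in the question
s3_object = s3client.get_object(Bucket=bucket_name, Key=file_name)
# Read the whole object into memory; the gzip stream has to be decompressed from the start
wholefile = s3_object['Body'].read()
fileobj = io.BytesIO(wholefile)
# tarfile detects the gzip compression automatically
tarf = tarfile.open(fileobj=fileobj)
names = tarf.getnames()
for name in names:
    print(name)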
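If the archive is large, holding the whole object in memory may be a concern. A possible variation, sketched here on the assumption that the S3 response body behaves like an ordinary file-like object with a `read()` method, is to open the tar in stream mode (`'r|gz'`) so that members are listed as the data is read sequentially. The function name `list_tar_gz_names` is hypothetical, and the example feeds it an in-memory archive rather than a real S3 object.

```python
import io
import tarfile


def list_tar_gz_names(stream):
    """Return member names from a gzip-compressed tar read sequentially.

    `stream` only needs a read() method; seeking is not required in 'r|gz' mode.
    """
    names = []
    with tarfile.open(fileobj=stream, mode="r|gz") as tar:
        for member in tar:
            names.append(member.name)
    return names


# Build a small tar.gz in memory to stand in for the S3 object body.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name, data in [("a.txt", b"first"), ("dir/b.txt", b"second")]:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
buf.seek(0)

print(list_tar_gz_names(buf))
```

With boto3 this would presumably be called as `list_tar_gz_names(s3_object['Body'])`, though I have not tried that here. Note that it still downloads the entire object; it only avoids buffering all of it at once.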
I suspect the original code would work for an uncompressed tar file, but I don't have one to try it on.