tarfile

Tarfile in Python: Can I untar more efficiently by extracting only some of the data?

泪湿孤枕 submitted on 2019-12-17 19:46:11
Question: I am ordering a large batch of Landsat scenes from the USGS, which come as tar.gz archives, and I am writing a simple Python script to unpack them. Each archive contains 15 TIFF images of 60-120 MB each, totalling just over 2 GB. I can easily extract an entire archive with the following code:
    import tarfile
    fileName = "LT50250232011160-SC20140922132408.tar.gz"
    tfile = tarfile.open(fileName, 'r:gz')
    tfile.extractall("newfolder/")
I only actually need 6 of those 15 TIFFs, identified as "bands" in
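A minimal sketch of selective extraction, assuming the wanted images can be recognised by the end of their file names. The function name `extract_matching` and the `wanted_suffixes` parameter are hypothetical, not part of the original question. Because a gzip stream has to be decompressed sequentially, skipping members mainly saves disk writes rather than decompression time.

```python
import os
import tarfile


def extract_matching(archive_path, dest_dir, wanted_suffixes):
    """Extract only regular files whose names end with one of wanted_suffixes."""
    suffixes = tuple(wanted_suffixes)  # str.endswith needs a tuple
    os.makedirs(dest_dir, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as tar:
        # Iterating the TarFile walks the archive once, front to back.
        for member in tar:
            if member.isfile() and member.name.endswith(suffixes):
                tar.extract(member, dest_dir)
                extracted.append(member.name)
    return extracted
```

A call might look like `extract_matching(fileName, "newfolder/", ["_B1.tif", "_B2.tif"])`, where the suffixes are placeholders for whatever naming the archives actually use.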

Open .tar.gz archives in Python

Deadly submitted on 2019-12-13 01:54:27
Question: I've got an issue with this pair of lines, written in Python:
    # File name: spacchetta_N_zip.py
    import tarfile
    import gzip
    with tarfile.TarFile.open(i, 'r:gz') as tarredgzippedFile:
        tarredgzippedFile.extractall(cartella)
I'm sure I'm opening a .tar.gz archive. I really don't know what's wrong; I've searched the web and this should work, but I got this error:
    File "spacchetta_N_zip.py", line 70, in Spacchetta
        with tarfile.TarFile.open(i, 'r:gz') as tarredgzippedFile:
    File "/usr/lib/python2.7

How does one add a string to a tarfile in Python 3?

巧了我就是萌 submitted on 2019-12-12 11:24:52
Question: I have a problem adding a str to a tar archive in Python. In Python 2 I used this method:
    fname = "archive_name"
    params_src = "some arbitrary string to be added to the archive"
    params_sio = io.StringIO(params_src)
    archive = tarfile.open(fname+".tgz", "w:gz")
    tarinfo = tarfile.TarInfo(name="params")
    tarinfo.size = len(params_src)
    archive.addfile(tarinfo, params_sio)
It is essentially the same as what can be found here. It worked well. However, after moving to Python 3 it broke and results with the
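In Python 3, `TarFile.addfile()` reads bytes from the file object, so one way to adapt the snippet above is to encode the string first and wrap it in `io.BytesIO`. This is a sketch, not necessarily the fix the original thread settled on; the helper name `add_text_member` and the UTF-8 default are assumptions.

```python
import io
import tarfile


def add_text_member(archive_path, member_name, text, encoding="utf-8"):
    """Write a new .tgz archive holding `text` as a single member."""
    payload = text.encode(encoding)
    info = tarfile.TarInfo(name=member_name)
    # The size must be the byte length, which can differ from len(text).
    info.size = len(payload)
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.addfile(info, io.BytesIO(payload))
```

Setting `size` from the encoded payload matters for non-ASCII text, where the character count and the byte count differ.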

Different name of unpacked tar.gz folder using tarfile.extractall()

流过昼夜 submitted on 2019-12-11 12:33:40
Question: I am using Python's tarfile.extractall() to unpack a foo.tar.gz file. I want to access the extracted folder, but sometimes the extracted folder has a different name than the packed file. I need a way to control the name of the extracted folder, or a return value that tells me its name. Example: packed file: foo-rc-2.0.0.tar.gz, unpacked folder: foo-2.0.0-rc
Answer 1: tarfile.extractall() simply extracts all the files in the tarball to the current directory, or a directory of
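Since the folder name comes from the member names stored inside the archive rather than from the archive's file name, one option is to read those names before extracting. A minimal sketch, with the hypothetical helper `top_level_names`:

```python
import tarfile


def top_level_names(archive_path):
    """Return the distinct first path components of all members, sorted."""
    roots = set()
    with tarfile.open(archive_path) as tar:
        for name in tar.getnames():
            # Drop empty and "." components so "./foo/bar" counts as "foo".
            parts = [p for p in name.split("/") if p not in ("", ".")]
            if parts:
                roots.add(parts[0])
    return sorted(roots)
```

If the result has exactly one entry, that entry is the folder `extractall()` will create under the destination directory.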

Extracting file from tarfile with only basename using Python

这一生的挚爱 submitted on 2019-12-11 08:10:34
Question: I have a tar file which contains files with complete paths such as '/home/usr/path/to/file'. When I extract a file to the current folder, it recreates the complete path recursively. Is there a way to extract the file with only its base name?
Answer 1: Use TarFile.extractfile() and write it into a file of your choice.
Answer 2: You can change the arcnames by modifying the TarInfo objects you get from TarFile.getmembers(). Then you can use TarFile.extractall to write the members to your chosen destination
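The first answer can be sketched as follows: read each member with `TarFile.extractfile()` and copy it to a path built from its base name only. The helper name `extract_flat` is hypothetical, and the sketch assumes base names are unique within the archive (later members would overwrite earlier ones with the same name).

```python
import os
import shutil
import tarfile


def extract_flat(archive_path, dest_dir):
    """Extract regular files into dest_dir, discarding their stored directories."""
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with tarfile.open(archive_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links and special files
            target = os.path.join(dest_dir, os.path.basename(member.name))
            # Stream the member to disk without loading it fully into memory.
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```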

How to compress a tar file in a tar.gz without directory?

房东的猫 submitted on 2019-12-11 03:28:54
Question: I'm looking for a way to compress a tar file into a tar.gz without a directory. Today my code generates a TAR file without a directory using the "tarfile" library and the arcname argument, but when I want to compress this TAR file into a TAR.GZ I don't understand how to remove the directory. I have run many tests over the last 3 days. My code:
    Tarname = example.tar
    ImageDirectory = C:\...
    TarDirectory = C:\..
    tar = tarfile.open(Tarname, "w")
    tar.add(ImageDirectory,arcname=TarName)
    tar.close()
    targz = tarfile.open(
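One possible reading of the goal is a .tar.gz whose contents are exactly those of the existing .tar. Under that assumption, the .tar can be gzip-compressed directly rather than added as a member of a second archive, so no extra path is introduced. The helper name `gzip_tar` is hypothetical.

```python
import gzip
import shutil


def gzip_tar(tar_path, targz_path):
    """Compress an existing .tar file into a .tar.gz with identical members."""
    with open(tar_path, "rb") as src, gzip.open(targz_path, "wb") as dst:
        # Copy in chunks so large archives are never held in memory at once.
        shutil.copyfileobj(src, dst)
```

Alternatively, opening the first archive with mode "w:gz" instead of "w" writes the compressed form in one step.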

Checking tarfile integrity in Python

ε祈祈猫儿з submitted on 2019-12-08 22:30:58
Question: I'm working on converting my backup script from shell to Python. One of the features of my old script was to check the created tarfile for integrity by running gzip -t. This seems to be a bit tricky in Python. It seems that the only way to do this is by reading each of the compressed TarInfo objects within the tarfile. Is there a way to check a tarfile for integrity without extracting it to disk or keeping it in memory in its entirety? Good people on #python on freenode suggested that I
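A sketch of two chunked checks that neither write to disk nor hold the archive in memory. The first reads the whole gzip stream, which is roughly what `gzip -t` does; the second walks the tar members and reads each one. Both helper names and the 1 MiB chunk size are assumptions, and the tar-level pass alone may stop before the gzip trailer, so it is not a substitute for the first.

```python
import gzip
import tarfile
import zlib

CHUNK = 1024 * 1024  # read 1 MiB at a time


def gzip_stream_ok(path):
    """Decompress the whole gzip stream in chunks, discarding the output."""
    try:
        with gzip.open(path, "rb") as f:
            while f.read(CHUNK):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


def tar_members_readable(path):
    """Read every regular member of the archive in chunks."""
    try:
        with tarfile.open(path, "r:*") as tar:
            for member in tar:
                if not member.isfile():
                    continue
                src = tar.extractfile(member)
                while src.read(CHUNK):
                    pass
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False
    return True
```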

Preserving file permissions when creating a tarball with Python's tarfile

故事扮演 submitted on 2019-12-07 21:57:01
Question: Hello Stack Overflow users, I want to preserve the original file permissions when using Python's tarfile module. I have quite a few executable files that lose their permissions once the tarball is extracted. I'm doing something like this:
    import tarfile
    tar = tarfile.open("mytarball.tar.gz", 'w:gz')
    tar.add('my_folder')  # tar the entire folder
    tar.close()
Then I copy it from Windows to a Linux machine (mapped with Samba) using shutil:
    shutil.copy("mytarball.tar.gz", unix_dir)
Then, to extract the
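When the tarball is built on Windows, the stored mode bits may not reflect Unix executable permissions. One possible workaround, sketched here under that assumption, is to set the mode explicitly through the `filter` argument of `TarFile.add()`. The helper name `make_tarball` and the rule "files ending in .sh are executable" are purely illustrative.

```python
import os
import tarfile


def make_tarball(src_dir, archive_path, executable_suffixes=(".sh",)):
    """Archive src_dir, forcing 0o755 on directories and chosen files."""
    suffixes = tuple(executable_suffixes)

    def set_mode(info):
        # Hypothetical rule: decide executability from the file name.
        if info.isdir() or info.name.endswith(suffixes):
            info.mode = 0o755
        elif info.isfile():
            info.mode = 0o644
        return info

    arcname = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=arcname, filter=set_mode)
```

Whether the modes survive extraction also depends on how the archive is unpacked on the Linux side.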

How to determine if data is valid tar file without a file?

▼魔方 西西 submitted on 2019-12-07 03:25:19
Question: My upload form expects a tar file and I want to check whether the uploaded data is valid. The tarfile module supports is_tarfile(), but it expects a filename, and I don't want to waste resources writing the file to disk just to check whether it is valid. Is there a way to check that the data is a valid tar file without writing to disk, using standard Python libraries?
Answer 1: Say your uploaded data is contained in the string data.
    from tarfile import TarFile, TarError
    from StringIO import StringIO
    sio = StringIO
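The answer above targets Python 2. A Python 3 sketch of the same idea wraps the uploaded bytes in `io.BytesIO` and passes it as `fileobj` to `tarfile.open()`. The helper name `is_valid_tar_bytes` is hypothetical, and "valid" here only means that every member header can be parsed.

```python
import io
import tarfile


def is_valid_tar_bytes(data):
    """Return True if `data` parses as a (possibly compressed) tar archive."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
            tar.getmembers()  # forces every header to be read
    except (tarfile.TarError, OSError, EOFError):
        return False
    return True
```

From Python 3.9 onward, `tarfile.is_tarfile()` also accepts a file object, which may be enough for a quick first check.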

Organizing files in tar bz2 file with python

[亡魂溺海] submitted on 2019-12-05 21:31:50
I have about 200,000 text files that are placed in a bz2 file. The problem is that when I scan the bz2 file to extract the data I need, it is extremely slow: it has to look through the entire bz2 file to find the single file I am looking for. Is there any way to speed this up? I also thought about organizing the files in the tar.bz2 so that it knows where to look. Is there any way to organize files that are put into a bz2? More info/edit: I need to query the compressed file for each text file. Is there a better compression method that supports such a large number of
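A tar.bz2 is a single compressed stream, so reaching one member means decompressing everything before it. One alternative, swapped in here and not taken from the original thread, is the ZIP format: it compresses members individually and keeps a central directory, so a single member can be read by name. The helper names below are hypothetical.

```python
import zipfile


def build_zip(archive_path, texts):
    """Write a dict of {member_name: text} into a deflate-compressed ZIP."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, text in texts.items():
            zf.writestr(name, text)  # str data is encoded as UTF-8


def read_member(archive_path, name):
    """Read one member by name using the ZIP central directory."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.read(name).decode("utf-8")
```

Per-member compression usually gives a worse ratio than one bz2 stream over many small, similar text files, so this trades size for lookup speed.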