gzip | 易学教程

Get file names of tarred folder contents in Python

Question: I have a compressed folder called gziptest.tar.gz which contains several plaintext files. I'd like to get the filenames and the corresponding contents of the files, but the usage examples for the gzip library don't cover this. The following code:

    import gzip
    in_f = gzip.open('/home/cholloway/gziptest.tar.gz')
    print in_f.read()

produces the output:

    gzip test/file2000664 001750 001750 00000000016 12621163624 015761 0ustar00chollowaycholloway000000 000000 I like apples gzip test
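The output above is the raw tar stream: gzip.open removes only the gzip layer and leaves the tar headers mixed in with the file data. One possible approach, sketched here with Python's standard tarfile module, is to let tarfile handle both layers and collect each member's name and contents. The helper make_demo_archive and the file names it writes are hypothetical and exist only so the sketch runs on its own.

```python
import io
import tarfile


def read_tar_gz(path):
    """Return {member name: file contents as bytes} for a .tar.gz archive."""
    contents = {}
    with tarfile.open(path, mode="r:gz") as archive:
        for member in archive.getmembers():
            # Skip directories, links and other non-regular entries.
            if member.isfile():
                with archive.extractfile(member) as fh:
                    contents[member.name] = fh.read()
    return contents


def make_demo_archive(path):
    """Write a small archive with hypothetical names so the sketch is runnable."""
    files = [
        ("gzip test/file1", b"I like pears\n"),
        ("gzip test/file2", b"I like apples\n"),
    ]
    with tarfile.open(path, mode="w:gz") as archive:
        for name, data in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
```

Calling read_tar_gz on the asker's archive path should then give a dictionary keyed by names such as "gzip test/file2", assuming the archive contains only regular files and directories.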

How to read “.gz” compressed file using spark DF or DS?

Question: I have a compressed file in .gz format. Is it possible to read the file directly using Spark DF/DS? Details: the file is a tab-delimited CSV.

Answer 1: Reading a compressed CSV is done in the same way as reading an uncompressed CSV file. For Spark version 2.0+ it can be done as follows using Scala (note the extra option for the tab delimiter):

    val df = spark.read.option("sep", "\t").csv("file.csv.gz")

PySpark:

    df = spark.read.csv("file.csv.gz", sep='\t')

The only extra consideration to take into

nginx gzip compression not working

Question: I have no idea where to place my gzip compression lines within my http block, shown here.

    http {
        default_type application/octet-stream;
        include /etc/nginx/mime.types;
        log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for"';
        access_log /var/log/nginx/access.log main;
        keepalive_timeout 65;
        server {
            listen 8080;
            root /usr/share/nginx;
            location / {
                root /usr/share/nginx/html;
                try_files $uri

How to obtain random access of a gzip compressed file

Question: According to this FAQ on zlib.net it is possible to "access data randomly in a compressed stream". I know about the module Bio.bgzf of Biopython 1.60, which "supports reading and writing BGZF files (Blocked GNU Zip Format), a variant of GZIP with efficient random access, most commonly used as part of the BAM file format and in tabix. This uses Python’s zlib library internally, and provides a simple interface like Python’s gzip library." But for my use case I don't want to use that format.

GZipStream from MemoryStream only returns a few hundred bytes

Question: I am trying to download a .gz file of a few hundred MB and turn it into a very long string in C#.

    using (var memstream = new MemoryStream(new WebClient().DownloadData(url)))
    using (GZipStream gs = new GZipStream(memstream, CompressionMode.Decompress))
    using (var outmemstream = new MemoryStream())
    {
        gs.CopyTo(outmemstream);
        string t = Encoding.UTF8.GetString(outmemstream.ToArray());
        Console.WriteLine(t);
    }

My test URL: https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2017-47/segments

Can I POST data with python requests lib with http-gzip or deflate compression?

Question: I use the requests module of Python 2.7 to post a large chunk of data to a service I can't change. Since the data is mostly text, it is large but would compress quite well. The server would accept gzip or deflate encoding; however, I do not know how to instruct requests to do a POST and encode the data correctly automatically. Is there a minimal example available that shows how this is possible?

Answer 1:

    # Works if backend supports gzip
    additional_headers['content-encoding'] = 'gzip'
    request
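As far as I know, requests does not compress request bodies on its own, so one option is to compress the payload manually and set the Content-Encoding header, in the spirit of the answer excerpt above. The sketch below is written for Python 3 and uses only the standard library to build the body and headers; the function name gzip_body is hypothetical, and the commented-out requests.post call assumes the requests package is installed and that url points at a server that accepts gzip-encoded bodies.

```python
import gzip


def gzip_body(payload: bytes):
    """Compress a request body and return it with a matching header dict."""
    body = gzip.compress(payload)
    # The server must be willing to decode gzip request bodies for this to work.
    headers = {"Content-Encoding": "gzip"}
    return body, headers


# Hypothetical usage, assuming requests is installed and url is defined:
#   import requests
#   body, headers = gzip_body(large_text.encode("utf-8"))
#   response = requests.post(url, data=body, headers=headers)
```

Passing the compressed bytes through the data argument sends them unchanged, so the server sees exactly what gzip.compress produced.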

Downloading a csv.gz file from url in Python

Question: I'm having trouble downloading a csv.gz file from a URL, although I have no problem downloading a tar.gz file. For the csv.gz file, I'm able to extract the .gz file and read my CSV file; it would just be handy if I could use a URL instead of having csv-1.0.csv.gz beforehand.

This works:

    import urllib.request
    urllib.request.urlretrieve('http://www.mywebsite.com/csv-1-0.tar.gz','csv-1-0.tar.gz')

This does not work:

    import urllib.request
    urllib.request.urlretrieve('http://www.mywebsite.com/csv-1-0.csv
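The excerpt does not show why the second download fails, so the following is only a sketch of one way to read a gzip-compressed CSV straight from a URL without saving it first. It assumes the server returns the raw .gz bytes and that the CSV is UTF-8 encoded; the function names are hypothetical.

```python
import csv
import gzip
import io
import urllib.request


def rows_from_gzip_bytes(raw, encoding="utf-8"):
    """Decompress gzip bytes and parse the result as CSV rows."""
    text = gzip.decompress(raw).decode(encoding)
    return list(csv.reader(io.StringIO(text)))


def fetch_csv_gz(url):
    """Download a .csv.gz file into memory and return its rows."""
    with urllib.request.urlopen(url) as response:
        return rows_from_gzip_bytes(response.read())
```

fetch_csv_gz holds the whole file in memory, which is fine for small files; for very large downloads a streaming approach would be a better fit.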

How to create an empty tgz file?

Question: How to create an empty tgz file? I tried:

    tar czvf /tmp/empty.tgz --from-file /dev/null
    tar: Option --from-file is not supported

Answer 1: The switch you're looking for is --files-from or -T:

    tar czvf /tmp/empty.tgz --files-from=/dev/null

Source: https://stackoverflow.com/questions/44807644/how-to-create-an-empty-tgz-file
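For readers who would rather stay in Python than call tar, a roughly equivalent result can be sketched with the standard tarfile module. This is an alternative to the tar command in the answer, not part of it, and the function name create_empty_tgz is hypothetical.

```python
import tarfile


def create_empty_tgz(path):
    """Create a gzip-compressed tar archive that contains no members."""
    # Opening in "w:gz" mode and closing immediately writes only the
    # end-of-archive padding, wrapped in a gzip stream.
    with tarfile.open(path, mode="w:gz"):
        pass
```

For example, create_empty_tgz("/tmp/empty.tgz") should produce a file that tar tzf lists as having no entries.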