gzip

Get file names of tarred folder contents in Python

戏子无情 posted on 2020-05-30 06:32:19
Question: I have a compressed folder called gziptest.tar.gz which contains several plaintext files. I'd like to get the filenames and the corresponding contents of the files, but the usage examples for the gzip library don't cover this. The following code:

    import gzip
    in_f = gzip.open('/home/cholloway/gziptest.tar.gz')
    print in_f.read()

produces the output:

    gzip test/file2000664 001750 001750 00000000016 12621163624 015761 0ustar00chollowaycholloway000000 000000 I like apples gzip test
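The output above is the raw tar stream: gzip.open only removes the compression layer and leaves the tar headers mixed in with the file data. A minimal sketch of one way to get names and contents, assuming Python 3 and the standard tarfile module (the helper name read_tar_gz is hypothetical, not from the original post):

```python
import tarfile


def read_tar_gz(path):
    """Return {member name: file bytes} for every regular file in a .tar.gz archive.

    read_tar_gz is an illustrative helper name, not part of any library.
    """
    contents = {}
    # "r:gz" asks tarfile to undo the gzip layer before parsing the tar headers.
    with tarfile.open(path, "r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, symlinks, and other non-file entries
            with archive.extractfile(member) as handle:
                contents[member.name] = handle.read()
    return contents
```

Calling read_tar_gz('/home/cholloway/gziptest.tar.gz') would return a dictionary keyed by member name; the values are bytes, so decode them if text is needed.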

How to read “.gz” compressed file using spark DF or DS?

人走茶凉 posted on 2020-05-29 05:11:16
Question: I have a compressed file in .gz format. Is it possible to read the file directly using a Spark DF/DS? Details: the file is a tab-delimited CSV.

Answer 1: Reading a compressed CSV is done in the same way as reading an uncompressed CSV file. For Spark version 2.0+ it can be done as follows using Scala (note the extra option for the tab delimiter):

    val df = spark.read.option("sep", "\t").csv("file.csv.gz")

PySpark:

    df = spark.read.csv("file.csv.gz", sep='\t')

The only extra consideration to take into

nginx gzip compression not working

半城伤御伤魂 posted on 2020-05-28 07:25:09
Question: I have no idea where to place my gzip compression lines within my http block, shown here.

    http {
        default_type application/octet-stream;
        include /etc/nginx/mime.types;
        log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for"';
        access_log /var/log/nginx/access.log main;
        keepalive_timeout 65;
        server {
            listen 8080;
            root /usr/share/nginx;
            location / {
                root /usr/share/nginx/html;
                try_files $uri

How to obtain random access of a gzip compressed file

不问归期 posted on 2020-05-27 06:25:26
Question: According to this FAQ on zlib.net it is possible to:

    access data randomly in a compressed stream

I know about the module Bio.bgzf of Biopython 1.60, which:

    supports reading and writing BGZF files (Blocked GNU Zip Format), a variant of GZIP with efficient random access, most commonly used as part of the BAM file format and in tabix. This uses Python’s zlib library internally, and provides a simple interface like Python’s gzip library.

But for my use case I don't want to use that format.

GZipStream from MemoryStream only returns a few hundred bytes

大憨熊 posted on 2020-05-26 03:57:52
Question: I am trying to download a .gz file of a few hundred MB and turn it into a very long string in C#.

    using (var memstream = new MemoryStream(new WebClient().DownloadData(url)))
    using (GZipStream gs = new GZipStream(memstream, CompressionMode.Decompress))
    using (var outmemstream = new MemoryStream())
    {
        gs.CopyTo(outmemstream);
        string t = Encoding.UTF8.GetString(outmemstream.ToArray());
        Console.WriteLine(t);
    }

My test URL: https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2017-47/segments

Can I POST data with python requests lib with http-gzip or deflate compression?

懵懂的女人 posted on 2020-05-25 12:03:07
Question: I use the requests module of Python 2.7 to POST a large chunk of data to a service I can't change. Since the data is mostly text, it is large but would compress quite well. The server would accept gzip or deflate encoding; however, I do not know how to instruct requests to do a POST and encode the data correctly automatically. Is there a minimal example available that shows how this is possible?

Answer 1:

    # Works if backend supports gzip
    additional_headers['content-encoding'] = 'gzip'
    request
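The answer excerpt is cut off, so the following is only a sketch of the general idea it points at, not the answerer's code: compress the body yourself and declare the encoding in a header. It assumes Python 3 (for gzip.compress) rather than the Python 2.7 of the question, and gzip_request_parts is a hypothetical helper name.

```python
import gzip


def gzip_request_parts(payload, extra_headers=None):
    """Return (compressed body, headers) for a gzip-encoded POST.

    gzip_request_parts is an illustrative name; payload must be bytes.
    """
    headers = dict(extra_headers or {})
    # Tell the server that the body must be gunzipped before it is parsed.
    headers["Content-Encoding"] = "gzip"
    body = gzip.compress(payload)
    return body, headers
```

The returned pair could then be passed along as requests.post(url, data=body, headers=headers). This only helps if the server really accepts gzip-encoded request bodies, as the question says this one does.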

Downloading a csv.gz file from url in Python

你说的曾经没有我的故事 posted on 2020-05-13 14:02:11
Question: I'm having trouble downloading a csv.gz file from a URL, although I have no problem downloading a tar.gz file. For the csv.gz file, I'm able to extract the .gz file and read my CSV file; it would just be handy if I could use a URL instead of having the csv-1.0.csv.gz beforehand.

This works:

    import urllib.request
    urllib.request.urlretrieve('http://www.mywebsite.com/csv-1-0.tar.gz','csv-1-0.tar.gz')

This does not work:

    import urllib.request
    urllib.request.urlretrieve('http://www.mywebsite.com/csv-1-0.csv
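Since the excerpt ends before any answer, here is one possible approach rather than a confirmed fix: read the response bytes into memory, decompress them, and parse the CSV without saving a file first. It assumes Python 3, a UTF-8 encoded CSV, and a file small enough to hold in memory; parse_csv_gz and fetch_csv_gz are hypothetical names.

```python
import csv
import gzip
import io
import urllib.request


def parse_csv_gz(raw):
    """Decompress gzip bytes and return the CSV rows as lists of strings."""
    text = gzip.decompress(raw).decode("utf-8")  # UTF-8 is an assumption
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))


def fetch_csv_gz(url):
    """Download a .csv.gz from url and return its rows (no temporary file)."""
    with urllib.request.urlopen(url) as response:
        return parse_csv_gz(response.read())
```

Keeping the parsing step separate from the download makes it easy to check against local bytes before pointing fetch_csv_gz at a real URL.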

How to create an empty tgz file?

人走茶凉 posted on 2020-04-13 06:57:28
Question: How to create an empty tgz file? I tried

    tar czvf /tmp/empty.tgz --from-file /dev/null
    tar: Option --from-file is not supported

Answer 1: The switch you're looking for is --files-from or -T:

    tar czvf /tmp/empty.tgz --files-from=/dev/null

Source: https://stackoverflow.com/questions/44807644/how-to-create-an-empty-tgz-file
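For comparison, the same result can be sketched in Python with the standard tarfile module, in keeping with the other Python questions on this page. This is an alternative to the tar command above, not part of the original answer, and create_empty_tgz is a hypothetical name.

```python
import tarfile


def create_empty_tgz(path):
    """Write a valid gzip-compressed tar archive that contains no members."""
    # Opening for writing and closing immediately emits only the
    # end-of-archive marker, wrapped in a gzip stream.
    with tarfile.open(path, "w:gz"):
        pass
```

Opening the result with tarfile.open(path, "r:gz").getnames() should give an empty list.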