R Reading in a zip data file without unzipping it

青春惊慌失措 2020-12-04 15:22

I have a very large zip file and I am trying to read it into R without unzipping it, like so:

temp <- tempfile("Sales", fileext=c("zip"))
data <- re


        
7 Answers
  • 2020-12-04 15:36

    The gzfile function, combined with read_csv or read.table, can read gzip-compressed files.

    library(readr)
    df = read_csv(gzfile("file.csv.gz"))
    
    # read.table is part of base R (utils), so no extra package is needed
    df = read.table(gzfile("file.csv.gz"), header = TRUE, sep = ",")
    

    read_csv from the readr package can also read compressed files directly, without the gzfile function.

    library(readr)  
    df = read_csv("file.csv.gz")
    

    read_csv is recommended because it is generally faster than read.table on large files.
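
    As a minimal, self-contained sketch of the gzfile approach using only base R, the following writes a small gzip-compressed CSV and reads it back through a connection. The temporary file and the sample data are made up for illustration.

    ```r
    # Hypothetical demo file; any gzip-compressed CSV would do.
    path <- tempfile("demo", fileext = ".csv.gz")

    # Write a small data frame through a gzip connection.
    con <- gzfile(path, open = "wt")
    write.csv(data.frame(id = 1:3, amount = c(10.5, 20, 7.25)), con, row.names = FALSE)
    close(con)

    # Read it back; read.csv opens the connection and closes it when done,
    # so no decompressed copy is written to disk.
    df <- read.csv(gzfile(path))
    print(df)
    ```

    Note that this covers gzip files only; a .zip archive needs unz() or unzip() instead.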

  • 2020-12-04 15:38

    You are missing a dot in the file extension in this expression:

    temp <- tempfile("Sales", fileext=c("zip"))
    

    It should be:

    temp <- tempfile("Sales", fileext=c(".zip"))
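
    To see the difference, you can compare the two calls. This is only an illustrative check; the variable names bad and good are made up here.

    ```r
    # fileext is appended verbatim, so the dot must be part of it.
    bad  <- tempfile("Sales", fileext = "zip")
    good <- tempfile("Sales", fileext = ".zip")

    print(basename(bad))           # random name ending in "zip", with no dot
    print(basename(good))          # random name ending in ".zip"

    # tools::file_ext() only recognizes an extension that follows a dot.
    print(tools::file_ext(bad))
    print(tools::file_ext(good))
    ```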
    
  • 2020-12-04 15:48

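    This should work just fine if the archive contains a file named Sales.csv. Note that unzip extracts that file into the working directory and returns its path, which read_csv then reads.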

    data <- readr::read_csv(unzip("Sales.zip", "Sales.csv"))
    

    To list the file names in the archive without extracting anything, this works:

    unzip("Sales.zip", list = TRUE)
    
  • 2020-12-04 15:49

    If you have zcat installed on your system (which is the case for Linux, macOS, and Cygwin), you could also use:

    zipfile <- "test.zip"
    # shQuote protects file names that contain spaces or shell metacharacters
    myData <- read.delim(pipe(paste("zcat", shQuote(zipfile))))
    

    This solution also has the advantage that no temporary files are created.

  • The reading functions of the readr package also support compressed files when the file suffix indicates the compression type: files ending in .gz, .bz2, .xz, or .zip are automatically uncompressed.

    require(readr)
    myData <- read_csv("foo.txt.gz")
    
  • 2020-12-04 15:57

    If your zip file is called Sales.zip and contains only a file called Sales.dat, I think you can simply do the following (assuming the file is in your working directory):

    data <- read.table(unz("Sales.zip", "Sales.dat"), nrows=10, header=TRUE, quote="\"", sep=",")
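
    If you do not know the member name in advance, the unz approach can be combined with unzip(..., list = TRUE). The following is a rough sketch, not a definitive implementation: read_zip_member is a hypothetical helper, and the demo part assumes an external zip program is on the PATH (utils::zip needs one), so it is skipped otherwise.

    ```r
    # Hypothetical helper: read one member of a zip archive as CSV
    # through a connection, without extracting it to disk.
    read_zip_member <- function(zip_path, member = NULL, ...) {
      contents <- unzip(zip_path, list = TRUE)          # lists members, extracts nothing
      if (is.null(member)) member <- contents$Name[1]   # default: first member
      if (!member %in% contents$Name) stop("No such member in archive: ", member)
      read.csv(unz(zip_path, member), ...)              # read.csv opens and closes the connection
    }

    # Demo with made-up data; needs an external `zip` command to build the archive.
    demo <- NULL
    if (nzchar(Sys.which("zip"))) {
      work_dir <- tempfile("zipdemo")
      dir.create(work_dir)
      dat_path <- file.path(work_dir, "Sales.dat")
      zip_path <- file.path(work_dir, "Sales.zip")
      write.csv(data.frame(item = c("a", "b", "c"), units = c(3, 5, 8)),
                dat_path, row.names = FALSE)
      zip(zip_path, dat_path, flags = "-jq")            # -j: drop directory paths, -q: quiet
      demo <- read_zip_member(zip_path, "Sales.dat", nrows = 2)
      print(demo)
    }
    ```

    Extra arguments such as nrows are passed on to read.csv, so you can preview the first few rows of a large file before reading all of it.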
    