Using R to download gzipped data file, extract, and import data

旧时难觅i 2020-12-05 08:57

A follow-up to this question: How can I download and uncompress a gzipped file using R? For example (from the UCI Machine Learning Repository), I have a file of insurance d

3 answers
  • 2020-12-05 09:17

    Please read the content of help(download.file) for that. If the file in question is merely gzipped but otherwise readable, you can also pass the complete URL to read.table() et al.
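
To illustrate the point about gzipped but otherwise readable files, here is a minimal sketch. It uses a small local file as a stand-in for a downloaded one (the data and the name `gz_path` are made up for illustration), so it runs without network access. Whether a remote URL is decompressed the same way may depend on your R version, so treat that part of the advice as something to verify.

```r
# Write a small gzipped table to a temporary file
# (hypothetical stand-in for a downloaded .gz file)
gz_path <- tempfile(fileext = ".gz")
con <- gzfile(gz_path, "w")
write.table(data.frame(id = 1:3, value = c(10.5, 20.25, 30)), con,
            row.names = FALSE, quote = FALSE)
close(con)

# read.table() opens the path via file(), which detects gzip compression itself,
# so no explicit decompression step is needed
dat <- read.table(gz_path, header = TRUE)
print(dat)
```

Note that this only applies to a single gzipped text file; a .tar.gz archive still has to be unpacked with untar() first.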

  • 2020-12-05 09:25

    Here is a quick way to do it.

    # create download directory and set it
    .exdir = '~/Desktop/tmp'
    dir.create(.exdir)
    .file = file.path(.exdir, 'tic.tar.gz')
    
    # download file
    url = 'http://archive.ics.uci.edu/ml/databases/tic/tic.tar.gz'
    download.file(url, .file)
    
    # untar it
    untar(.file, compressed = 'gzip', exdir = path.expand(.exdir))
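
It can help to inspect an archive before unpacking it and to read a member afterwards. The sketch below is not part of the answer above: it builds a tiny .tar.gz locally (the names `tic_demo_src`, `tic_demo.tar.gz`, and `demo_data.txt` are hypothetical) so that it runs offline, then lists and extracts it with untar(). The `tar = "internal"` argument is used here only to avoid depending on a system tar program.

```r
# Build a small .tar.gz locally so the sketch runs without network access
src_dir <- file.path(tempdir(), "tic_demo_src")
dir.create(src_dir, showWarnings = FALSE)
writeLines(c("1 2 3", "4 5 6"), file.path(src_dir, "demo_data.txt"))

archive <- file.path(tempdir(), "tic_demo.tar.gz")
old_wd <- setwd(src_dir)   # so the archive stores a relative path
tar(archive, files = "demo_data.txt", compression = "gzip", tar = "internal")
setwd(old_wd)

# List the archive members without extracting anything
members <- untar(archive, list = TRUE, tar = "internal")
print(members)

# Extract into a separate directory and read one member
out_dir <- file.path(tempdir(), "tic_demo_out")
dir.create(out_dir, showWarnings = FALSE)
untar(archive, exdir = out_dir, tar = "internal")
demo <- read.table(file.path(out_dir, members[1]))
print(dim(demo))
```

With the real archive, the same untar(..., list = TRUE) call on the downloaded file shows what it contains before anything is written to disk.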
    
  • 2020-12-05 09:30

    I like Ramnath's approach, but I would use temp files like so:

    tmpdir <- tempdir()
    
    url <- 'http://archive.ics.uci.edu/ml/databases/tic/tic.tar.gz'
    file <- basename(url)
    download.file(url, file)
    
    untar(file, compressed = 'gzip', exdir = tmpdir )
    list.files(tmpdir)
    

    The list.files() should produce something like this:

    [1] "TicDataDescr.txt" "dictionary.txt"   "ticdata2000.txt"  "ticeval2000.txt"  "tictgts2000.txt" 
    

    which you could parse if you needed to automate this process for a lot of files.
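
One way such automation might look is sketched below. It is an illustration only: the directory `tic_parse_demo`, the two small files, and the `2000\\.txt$` pattern are assumptions, chosen to mimic the `...2000.txt` names in the listing above while skipping description files that are not tabular.

```r
# Hypothetical stand-in for the extraction directory: two small text tables
data_dir <- file.path(tempdir(), "tic_parse_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("1 2", "3 4"), file.path(data_dir, "a2000.txt"))
writeLines(c("5 6", "7 8", "9 10"), file.path(data_dir, "b2000.txt"))

# Select files by name pattern, then read each into a list named after its file
paths <- list.files(data_dir, pattern = "2000\\.txt$", full.names = TRUE)
tables <- lapply(paths, read.table)
names(tables) <- basename(paths)

# Row counts per file, as a quick sanity check
print(sapply(tables, nrow))
```

Passing `full.names = TRUE` keeps the directory in each path, so read.table() works regardless of the current working directory.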
