how to download a large binary file with RCurl *after* server authentication

南笙 2020-12-17 19:49

i originally asked this question about performing this task with the httr package, but i don't think it's possible using httr. so i've re-writt

2 answers
  • 2020-12-17 20:12

    this is now possible with the httr package. thanks hadley!

    https://github.com/hadley/httr/issues/44
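
A minimal sketch of what the httr route might look like, assuming the feature referred to is `write_disk()`, which streams a response body to a file instead of holding it in memory. The helper name `download_after_login` and its arguments are hypothetical; the login URL and form fields in the usage comment are taken from the question's RCurl script. httr reuses one handle per host, so cookies set by the login response should be sent with the later request to the same host.

```r
library(httr)

# Hypothetical helper (not part of httr): log in with a form POST,
# then stream a large file straight to disk.
download_after_login <- function(login.url, login.fields, file.url, destfile) {
    # Submit the login form; the session cookies stay on the handle
    # that httr keeps for this host
    login <- POST(login.url, body = login.fields, encode = "form")
    stop_for_status(login)

    # write_disk() writes the body to destfile as it arrives
    resp <- GET(file.url, write_disk(destfile, overwrite = TRUE))
    stop_for_status(resp)

    invisible(destfile)
}

# Usage (not run here, needs real credentials):
# download_after_login(
#     login.url    = "https://usa.ipums.org/usa-action/users/validate_login",
#     login.fields = list(
#         "login[email]"        = your.email,
#         "login[password]"     = your.password,
#         "login[is_for_login]" = 1
#     ),
#     file.url     = extract.path,
#     destfile     = tempfile(fileext = ".csv.gz")
# )
```

Whether the site needs an intermediate request between login and download, as the RCurl script in the other answer makes, is not something this sketch checks.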

  • 2020-12-17 20:14
    1. Create a file named curl_writer.c containing the code below (adapted from the linked source) and save it to C:\<folder where you save your R files>

      #include <stdio.h>
      
      /**
       * Original code just sent some message to stderr
       */
      size_t writer(void *buffer, size_t size, size_t nmemb, void *stream) {
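          /* fwrite returns the number of items written; return bytes so
             libcurl aborts the transfer on a short write */
          return fwrite(buffer, size, nmemb, (FILE *)stream) * size;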
      }
      
    2. Open a command window, go to the folder where you saved curl_writer.c, and compile it into a DLL with R CMD SHLIB

      c:> cd "C:\<folder where you save your R files>"
      c:> R CMD SHLIB -o curl_writer.dll curl_writer.c
      
    3. Open R and run your script

      C:> R
      
      your.email <- "email@address.com"
      your.password <- "password"
      extract.path <- "https://usa.ipums.org/usa-action/downloads/extract_files/some_file.csv.gz"
      
      library(RCurl)
      
      curl = getCurlHandle()
      
      curlSetOpt(
          cookiejar = 'cookies.txt', 
          followlocation = TRUE, 
          autoreferer = TRUE, 
          ssl.verifypeer = FALSE,
          curl = curl
      )
      
      params <- 
          list(
              "login[email]" = your.email , 
              "login[password]" = your.password , 
              "login[is_for_login]" = 1
          )
      
      html <- postForm("https://usa.ipums.org/usa-action/users/validate_login", .params = params, curl = curl)
      dl <- getURL( "https://usa.ipums.org/usa-action/extract_requests/download" , curl = curl)
      
      # Load the DLL you created
      # "writer" is the name of the C function
      # "curl_writer" is the name of the DLL
      dyn.load("curl_writer.dll")
      writer <- getNativeSymbolInfo("writer", PACKAGE = "curl_writer")$address
      
      # Note that the "URL" parameter is upper case here; in your code it is lower case.
      # I'm not sure whether that makes a difference.
      # "writer" is the symbol address looked up above
      filename <- tempfile()
      f <- CFILE(filename, "wb")
      curlPerform(URL = extract.path, writedata = f@ref, writefunction = writer, curl = curl)
      close(f)
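
Once the transfer finishes, it may be worth checking that `filename` really holds the extract and not, say, a login page returned after a failed authentication. The sketch below is one way to do that with base R only; `peek_gz` is a hypothetical helper, not part of RCurl. `gzfile()` also reads uncompressed files, so an HTML error page would show up as HTML in the returned lines.

```r
# Hypothetical helper: report the size of a downloaded file and return
# its first n lines without reading the whole file into memory.
peek_gz <- function(path, n = 5) {
    size <- file.info(path)$size
    if (is.na(size) || size == 0) {
        stop("download is missing or empty: ", path)
    }
    message("downloaded ", size, " bytes")

    # gzfile() decompresses on the fly; close the connection on exit
    con <- gzfile(path, open = "rt")
    on.exit(close(con))
    readLines(con, n = n)
}

# Usage with the script above:
# peek_gz(filename)
```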
      