I have a 10 GB .dta Stata file and I am trying to read it into 64-bit R 3.3.1. I am working on a virtual machine with about 130 GB of RAM (4 TB HD) and the .dta file is abou
I recommend the haven R package. Unlike foreign, it can read the latest Stata formats:
library(haven)
data <- read_dta('myfile.dta')
I'm not sure how fast it is compared to other options, but your choices for reading Stata files in R are rather limited. My understanding is that haven wraps a C library, so it's probably your fastest option.
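With a file this large, it may be worth checking the read on a slice before loading everything. The following is a minimal sketch, assuming a recent haven release in which read_dta() accepts the n_max and col_select arguments. The temporary file and the column names (id, income, region) are made up for illustration and stand in for the real data.

```r
library(haven)

# Stand-in for the real file: a small temporary .dta written for illustration.
# The column names here are hypothetical.
path <- tempfile(fileext = ".dta")
write_dta(
  data.frame(id = 1:1000, income = rnorm(1000), region = rep(1:4, 250)),
  path
)

# Read only the first 100 rows to inspect the structure cheaply.
preview <- read_dta(path, n_max = 100)
str(preview)

# Read only the columns that are actually needed.
narrow <- read_dta(path, col_select = c(id, income))
```

If the preview looks right, the same call without n_max reads the whole file. Restricting the columns can reduce memory use when only part of the dataset is needed.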
The fastest way to load a large Stata dataset in R is using the readstata13 package. I have compared the performance of the foreign, readstata13, and haven packages on a large dataset in this post, and the results repeatedly showed that readstata13 is the fastest available package for reading Stata datasets in R.
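If you want to check the timing on your own machine, something along these lines could work. This is only a rough sketch on a small synthetic dataset, not a reproduction of the comparison mentioned above, and the result will depend on the file, the package versions, and the hardware. It leaves out foreign because, as far as I know, foreign::read.dta only supports Stata formats up to version 12, while save.dta13() writes a newer format by default.

```r
library(haven)
library(readstata13)

# Small synthetic dataset standing in for the real file (illustrative only).
df <- data.frame(id = 1:100000, x = rnorm(100000), y = runif(100000))
path <- tempfile(fileext = ".dta")
save.dta13(df, path)

# Time a single read with each package; repeat several times for a fairer picture.
t_haven <- system.time(d_haven <- read_dta(path))[["elapsed"]]
t_rs13  <- system.time(d_rs13 <- read.dta13(path))[["elapsed"]]

print(c(haven = t_haven, readstata13 = t_rs13))
```

To test against the real data, point path at the actual .dta file and drop the save.dta13() step.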