In R, data is usually loaded into RAM. Are there any packages that keep the data on disk rather than in RAM?
Yes, the ff package can do that.
You may also want to look at the CRAN Task View on High-Performance Computing for more details.
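Roughly, usage looks like this (a minimal sketch; the file name and length are just examples):

    library(ff)
    # create a disk-backed vector of doubles; the data lives in the file, not RAM
    x <- ff(vmode = "double", length = 1e7, filename = "x.ff")
    x[1:5] <- as.double(1:5)   # chunks are paged in from disk as needed
    x[1:5]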
Check out the bigmemory package, along with related packages like bigtabulate, bigalgebra, biganalytics, and more. There's also ff, though I don't find it as user-friendly as the bigmemory suite. The bigmemory suite was reportedly motivated in part by the difficulty of using ff. I like it because very few changes to my code were needed to work with a big.matrix object: it can be manipulated in almost exactly the same ways as a standard matrix, so my code is very reusable. A sketch follows below.
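Here's a rough sketch of a file-backed big.matrix (the file names are placeholders I made up for the example):

    library(bigmemory)
    # file-backed big.matrix: the data lives in data.bin on disk
    x <- big.matrix(nrow = 1e6, ncol = 3, type = "double",
                    backingfile = "data.bin", descriptorfile = "data.desc")
    x[1, ] <- c(1, 2, 3)       # standard matrix subsetting works
    x[1, ]
    # a later session (or another process) can re-attach the same file:
    y <- attach.big.matrix("data.desc")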
There's also support for HDF5 via NetCDF-4, in packages like RNetCDF and ncdf. This is a popular, multi-platform, multi-language method for efficient storage and access of large data sets.
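As a rough illustration with RNetCDF (the file, dimension, and variable names here are made up for the example):

    library(RNetCDF)
    nc <- create.nc("big.nc")                        # example file name
    dim.def.nc(nc, "obs", 1e6)                       # a dimension of 1e6 slots
    var.def.nc(nc, "x", "NC_DOUBLE", "obs")          # a double variable along it
    var.put.nc(nc, "x", rnorm(1000), start = 1, count = 1000)  # write one slice
    close.nc(nc)

    nc <- open.nc("big.nc")
    var.get.nc(nc, "x", start = 1, count = 10)       # read back a small slice
    close.nc(nc)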
If you want basic memory-mapping functionality, look at the mmap package.
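Something along these lines, if I recall the mmap API correctly (the binary file is created just for the example):

    library(mmap)
    # write a plain binary file of doubles so there is something to map
    writeBin(as.double(1:1e6), "vec.bin")
    m <- mmap("vec.bin", mode = real64())   # maps the file; nothing is read yet
    m[1:5]                                  # pages fault in only on access
    munmap(m)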