Python: handling a large set of data. Scipy or Rpy? And how?

我在风中等你 2021-02-04 17:47

In my Python environment, the Rpy and Scipy packages are already installed.

The problem I want to tackle is this:

1) A huge set of financial data are stored in

6 answers
  •  孤独总比滥情好
    2021-02-04 18:10

    As @gsk3 noted, bigmemory is a great package for this, along with the packages biganalytics and bigtabulate (there are more, but these are worth checking out). There's also ff, though that isn't as easy to use.

    Common to both R and Python is support for HDF5 (see the ncdf4 package in R, or netCDF4 in Python), which makes it fast and easy to access massive data sets on disk. Personally, I primarily use bigmemory, though that's R-specific. As HDF5 is available in Python and is very fast, it's probably going to be your best bet in Python.
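
To make the HDF5 suggestion concrete, here is a minimal sketch of processing an on-disk table in row blocks from Python. It assumes the h5py and NumPy packages are installed. The question does not say how the financial data is laid out, so the dataset name `prices`, the file name, the helper functions `write_prices` and `column_means`, and the random numbers standing in for real data are all hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np


def write_prices(path, n_rows, n_cols, chunk_rows=10_000):
    """Write a synthetic (n_rows x n_cols) float64 table to HDF5, one row block at a time."""
    rng = np.random.default_rng(0)  # placeholder data, not real financial data
    with h5py.File(path, "w") as f:
        dset = f.create_dataset(
            "prices",  # hypothetical dataset name
            shape=(n_rows, n_cols),
            dtype="f8",
            chunks=(min(chunk_rows, n_rows), n_cols),
        )
        for start in range(0, n_rows, chunk_rows):
            stop = min(start + chunk_rows, n_rows)
            dset[start:stop] = rng.random((stop - start, n_cols))


def column_means(path, chunk_rows=10_000):
    """Compute per-column means while holding only one row block in memory."""
    with h5py.File(path, "r") as f:
        dset = f["prices"]
        n_rows, n_cols = dset.shape
        total = np.zeros(n_cols)
        for start in range(0, n_rows, chunk_rows):
            stop = min(start + chunk_rows, n_rows)
            # Slicing the dataset reads just these rows from disk.
            total += dset[start:stop].sum(axis=0)
        return total / n_rows


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "prices.h5")
    write_prices(path, n_rows=50_000, n_cols=4)
    print(column_means(path))
```

The point of the block-wise loop is that memory use depends on `chunk_rows` rather than on the total size of the file, so the same pattern should carry over to data that does not fit in RAM.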
