“Large data” workflows using pandas

被撕碎了的回忆 2020-11-21 07:32

I have tried to puzzle out an answer to this question for many months while learning pandas. I use SAS for my day-to-day work, and it is great for its out-of-core support.

16 answers
  •  余生分开走
    2020-11-21 08:20

    I know this is an old thread, but I think the Blaze library is worth checking out. It's built for exactly this kind of situation.

    From the docs:

    Blaze extends the usability of NumPy and Pandas to distributed and out-of-core computing. Blaze provides an interface similar to that of the NumPy ND-Array or Pandas DataFrame but maps these familiar interfaces onto a variety of other computational engines like Postgres or Spark.

    Edit: By the way, it's supported by ContinuumIO and Travis Oliphant, author of NumPy.
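
For anyone unsure what "out-of-core" means in practice, here is a minimal sketch using plain pandas rather than Blaze (this is not Blaze's API). It assumes the data lives in a CSV file and that the result you want is a per-group sum. The function name `chunked_group_sum`, the column names `account` and `amount`, and the tiny in-memory CSV are all made up for illustration. The idea is to read the file in pieces, reduce each piece, and then combine the small partial results.

```python
import io

import pandas as pd


def chunked_group_sum(csv_source, key, value, chunksize=100_000):
    """Sum `value` per `key` while holding only one chunk in memory at a time."""
    partial_sums = []
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows
    # instead of loading the whole file at once.
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        partial_sums.append(chunk.groupby(key)[value].sum())
    # The per-chunk results are small; combine them and sum again per key.
    return pd.concat(partial_sums).groupby(level=0).sum()


# Hypothetical stand-in for a file that would be too large to load in one go.
demo_csv = io.StringIO("account,amount\na,1\nb,2\na,3\nb,4\na,5\n")
totals = chunked_group_sum(demo_csv, key="account", value="amount", chunksize=2)
print(totals)
```

This only works directly for reductions that can be combined from partial results (sums, counts, min/max). Something like a median needs a different approach, which is part of the appeal of libraries that handle out-of-core computation for you.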
