How can I pass large arrays between numpy and R?

后悔当初 2021-01-04 23:03

I'm using Python and NumPy/SciPy to do regex and stemming for a text processing application. But I want to use some of R's statistical packages as well.

What's th

3 Answers
  • 2021-01-04 23:22

    Use Rpy, http://rpy.sourceforge.net/, to call R from Python.

    The caveat is that the R and Python versions must exactly match the ones the Rpy binary was built for, so you need to be careful with the installation.

  • 2021-01-04 23:39
    • Have you already looked into RPy? It's a Python interface to R. I guess that would spare you the data handling.

    • To back up your NumPy arrays you can use pickle. However, pickle seems to create a lot of overhead when saving huge data, so large NumPy arrays are best saved in the HDF format. Here's an article covering that: http://www.shocksolution.com/2010/01/10/storing-large-numpy-arrays-on-disk-python-pickle-vs-hdf5adsf/
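
To make the pickle vs. HDF comparison above concrete, here is a minimal sketch that round-trips a small array through both formats. It assumes the third-party `h5py` package for the HDF5 part (the answer does not name a library, so this choice is an assumption) and skips that part if `h5py` is not installed. The file names, the dataset name `"data"`, and the helper function names are all hypothetical.

```python
import os
import pickle
import tempfile

import numpy as np

try:
    import h5py  # optional third-party package; assumed here, not named in the answer
except ImportError:
    h5py = None


def roundtrip_pickle(arr, path):
    """Write arr to path with pickle and read it back."""
    with open(path, "wb") as fh:
        pickle.dump(arr, fh, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "rb") as fh:
        return pickle.load(fh)


def roundtrip_hdf5(arr, path, name="data"):
    """Write arr to an HDF5 dataset and read it back (requires h5py)."""
    with h5py.File(path, "w") as fh:
        fh.create_dataset(name, data=arr)
    with h5py.File(path, "r") as fh:
        # [...] reads the whole dataset into an in-memory NumPy array
        return fh[name][...]


def demo():
    """Round-trip a small array through each available format."""
    arr = np.arange(12, dtype=np.float64).reshape(3, 4)
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        results["pickle"] = roundtrip_pickle(arr, os.path.join(tmp, "arr.pkl"))
        if h5py is not None:
            results["hdf5"] = roundtrip_hdf5(arr, os.path.join(tmp, "arr.h5"))
    return arr, results


if __name__ == "__main__":
    original, loaded = demo()
    for fmt, out in loaded.items():
        print(fmt, np.array_equal(out, original))
```

One reason HDF5 may suit the question better than pickle: a pickle file is Python-specific, whereas an HDF5 file can in principle be opened from R as well (for example with an HDF5 reader package), so the same file could serve as the exchange format. Check the array's dimension order after reading it on the R side, since the two ecosystems lay out arrays differently.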

  • 2021-01-04 23:43

    I cannot comment on sharing "large data" between R and Python, but I have had a much easier time working with pyRserve than with RPy or RPy2.

    That being said, I am curious about the text processing you are doing. Python obviously has a lot to offer on the text processing side, but it also has a lot on the statistical side in packages like NLTK and the Pattern package from CLiPS. Are you just more comfortable doing stats in R, or is there something specific missing in Python?
