I\'m using python and numpy/scipy to do regex and stemming for a text processing application. But I want to use some of R\'s statistical packages as well.
What\'s th
Use Rpy, http://rpy.sourceforge.net/, to call R from Python.
The caveat is that both R and Python versions need to be exactly the one for which the Rpy binary has been built. You thus need to be careful with the installation.
Have you already looked into RPy? It's a python interface to R. I guess that would spare you the data handling.
To backup your NumPy arrays you can use pickle. As it seems to create a lot of overhead when saving huge data, NumPy arrays are best saved using the HDF standard. Here's a article covering that: http://www.shocksolution.com/2010/01/10/storing-large-numpy-arrays-on-disk-python-pickle-vs-hdf5adsf/
I cannot comment on "large data" between shared between R and Python, but I have had a much easier time working with pyRserve than RPy or RPy2.
That being said, I am curious about the text processing you are doing? Python obviously has a lot to offer on the text processing side, but statistically there is a lot too in packages like NLTK and the Pattern package from CLiPS. Are you just more comfortable doing stats in R, or is there something specific missing in Python?