I need to make a strategic decision about which data structure to use as the basis for holding statistical data frames in my program.
I store hundreds of thousands of records.
pandas.DataFrame is awesome, and interacts very well with much of numpy. Much of the DataFrame is written in Cython and is quite optimized. I suspect the ease of use and the richness of the Pandas API will greatly outweigh any potential benefit you could obtain by rolling your own interfaces around numpy.
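
To give a feel for how the two libraries fit together, here is a minimal sketch. The column names, the group labels, and the random data are all made up for illustration; only the record count is meant to match the scale mentioned in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data set: 200,000 records with a group label and two
# numeric columns. None of these names come from the question.
rng = np.random.default_rng(0)
n = 200_000
df = pd.DataFrame({
    "group": rng.integers(0, 5, size=n),
    "x": rng.normal(size=n),
    "y": rng.normal(size=n),
})

# NumPy ufuncs work directly on DataFrame columns and return a Series,
# so the result can be stored as a new column.
df["r"] = np.sqrt(df["x"] ** 2 + df["y"] ** 2)

# Per-group summary statistics in a single call.
summary = df.groupby("group")["r"].agg(["mean", "std", "count"])

# When a plain ndarray is needed, the numeric columns convert directly.
arr = df[["x", "y"]].to_numpy()
```

Getting the same grouping and labelled-column behaviour from bare numpy arrays would mean writing and maintaining that bookkeeping yourself, which is the trade-off the paragraph above is pointing at.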