Basically, what is the best way to go about storing and using dense matrices in python?
I have a project that generates similarity metrics between every item in an array
Well, I've found my solution:
h5py
It's a library that basically presents a numpy-like interface, but uses compressed memmapped files to store arrays of arbitrary size (It's basically a wrapper for HDF5).
PyTables is built on it, and PyTables actually led me to it. However, I do not need any of the SQL functionality that is the main offering of PyTables, and PyTables does not provide the clean array-like interface I was really looking for.
h5py basically acts like a numpy array, and simply stores the data in a different format.
It also seems to have no limit to array size, except perhaps disk space. I'm currently doing testing on a 100,000 * 100,000 array of uint16.