Handling large dense matrices in python

前端 未结 6 1171
[愿得一人]
[愿得一人] 2021-02-04 21:33

Basically, what is the best way to go about storing and using dense matrices in python?

I have a project that generates similarity metrics between every item in an array

6条回答
  •  走了就别回头了
    2021-02-04 21:54

    Well, I've found my solution:
    h5py

    It's a library that basically presents a numpy-like interface, but uses compressed memmapped files to store arrays of arbitrary size (It's basically a wrapper for HDF5).

    PyTables is built on it, and PyTables actually led me to it. However, I do not need any of the SQL functionality that is the main offering of PyTables, and PyTables does not provide the clean array-like interface I was really looking for.

    h5py basically acts like a numpy array, and simply stores the data in a different format.

    It also seems to have no limit to array size, except perhaps disk space. I'm currently doing testing on a 100,000 * 100,000 array of uint16.

提交回复
热议问题