How to handle huge sparse matrices construction using Scipy?

老子叫甜甜 提交于 2019-12-05 18:31:29

Scipy offers several implementations of sparse matrices. Each of them has its own advantages and disadvantages. You can find information about the matrix formats here:

There are several ways to get to your desired sparse matrix. Computing the full NxN matrix and then converting is probably not possible, due high memory requirements (about 10^12 entries!).

In your case I would prepare your data to construct a coo_matrix.

coo_matrix((data, (i, j)), [shape=(M, N)])

data[:] the entries of the matrix, in any order
i[:] the row indices of the matrix entries
j[:] the column indices of the matrix entries

You might also want to have a look at lil_matrix, which can be used to incrementally build your matrix.

Once you created the matrix you can then convert it to a better suited format for calculation, depending on your use case.

I do not recognize the data format, there might be parsers for it, there might not. Writing your own parser should not be very difficult, though. Each line containing a colon starts a new row, all indices after the colon and in consecutive lines without colons are the column entries for said row.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!