This is a typical use case for FEM/FVM equation systems, so is perhaps of broader interest. From a triangular mesh à la
I would like to create a scipy
I would try creating the csr structure directly, especially if you are resorting to np.unique, since this gives you sorted keys, which is half the job done.
I'm assuming you are at the point where you have i, j sorted lexicographically and overlapping v summed using np.add.at on the optional inverse output of np.unique.
Then v and j are already in csr format. All that's left to do is creating the indptr, which you simply get by np.searchsorted(i, np.arange(M+1)), where M is the number of rows (i.e. the column length). You can pass these directly to the sparse.csr_matrix constructor.
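As a rough sketch of the steps just described (np.unique with its inverse output, np.add.at for the duplicates, np.searchsorted for the indptr), something like the following should work. The function name csr_from_triplets, the linearized key i * N + j, and the separate row and column counts M and N are assumptions made for illustration; they are not taken from the question.

```python
import numpy as np
from scipy import sparse

def csr_from_triplets(i, j, v, M, N):
    """Build an M x N CSR matrix from (i, j, v) triplets, summing duplicates.

    Hypothetical helper: the name and the key construction are illustrative.
    """
    i = np.asarray(i, dtype=np.int64)
    j = np.asarray(j, dtype=np.int64)
    v = np.asarray(v, dtype=float)
    # Linearize (row, column) so that sorted keys are ordered by row, then column.
    keys = i * N + j
    uniq, inverse = np.unique(keys, return_inverse=True)
    # Accumulate the values of duplicate (row, column) pairs.
    data = np.zeros(len(uniq), dtype=v.dtype)
    np.add.at(data, inverse, v)
    # Recover sorted row and column indices from the unique keys.
    rows, cols = np.divmod(uniq, N)
    # indptr[r] is the position of the first stored entry of row r.
    indptr = np.searchsorted(rows, np.arange(M + 1))
    return sparse.csr_matrix((data, cols, indptr), shape=(M, N))

# Small usage example: the entry (0, 1) appears three times and is summed.
A = csr_from_triplets([0, 0, 1, 0], [1, 1, 2, 1], [1.0, 2.0, 3.0, 4.0], 2, 3)
print(A.toarray())
```

np.add.at is convenient here but can be slow in some NumPy versions, which is one reason to prefer a reduction over already sorted data when speed matters.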
OK, let the code speak:
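import numpy as np
from scipy import sparse
from timeit import timeit

def tocsr(I, J, E, N):
    n = len(I)
    # pack each (I, J) pair into one int64 key; on a little-endian machine
    # J fills the low and I the high 32 bits, so sorting K sorts by row, then column
    K = np.empty((n,), dtype=np.int64)
    K.view(np.int32).reshape(n, 2).T[...] = J, I
    S = np.argsort(K)
    KS = K[S]
    # start index of each run of identical keys
    steps = np.flatnonzero(np.r_[1, np.diff(KS)])
    # sum the entries of duplicate (I, J) pairs
    ED = np.add.reduceat(E[S], steps)
    # unpack the unique keys back into column and row indices
    JD, ID = KS[steps].view(np.int32).reshape(-1, 2).T
    # row pointer: position of the first stored entry of each row
    ID = np.searchsorted(ID, np.arange(N+1))
    return sparse.csr_matrix((ED, np.array(JD, dtype=int), ID), (N, N))

def viacoo(I, J, E, N):
    return sparse.coo_matrix((E, (I, J)), (N, N)).tocsr()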
#testing and timing
# correctness
N = 1000
A = np.random.random((N, N)) < 0.001
I, J = np.where(A)
E = np.random.random((2, len(I)))
D = np.zeros((2,) + A.shape)
D[:, I, J] = E
D2 = tocsr(np.r_[I, I], np.r_[J, J], E.ravel(), N).toarray()
print('correct:', np.allclose(D.sum(axis=0), D2))
# speed
N = 100000
K = 10
I, J = np.random.randint(0, N, (2, K*N))
E = np.random.random((2 * len(I),))
I, J, E = np.r_[I, I, J, J], np.r_[J, J, I, I], np.r_[E, E]
print('N:', N, ' -- nnz (with duplicates):', len(E))
print('direct: ', timeit('f(a,b,c,d)', number=10, globals={'f': tocsr, 'a': I, 'b': J, 'c': E, 'd': N}), 'secs for 10 iterations')
print('via coo:', timeit('f(a,b,c,d)', number=10, globals={'f': viacoo, 'a': I, 'b': J, 'c': E, 'd': N}), 'secs for 10 iterations')
Prints:
correct: True
N: 100000 -- nnz (with duplicates): 4000000
direct: 7.702431229001377 secs for 10 iterations
via coo: 41.813509466010146 secs for 10 iterations
Speedup: 5x
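The tocsr listing packs each (I, J) pair into one 64-bit key by writing through an int32 view, which depends on the machine's byte order. A byte-order independent variant could use shifts instead. The sketch below is only an illustration: the helper names pack_keys and unpack_keys are hypothetical, it assumes nonnegative indices below 2**31, and it has not been timed against the original.

```python
import numpy as np

def pack_keys(I, J):
    """Pack (row, column) pairs into int64 sort keys, row in the high 32 bits."""
    I = np.asarray(I, dtype=np.int64)
    J = np.asarray(J, dtype=np.int64)
    return (I << 32) | J

def unpack_keys(K):
    """Recover (row, column) from keys built by pack_keys."""
    return K >> 32, K & np.int64(0xFFFFFFFF)

# Tiny example with duplicate pairs (0, 3) and (2, 1).
I = np.array([2, 0, 2, 1, 0])
J = np.array([1, 3, 1, 0, 3])
E = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

K = pack_keys(I, J)
S = np.argsort(K, kind='stable')
KS = K[S]
# Start index of each run of identical keys in the sorted array.
steps = np.flatnonzero(np.r_[1, np.diff(KS)])
# Sum the values within each run, i.e. merge the duplicates.
ED = np.add.reduceat(E[S], steps)
ID, JD = unpack_keys(KS[steps])
print(ID, JD, ED)
```

The remaining steps (np.searchsorted for the row pointer and the sparse.csr_matrix constructor) would be the same as in tocsr.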