I have a sparse matrix produced by sklearn's TfidfVectorizer. I believe that some of its rows are all zero, and I want to remove them. However, as far as I know, the existi
Slicing with getnnz() does the trick:
M = M[M.getnnz(1)>0]
This works directly on a csr_array.
You can also remove all-zero columns without changing formats:
M = M[:,M.getnnz(0)>0]
However, if you want to remove both, you need to chain the two slices:
M = M[M.getnnz(1)>0][:,M.getnnz(0)>0] #GOOD
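Note that
M = M[M.getnnz(1)>0, M.getnnz(0)>0] #BAD
does not work: two index arrays in a single subscript are paired elementwise (NumPy-style fancy indexing) instead of selecting the full row/column grid.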
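Here is a small sketch of the whole thing, in case it helps. The matrix values and the names dense, row_mask, col_mask, trimmed and outer are made up for illustration, and I use csr_matrix as a stand-in for whatever the vectorizer returns. The second half uses a plain NumPy array to show the pairing behaviour, on the assumption that the sparse classes follow the same indexing rules.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Made-up data: row 1 and column 2 contain no stored values.
dense = np.array([
    [1, 0, 0, 2],
    [0, 0, 0, 0],
    [0, 3, 0, 0],
])
M = csr_matrix(dense)

# Compute both masks up front, from the same matrix.
row_mask = M.getnnz(axis=1) > 0   # rows with at least one stored value
col_mask = M.getnnz(axis=0) > 0   # columns with at least one stored value

# Chained slicing: filter rows first, then columns.
trimmed = M[row_mask][:, col_mask]
print(trimmed.toarray())

# On the dense array, two masks in one subscript are paired elementwise.
# Here they select 2 rows and 3 columns, so the pairing cannot broadcast.
try:
    dense[row_mask, col_mask]
    paired_ok = True
except IndexError:
    paired_ok = False

# np.ix_ builds a grid (outer-product) index, matching the chained result.
outer = dense[np.ix_(row_mask, col_mask)]
```

One caveat: getnnz() counts stored entries, so explicitly stored zeros would keep a row or column alive. Calling M.eliminate_zeros() first should take care of that if your matrix might contain them.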