Question
If I have constructed a sparse matrix using the sparse(i, j, k) constructor, how can I then normalize the columns of the matrix (so that each column sums to 1)? I cannot efficiently normalize the entries before I create the matrix, so any help is appreciated. Thanks!
Answer 1:
The easiest way would be a broadcasting division by the sum of the columns:
julia> A = sprand(4,5,.5);

julia> A./sum(A,1)
4x5 Array{Float64,2}:
0.0 0.0989976 0.0 0.0 0.0795486
0.420754 0.458653 0.0986313 0.0 0.0
0.0785525 0.442349 0.0 0.856136 0.920451
0.500693 0.0 0.901369 0.143864 0.0
… but it looks like that hasn't been optimized for sparse matrices yet, and falls back to a full matrix. So a simple loop to iterate over the columns does the trick:
julia> for (col,s) in enumerate(sum(A,1))
           s == 0 && continue # What does a "normalized" column with a sum of zero look like?
           A[:,col] = A[:,col]/s
       end

julia> A
4x5 sparse matrix with 12 Float64 entries:
[2, 1] = 0.420754
[3, 1] = 0.0785525
[4, 1] = 0.500693
[1, 2] = 0.0989976
[2, 2] = 0.458653
[3, 2] = 0.442349
[2, 3] = 0.0986313
[4, 3] = 0.901369
[3, 4] = 0.856136
[4, 4] = 0.143864
[1, 5] = 0.0795486
[3, 5] = 0.920451
julia> sum(A,1)
1x5 Array{Float64,2}:
1.0 1.0 1.0 1.0 1.0
This works entirely within sparse matrices and is done in-place (although it is still allocating new sparse matrices for each column slice).
Answer 2:
Given a matrix A (sparse or not), you can normalize along either dimension (columns with sum(A,1), rows with sum(A,2)):
A ./ sum(A,1) or A ./ sum(A,2)
To show that it works:
A = sprand(10,10,0.3)
println(sum(A,1))
println(A ./ sum(A,1))
The only caveat:
A[:,1] = 0
println(A ./ sum(A,1))
As you can see, column 1 now contains only NaNs because we divide by zero. Also, the result is a dense Matrix rather than a sparse matrix.
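The divide-by-zero caveat can be reproduced on a small dense matrix. The sketch below assumes Julia 1.x syntax (sum(A, dims=1) rather than the older sum(A,1)). The idea of replacing zero sums with 1 is an illustrative workaround, not something proposed in the answer, and safe_s is a hypothetical name.

```julia
A = [0.0 1.0; 0.0 3.0]            # column 1 sums to zero
s = sum(A, dims=1)                # 1x2 matrix of column sums
naive = A ./ s                    # column 1 becomes NaN (0/0)

# Assumption for illustration: treat a zero sum as 1 so an all-zero column stays all zero.
safe_s = map(x -> x == 0 ? one(x) : x, s)
safe = A ./ safe_s
```

Whether an all-zero column should stay zero or be handled some other way depends on the application.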
On the other hand one can quickly come up with an efficient specialized solution for your problem.
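function normalize_columns(A :: SparseMatrixCSC)
    sums = sum(A,1)
    I,J,V = findnz(A)
    for idx in 1:length(V)
        V[idx] /= sums[J[idx]]  # divide each stored value by its column sum
    end
    sparse(I,J,V,size(A)...)  # pass the size so trailing empty rows/columns are kept
end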
@Matt B came up with a very similar answer while I was typing this up :)
Answer 3:
Remember that sparse matrices in Julia are stored in compressed sparse column (CSC) format, so you can access the data directly:
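for col = 1 : size(A, 2)
    i = A.colptr[col]          # first stored entry of this column
    k = A.colptr[col+1] - 1    # last stored entry of this column
    n = i <= k ? norm(A.nzval[i:k]) : 0.0 # 2-norm here; use sum(A.nzval[i:k]) to make the column sum to 1
    n > 0.0 && (A.nzval[i:k] ./= n)
end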
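For the original goal (each column summing to 1), the same direct-access idea can be sketched with the accessor functions nonzeros and nzrange instead of the colptr and nzval fields. This assumes Julia 1.x with the SparseArrays standard library. normalize_columns! is a hypothetical helper name, and skipping zero-sum columns is a choice made here for illustration.

```julia
using SparseArrays

# Hypothetical helper: scale each column of A in place so its stored entries sum to 1.
# Columns whose sum is zero (including empty columns) are left untouched.
function normalize_columns!(A::SparseMatrixCSC)
    vals = nonzeros(A)               # stored values, shared with A (no copy)
    for col in 1:size(A, 2)
        r = nzrange(A, col)          # positions in vals that belong to this column
        s = sum(view(vals, r))       # column sum over the stored entries
        s == 0 && continue
        for k in r
            vals[k] /= s
        end
    end
    return A
end

A = sparse([1, 2, 2, 3], [1, 1, 2, 4], [1.0, 3.0, 5.0, 2.0], 3, 4)
normalize_columns!(A)
```

Because only the stored values are rewritten, the sparsity pattern and the matrix size are unchanged and no per-column slices are allocated.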
Answer 4:
The following gives what you want:
A = sprand(4,5,0.5)
B = A./sparse(sum(A,1))
The problem is that sum(A,1) returns a 1x5 dense array, so combining it with the sparse matrix A through the ./ operator produces a dense array. You therefore need to force the sums to be of sparse type. Alternatively, you can write sparse(A ./ sum(A,1)), which converts the dense result back to a sparse matrix.
Answer 5:
# get the column sums of A
S = vec(sum(A,1))
# get the nonzero entries in A. ei is row index, ej is col index, ev is the value in A
ei,ej,ev = findnz(A)
# get the number of rows and columns in A
m,n = size(A)
# create a new normalized matrix. For each nonzero index (ei,ej), its new value will be
# the old value divided by the sum of that column, which can be obtained by S[ej]
A_normalized = sparse(ei,ej,ev./S[ej],m,n)
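The same steps might look as follows in Julia 1.x, where sparse matrices live in the SparseArrays standard library and the column sum is written sum(A, dims=1). This is a sketch on a small hand-built matrix. The zero-sum guard and the name scaled are additions for illustration, not part of the answer.

```julia
using SparseArrays

A = sparse([1, 3, 2], [1, 1, 3], [2.0, 6.0, 4.0], 3, 3)

S = vec(sum(A, dims=1))           # column sums as a plain vector
ei, ej, ev = findnz(A)            # row indices, column indices, stored values
m, n = size(A)

# Guard (an assumption, not in the answer): leave a value unchanged if its column sums to zero.
scaled = [S[j] == 0 ? v : v / S[j] for (j, v) in zip(ej, ev)]
A_normalized = sparse(ei, ej, scaled, m, n)
```

Passing m and n to sparse keeps the original dimensions even when the last rows or columns have no stored entries.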
Source: https://stackoverflow.com/questions/24296856/in-julia-how-can-i-column-normalize-a-sparse-matrix