Error when making a sparse matrix

刺人心 2021-01-20 17:54

I am facing a problem I do not understand. It's a follow-up to the answers suggested here and here.

I have two identically structured datasets. One I created as a reprod

1 answer

梦毁少年i (original poster)

2021-01-20 18:49

We can convert the 'pid' and 'cid' columns to factors and coerce them back to numeric, or use match against the unique values of each column. Either way, each ID is mapped to a consecutive 1-based row/column index, which is what sparseMatrix needs when the dimnames are the unique IDs.

# 'test' is a data.table with columns 'pid' and 'cid'
test1 <- test[, lapply(.SD, function(x)
                 as.numeric(factor(x, levels = unique(x))))]

Or use match:

test1 <- test[, lapply(.SD, function(x) match(x, unique(x)))]
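
As a quick check that the two mappings agree, here is a minimal sketch on a made-up vector (`ids` is a hypothetical stand-in for one column of 'test', not data from the question). The only difference should be the storage type: the factor route yields doubles, while match yields integers.

```r
# Hypothetical IDs standing in for one column such as test$pid
ids <- c(204, 204, 207, 254, 207)

# Route 1: factor with levels in order of first appearance, then numeric codes
via_factor <- as.numeric(factor(ids, levels = unique(ids)))

# Route 2: position of each value within the unique values
via_match <- match(ids, unique(ids))

# Same positions either way; only the storage type differs
all(via_factor == via_match)
typeof(via_factor)
typeof(via_match)
```

Either result can be passed as the i or j argument of sparseMatrix.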

library(Matrix)  # provides sparseMatrix()

s1 <- sparseMatrix(i = test1$pid, j = test1$cid, x = 1,
                   dimnames = list(unique(test$pid), unique(test$cid)))
dim(s1)
#[1] 15 50

s1[1:3, 1:3]
#3 x 3 sparse Matrix of class "dgCMatrix"
#    11023 11787 14232
#204     1     1     .
#207     .     .     1
#254     .     .     .

head(test)
#   pid   cid
#1: 204 11023
#2: 204 11787
#3: 207 14232
#4: 254 14470
#5: 254 14480
#6: 258  1290
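
Since 'test' is not shown in full, the following self-contained sketch repeats the same idea on a few made-up rows (`toy_pid`, `toy_cid`, and `m` are illustrative names, not objects from the question). It assumes only that the Matrix package is installed.

```r
library(Matrix)

# Toy IDs: sparse, non-consecutive integers (illustrative only)
toy_pid <- c(204, 204, 207, 254)
toy_cid <- c(11023, 11787, 14232, 14470)

# Map each ID to its position among the unique IDs (1-based, no gaps)
i <- match(toy_pid, unique(toy_pid))
j <- match(toy_cid, unique(toy_cid))

# One row per unique pid, one column per unique cid
m <- sparseMatrix(i = i, j = j, x = 1,
                  dimnames = list(as.character(unique(toy_pid)),
                                  as.character(unique(toy_cid))))

dim(m)             # number of unique pids by number of unique cids
m["204", "11787"]  # look up a cell by its original IDs
```

Because the dimnames carry the original IDs, cells can still be addressed by 'pid' and 'cid' even though the matrix only has as many rows and columns as there are distinct IDs.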

EDIT:

If we instead want to use the raw 'pid' and 'cid' values in 'test' as the row/column indices, the dimnames must be as long as the maximum of 'pid' and 'cid', respectively.

rnm <- seq(max(test$pid))
cnm <- seq(max(test$cid))
s2 <- sparseMatrix(test$pid, test$cid, dimnames=list(rnm, cnm))
dim(s2)
#[1]  1561 30627
s2[1:3, 1:3]
#3 x 3 sparse Matrix of class "ngCMatrix"
# 1 2 3
#1 . . .
#2 . . .
#3 . . .
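
A possible variation, sketched on made-up vectors (`pid` and `cid` below are hypothetical, not the question's data): if the names 1..max are not needed, the `dims` argument of sparseMatrix sets the size directly. Passing x = 1 also gives a numeric matrix rather than the pattern matrix ("ngCMatrix") seen in the output above, which results from omitting x.

```r
library(Matrix)

# Hypothetical raw IDs used directly as row/column positions
pid <- c(2, 2, 5)
cid <- c(3, 7, 1)

# No x supplied: a pattern matrix that only records which cells are set
p <- sparseMatrix(i = pid, j = cid, dims = c(max(pid), max(cid)))

# x = 1 supplied: a numeric matrix with 1 in the same cells
n <- sparseMatrix(i = pid, j = cid, x = 1, dims = c(max(pid), max(cid)))

class(p)
class(n)
dim(n)
```

Which form is preferable depends on whether the downstream code needs numeric values or only the nonzero pattern.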
