Error when making a sparse matrix

刺人心 2021-01-20 17:54

I am facing a problem I do not understand. It's a follow-up on answers suggested here and here.

I have two identically structured datasets. One I created as a reprod

1 answer
  •  梦毁少年i
    2021-01-20 18:49

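    Convert the 'pid' and 'cid' columns to factor and coerce them back to numeric, or use match against the unique values of each column. Either way, each ID is replaced by a consecutive row/column index (1, 2, 3, ...), so the indices line up with dimnames built from the unique values, and sparseMatrix should then work.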

    test1 <- test[, lapply(.SD, function(x) 
                     as.numeric(factor(x, levels=unique(x))))]
    

    Or use match:

    test1 <- test[, lapply(.SD, function(x) match(x, unique(x)))]
    
    library(Matrix)  # provides sparseMatrix()
    s1 <- sparseMatrix(test1$pid, test1$cid, x = 1,
                       dimnames = list(unique(test$pid), unique(test$cid)))
    dim(s1)
    #[1] 15 50
    
    s1[1:3, 1:3]
    #3 x 3 sparse Matrix of class "dgCMatrix"
    #    11023 11787 14232
    #204     1     1     .
    #207     .     .     1
    #254     .     .     .
    
    head(test)
    #   pid   cid
    #1: 204 11023
    #2: 204 11787
    #3: 207 14232
    #4: 254 14470
    #5: 254 14480
    #6: 258  1290
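
As a self-contained illustration of the remapping above, here is a minimal sketch on made-up data. The data frame `toy` and its values are hypothetical stand-ins for 'test'; only `match`, `factor`, `unique` and `Matrix::sparseMatrix` are real functions. It shows that the factor route and the match route give the same indices, and that the resulting matrix has one row and one column per distinct ID.

```r
library(Matrix)

# Hypothetical toy data standing in for 'test'; IDs are non-contiguous integers.
toy <- data.frame(pid = c(204, 204, 207, 254, 254),
                  cid = c(11023, 11787, 14232, 14470, 11023))

# Map each ID to its position among the unique values (1, 2, 3, ...).
i <- match(toy$pid, unique(toy$pid))
j <- match(toy$cid, unique(toy$cid))

# The factor-based route yields the same row indices.
i_fac <- as.integer(factor(toy$pid, levels = unique(toy$pid)))

# Build the matrix; dimnames carry the original IDs as labels.
s <- sparseMatrix(i = i, j = j, x = 1,
                  dimnames = list(as.character(unique(toy$pid)),
                                  as.character(unique(toy$cid))))
dim(s)
```

Because the dimnames are character labels, a cell can still be looked up by its original IDs, for example `s["204", "11023"]`.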
    

    EDIT:

    If the original 'pid' and 'cid' values should instead be used directly as row and column positions, the matrix has max(pid) rows and max(cid) columns, so the dimnames must have those same lengths:

    rnm <- seq(max(test$pid))
    cnm <- seq(max(test$cid))
    s2 <- sparseMatrix(test$pid, test$cid, dimnames=list(rnm, cnm))
    dim(s2)
    #[1]  1561 30627
    s2[1:3, 1:3]
    #3 x 3 sparse Matrix of class "ngCMatrix"
    # 1 2 3
    #1 . . .
    #2 . . .
    #3 . . .
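
The two outputs above differ in class ("dgCMatrix" versus "ngCMatrix") because the second call omits `x`. The sketch below, on hypothetical toy data, shows that difference and passes the dimensions explicitly through `dims`; the names `toy`, `p` and `d` are made up for illustration.

```r
library(Matrix)

# Hypothetical toy data; the IDs are used directly as row/column positions.
toy <- data.frame(pid = c(2, 2, 5), cid = c(3, 7, 1))

# Without 'x', the result is a pattern matrix that only records which cells are set.
p <- sparseMatrix(i = toy$pid, j = toy$cid,
                  dims = c(max(toy$pid), max(toy$cid)))

# With 'x = 1', the result stores numeric values in those cells.
d <- sparseMatrix(i = toy$pid, j = toy$cid, x = 1,
                  dims = c(max(toy$pid), max(toy$cid)))

class(p)
class(d)
dim(d)
```

With this approach most rows and columns are empty, since every integer up to the largest ID gets a position whether or not it occurs in the data.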
    
