Create a sparse matrix from the indices of its non-zero elements, to build dummy variables for a categorical column of a large dataset

后悔当初 2021-01-24 13:48

I'm trying to use a sparse matrix to generate dummy variables for a data set with 5.8 million rows and two categorical columns.

The structure of the data is:

2 Answers
  • 2021-01-24 14:34

    Why do you want a sparse matrix? For a dummy matrix you can also just use:

    model.matrix(~ . + 0, data = df)
    

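    The + 0 removes the intercept, and the . stands for every column of df; each factor column is expanded into dummy columns. Without an intercept, the first factor gets one column per level, while any further factor still drops its first level. Be sure to convert the categorical variables to factors with as.factor() beforehand.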
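
    As a rough illustration of that call, here is a sketch on a hypothetical four-row data frame (the column names and values are made up). It also calls sparse.model.matrix() from the Matrix package, which takes the same formula and returns a sparse result; that may matter at 5.8 million rows, because model.matrix() returns an ordinary dense matrix.

    ```r
    library(Matrix)

    # Hypothetical toy data standing in for the real df
    df <- data.frame(
      Var_1 = as.factor(c("a", "b", "a", "c")),
      Var_2 = as.factor(c("x", "y", "y", "x"))
    )

    # Dense dummy matrix: no intercept, all columns of df
    dummies <- model.matrix(~ . + 0, data = df)
    print(colnames(dummies))

    # Same formula, but stored as a sparse matrix
    sp_dummies <- sparse.model.matrix(~ . + 0, data = df)
    print(dim(sp_dummies))
    ```

    In this toy case Var_1 keeps all three levels, while Var_2 contributes a single column because its first level is dropped.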

  • 2021-01-24 14:36

    Try this:

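    library(Matrix)
    # The dimensions must cover the largest row and column index in mydata
    spmat <- Matrix(0, nrow = 210000, ncol = 500, sparse = TRUE)
    # Two-column base matrix of (row, column) positions; both columns must hold positive integers
    locs <- cbind(mydata$Var_1, mydata$Var_2)
    spmat[locs] <- 1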
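
    If the goal is one dummy column per level of a single categorical column, a possible alternative is to pass the index vectors straight to sparseMatrix() from the Matrix package, so the matrix is built in one step. This is only a sketch under assumptions: the factor f and its values are hypothetical, the row index is taken to be the observation number, and the column index is the integer code of the level.

    ```r
    library(Matrix)

    # Hypothetical categorical column with three levels
    f <- factor(c("a", "b", "a", "c"))

    # Row i is observation i; column j is the integer code of its level
    dummies <- sparseMatrix(
      i = seq_along(f),
      j = as.integer(f),
      x = rep(1, length(f)),
      dims = c(length(f), nlevels(f)),
      dimnames = list(NULL, levels(f))
    )
    print(dummies)
    ```

    Each row then has exactly one non-zero entry, so the stored data grows with the number of rows rather than rows times levels.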
    