Create new dummy variable columns from categorical variable

礼貌的吻别 2020-11-28 04:03

I have several data sets with 75,000 observations and a type variable that can take on a value from 0 to 4. I want to add five new dummy variables to each data set f

8 answers
  • 2020-11-28 05:05

    What about using model.matrix()?

    > binom <- data.frame(data=runif(1e5),type=sample(0:4,1e5,TRUE))
    > head(binom)
           data type
    1 0.1412164    2
    2 0.8764588    2
    3 0.5559061    4
    4 0.3890109    3
    5 0.8725753    3
    6 0.8358100    1
    > inds <- model.matrix(~ factor(binom$type) - 1)
    > head(inds)
      factor(binom$type)0 factor(binom$type)1 factor(binom$type)2 factor(binom$type)3 factor(binom$type)4
    1                   0                   0                   1                   0                   0
    2                   0                   0                   1                   0                   0
    3                   0                   0                   0                   0                   1
    4                   0                   0                   0                   1                   0
    5                   0                   0                   0                   1                   0
    6                   0                   1                   0                   0                   0
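
    Since the question asks for the dummy variables as new columns of each data set, here is a minimal sketch of attaching the indicator matrix back onto the data frame. The seed and the short column names (type0 to type4) are illustrative assumptions, not part of the original answer.

    ```r
    set.seed(1)  # assumed seed, only so the sketch is reproducible
    binom <- data.frame(data = runif(1e5), type = sample(0:4, 1e5, TRUE))

    # "- 1" drops the intercept, so every level of type gets its own 0/1 column
    inds <- model.matrix(~ factor(type) - 1, data = binom)

    # Replace the long default names with short ones (hypothetical naming scheme)
    colnames(inds) <- paste0("type", levels(factor(binom$type)))

    # Append the five indicator columns to the original data frame
    binom <- cbind(binom, inds)
    head(binom)
    ```

    Each row has exactly one indicator equal to 1, so the five new columns always sum to 1 across a row.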
    
  • 2020-11-28 05:05

    The recipes package is another option. The example below is verbose for a single step, but the approach stays clean as you add more preprocessing steps.

    library(recipes)
    
    binom <- data.frame(y = runif(1e5), 
                        x = runif(1e5),
                        catVar = as.factor(sample(0:4, 1e5, TRUE))) # use the example from gappy
    head(binom)
    
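    new_data <- recipe(y ~ ., data = binom) %>% 
      step_dummy(catVar) %>% # add dummy variables (one level is dropped as the reference by default; one_hot = TRUE keeps all five)
      prep(training = binom) %>% # estimate the preprocessing steps (could be more than just adding dummy variables)
      bake(new_data = binom) # apply the prepared recipe to data (the argument was called newdata in older recipes versions)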
    head(new_data)
    

    Other step examples are step_scale, step_center, step_pca, etc.
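
    As a rough sketch of how such steps chain together, the version below centers and scales x before creating the dummy columns. It assumes a reasonably recent recipes version (where bake() takes new_data), and the choice of steps and the seed are only illustrative.

    ```r
    library(recipes)

    set.seed(1)  # assumed seed, only so the sketch is reproducible
    binom <- data.frame(y = runif(1e5),
                        x = runif(1e5),
                        catVar = as.factor(sample(0:4, 1e5, TRUE)))

    rec <- recipe(y ~ ., data = binom) %>%
      step_center(x) %>%                      # subtract the training mean of x
      step_scale(x) %>%                       # divide x by its training standard deviation
      step_dummy(catVar, one_hot = TRUE) %>%  # one 0/1 column per level of catVar
      prep(training = binom)                  # estimate means, sds and factor levels

    new_data <- bake(rec, new_data = binom)   # apply the prepared steps
    head(new_data)
    ```

    With one_hot = TRUE all five levels get their own column, which matches the five dummy variables the question asks for. Leave it at the default if the columns feed a model with an intercept.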
