Generate a dummy-variable

遇见更好的自我 2020-11-21 11:41

I have trouble generating the following dummy-variables in R:

I'm analyzing yearly time series data (time period 1948-2009). I have two questions:

  1. <
17 answers
  • 2020-11-21 12:03

    If you want to get K dummy variables, instead of K-1, try:

    dummies = table(1:length(year),as.factor(year))  
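
As a small illustration of the table() approach (the `year` vector below is made-up sample data, not the asker's series), the result should have one row per observation and one column per distinct year:

```r
# Hypothetical sample data: six yearly observations
year <- c(1956, 1957, 1957, 1958, 1958, 1959)

# Cross-tabulate the row index against the year factor:
# each row holds a single 1, in the column of that row's year
dummies <- table(seq_along(year), as.factor(year))

# Drop the "table" class to get a plain integer matrix
dummy_mat <- unclass(dummies)
dim(dummy_mat)        # 6 rows, 4 columns (one per distinct year)
colnames(dummy_mat)   # "1956" "1957" "1958" "1959"
```

Using `seq_along(year)` instead of `1:length(year)` is only a defensive choice here; it also behaves sensibly for an empty vector.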
    

    Best,

  • 2020-11-21 12:06

    I read this on the kaggle forum:

    #Generate example dataframe with character column
    example <- as.data.frame(c("A", "A", "B", "F", "C", "G", "C", "D", "E", "F"))
    names(example) <- "strcol"
    
    #For every unique value in the string column, create a new 1/0 column
    #This is what Factors do "under-the-hood" automatically when passed to function requiring numeric data
    for(level in unique(example$strcol)){
      example[paste("dummy", level, sep = "_")] <- ifelse(example$strcol == level, 1, 0)
    }
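
The same loop can be pointed at a year series like the one in the question. The sketch below assumes a hypothetical data frame `df` with a numeric `year` column, and sorts the distinct values so that the new columns appear in chronological order:

```r
# Hypothetical data frame standing in for the yearly series
df <- data.frame(year = c(1956, 1957, 1957, 1958, 1958, 1959))

# One 0/1 column per distinct year, in ascending order
for (level in sort(unique(df$year))) {
  df[paste("dummy", level, sep = "_")] <- ifelse(df$year == level, 1, 0)
}

names(df)
# "year" "dummy_1956" "dummy_1957" "dummy_1958" "dummy_1959"
```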
    
  • 2020-11-21 12:10

    Convert your data to a data.table, then assign by reference with := and filter rows to set the dummy:

    library(data.table)
    
    dt <- as.data.table(your.dataframe.or.whatever)
    dt[, is.1957 := 0]
    dt[year == 1957, is.1957 := 1]
    

    Proof-of-concept toy example:

    library(data.table)
    
    dt <- as.data.table(cbind(c(1, 1, 1), c(2, 2, 3)))
    dt[, is.3 := 0]
    dt[V2 == 3, is.3 := 1]
    
  • 2020-11-21 12:10

    I use the following function (for data.table):

    # For a data.table object and a factor variable var.name, this function creates dummy variables named "var.name: (level1)"
    factorToDummy <- function(dtable, var.name){
      stopifnot(is.data.table(dtable))
      stopifnot(var.name %in% names(dtable))
      stopifnot(is.factor(dtable[, get(var.name)]))
    
      dtable[, paste0(var.name,": ",levels(get(var.name)))] -> new.names
      dtable[, (new.names) := transpose(lapply(get(var.name), FUN = function(x){x == levels(get(var.name))})) ]
    
      cat(paste("\nAdded dummy variables: ", paste0(new.names, collapse = ", ")))
    }
    

    Usage:

    data <- data.table(data)
    data[, x:= droplevels(x)]
    factorToDummy(data, "x")
    
  • 2020-11-21 12:11

    This one-liner in base R

    model.matrix( ~ iris$Species - 1)
    

    gives

        iris$Speciessetosa iris$Speciesversicolor iris$Speciesvirginica
    1                    1                      0                     0
    2                    1                      0                     0
    3                    1                      0                     0
    4                    1                      0                     0
    5                    1                      0                     0
    6                    1                      0                     0
    7                    1                      0                     0
    8                    1                      0                     0
    9                    1                      0                     0
    10                   1                      0                     0
    11                   1                      0                     0
    12                   1                      0                     0
    13                   1                      0                     0
    14                   1                      0                     0
    15                   1                      0                     0
    16                   1                      0                     0
    17                   1                      0                     0
    18                   1                      0                     0
    19                   1                      0                     0
    20                   1                      0                     0
    21                   1                      0                     0
    22                   1                      0                     0
    23                   1                      0                     0
    24                   1                      0                     0
    25                   1                      0                     0
    26                   1                      0                     0
    27                   1                      0                     0
    28                   1                      0                     0
    29                   1                      0                     0
    30                   1                      0                     0
    31                   1                      0                     0
    32                   1                      0                     0
    33                   1                      0                     0
    34                   1                      0                     0
    35                   1                      0                     0
    36                   1                      0                     0
    37                   1                      0                     0
    38                   1                      0                     0
    39                   1                      0                     0
    40                   1                      0                     0
    41                   1                      0                     0
    42                   1                      0                     0
    43                   1                      0                     0
    44                   1                      0                     0
    45                   1                      0                     0
    46                   1                      0                     0
    47                   1                      0                     0
    48                   1                      0                     0
    49                   1                      0                     0
    50                   1                      0                     0
    51                   0                      1                     0
    52                   0                      1                     0
    53                   0                      1                     0
    54                   0                      1                     0
    55                   0                      1                     0
    56                   0                      1                     0
    57                   0                      1                     0
    58                   0                      1                     0
    59                   0                      1                     0
    60                   0                      1                     0
    61                   0                      1                     0
    62                   0                      1                     0
    63                   0                      1                     0
    64                   0                      1                     0
    65                   0                      1                     0
    66                   0                      1                     0
    67                   0                      1                     0
    68                   0                      1                     0
    69                   0                      1                     0
    70                   0                      1                     0
    71                   0                      1                     0
    72                   0                      1                     0
    73                   0                      1                     0
    74                   0                      1                     0
    75                   0                      1                     0
    76                   0                      1                     0
    77                   0                      1                     0
    78                   0                      1                     0
    79                   0                      1                     0
    80                   0                      1                     0
    81                   0                      1                     0
    82                   0                      1                     0
    83                   0                      1                     0
    84                   0                      1                     0
    85                   0                      1                     0
    86                   0                      1                     0
    87                   0                      1                     0
    88                   0                      1                     0
    89                   0                      1                     0
    90                   0                      1                     0
    91                   0                      1                     0
    92                   0                      1                     0
    93                   0                      1                     0
    94                   0                      1                     0
    95                   0                      1                     0
    96                   0                      1                     0
    97                   0                      1                     0
    98                   0                      1                     0
    99                   0                      1                     0
    100                  0                      1                     0
    101                  0                      0                     1
    102                  0                      0                     1
    103                  0                      0                     1
    104                  0                      0                     1
    105                  0                      0                     1
    106                  0                      0                     1
    107                  0                      0                     1
    108                  0                      0                     1
    109                  0                      0                     1
    110                  0                      0                     1
    111                  0                      0                     1
    112                  0                      0                     1
    113                  0                      0                     1
    114                  0                      0                     1
    115                  0                      0                     1
    116                  0                      0                     1
    117                  0                      0                     1
    118                  0                      0                     1
    119                  0                      0                     1
    120                  0                      0                     1
    121                  0                      0                     1
    122                  0                      0                     1
    123                  0                      0                     1
    124                  0                      0                     1
    125                  0                      0                     1
    126                  0                      0                     1
    127                  0                      0                     1
    128                  0                      0                     1
    129                  0                      0                     1
    130                  0                      0                     1
    131                  0                      0                     1
    132                  0                      0                     1
    133                  0                      0                     1
    134                  0                      0                     1
    135                  0                      0                     1
    136                  0                      0                     1
    137                  0                      0                     1
    138                  0                      0                     1
    139                  0                      0                     1
    140                  0                      0                     1
    141                  0                      0                     1
    142                  0                      0                     1
    143                  0                      0                     1
    144                  0                      0                     1
    145                  0                      0                     1
    146                  0                      0                     1
    147                  0                      0                     1
    148                  0                      0                     1
    149                  0                      0                     1
    150                  0                      0                     1
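
For year data like that in the question, the same call should work once the numeric column is wrapped in factor(); otherwise model.matrix treats it as a single numeric regressor. A minimal sketch, using a hypothetical data frame `df`:

```r
# Hypothetical data frame standing in for the yearly series
df <- data.frame(year = c(1956, 1957, 1957, 1958, 1958, 1959))

# factor() makes each distinct year a level; "- 1" drops the intercept
# so that all K levels get their own column instead of K - 1
mm <- model.matrix(~ factor(year) - 1, data = df)

# Strip the "factor(year)" prefix from the column names
colnames(mm) <- sub("factor(year)", "year_", colnames(mm), fixed = TRUE)
colnames(mm)   # "year_1956" "year_1957" "year_1958" "year_1959"
```

Passing `data = df` rather than writing `df$year` in the formula keeps the generated column names shorter, which is why the iris output above carries the `iris$Species` prefix.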
    
  • 2020-11-21 12:12

    The simplest way to produce these dummy variables is something like the following:

    > print(year)
    [1] 1956 1957 1957 1958 1958 1959
    > dummy <- as.numeric(year == 1957)
    > print(dummy)
    [1] 0 1 1 0 0 0
    > dummy2 <- as.numeric(year >= 1957)
    > print(dummy2)
    [1] 0 1 1 1 1 1
    

    More generally, you can use ifelse to choose between two values depending on a condition. So if instead of a 0-1 dummy variable, for some reason you wanted to use, say, 4 and 7, you could use ifelse(year == 1957, 4, 7).
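
The ifelse variants can be tried on the sample vector printed above; the variable names here are only illustrative:

```r
year <- c(1956, 1957, 1957, 1958, 1958, 1959)

# 0/1 dummy: 1 in 1957 only
d_1957 <- ifelse(year == 1957, 1, 0)        # 0 1 1 0 0 0

# 0/1 step dummy: 1 from 1957 onwards
d_from_1957 <- ifelse(year >= 1957, 1, 0)   # 0 1 1 1 1 1

# Arbitrary pair of values instead of 0/1
d_4_7 <- ifelse(year == 1957, 4, 7)         # 7 4 4 7 7 7
```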
