Label Encoder functionality in R?

你的背包 2021-02-06 08:56

In Python, scikit-learn has a handy class called LabelEncoder that maps categorical levels (strings) to integer codes.

Is there anything in R that does this?

9 Answers
  •  借酒劲吻你
    2021-02-06 09:14

    # Data
    Country <- c("France", "Spain", "Germany", "Spain", "Germany", "France")
    Age <- c(34, 27, 30, 32, 42, 30)
    Purchased <- c("No", "Yes", "No", "No", "Yes", "Yes")
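    # stringsAsFactors = TRUE is required in R >= 4.0.0, where data.frame()
    # no longer converts strings to factors by default
    df <- data.frame(Country, Age, Purchased, stringsAsFactors = TRUE)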
    df
    
    # Output
      Country Age Purchased
    1  France  34        No
    2   Spain  27       Yes
    3 Germany  30        No
    4   Spain  32        No
    5 Germany  42       Yes
    6  France  30       Yes
    

    Using the CatEncoders package (Encoders for Categorical Variables):

    library(CatEncoders)
    
    # Saving names of categorical variables
    factors <- names(which(sapply(df, is.factor)))
    
    # Label Encoder: fit one encoder per factor column,
    # then overwrite the column with its integer codes
    for (i in factors){
      encode <- LabelEncoder.fit(df[, i])
      df[, i] <- transform(encode, df[, i])
    }
    df
    
    # Output
      Country Age Purchased
    1       1  34         1
    2       3  27         2
    3       2  30         1
    4       3  32         1
    5       2  42         2
    6       1  30         2
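
    In the output above the codes follow the alphabetical order of the levels (France = 1, Germany = 2, Spain = 3). As a rough illustration in base R only (this is not a claim about how CatEncoders works internally), the same codes can be reproduced and the mapping inspected like this:

    ```r
    # Illustration only: reproduce alphabetical integer codes with base R
    country <- c("France", "Spain", "Germany", "Spain", "Germany", "France")

    lv <- sort(unique(country))      # distinct values in alphabetical order
    codes <- match(country, lv)      # position of each value within lv

    # Lookup table from level to code, handy for checking the encoding
    mapping <- data.frame(level = lv, code = seq_along(lv))

    print(codes)
    print(mapping)
    ```

    Keeping the lookup table around makes it easy to see which integer stands for which country.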
    

    Using base R (the factor function):

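    # Label Encoder
    # Assumes df still holds the original strings (re-create it if the
    # CatEncoders loop has already overwritten the columns).
    # Duplicated labels are allowed in factor() since R 3.4.0, so a single
    # mapping covers both columns. The result is an ordered factor whose
    # labels are "1", "2", "3", not an integer vector.
    lvls <- c("France", "Spain", "Germany", "No", "Yes")
    labs <- c(1, 2, 3, 1, 2)
    for (i in factors){
      df[, i] <- factor(df[, i], levels = lvls, labels = labs, ordered = TRUE)
    }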
    df
    
    # Output
      Country Age Purchased
    1       1  34         1
    2       2  27         2
    3       3  30         1
    4       2  32         1
    5       3  42         2
    6       1  30         2
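
    If plain integer codes plus a way to decode them are wanted, a minimal base R sketch could look like the following. The helper names label_fit, label_transform and label_inverse are hypothetical (made up for this example, not taken from any package), and the levels are assumed to be sorted alphabetically:

    ```r
    # Hypothetical helpers, names are illustrative only
    label_fit <- function(x) sort(unique(as.character(x)))          # learn the levels
    label_transform <- function(lv, x) match(as.character(x), lv)   # unseen values become NA
    label_inverse <- function(lv, codes) lv[codes]                  # codes back to strings

    purchased <- c("No", "Yes", "No", "No", "Yes", "Yes")

    lv <- label_fit(purchased)
    codes <- label_transform(lv, purchased)
    decoded <- label_inverse(lv, codes)

    # A value that was not present when fitting is encoded as NA
    new_codes <- label_transform(lv, c("Yes", "Maybe"))

    print(codes)
    print(decoded)
    print(new_codes)
    ```

    Note that these codes start at 1, whereas scikit-learn's LabelEncoder starts at 0; subtract 1 if the two need to line up.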
    
