Label Encoder functionality in R?

你的背包 2021-02-06 08:56

In Python, scikit-learn has a handy class called LabelEncoder that maps categorical levels (strings) to an integer representation.

Is there anything in R to do this?

9 answers

借酒劲吻你 (OP)

2021-02-06 09:14

# Data
Country <- c("France", "Spain", "Germany", "Spain", "Germany", "France")
Age <- c(34, 27, 30, 32, 42, 30)
Purchased <- c("No", "Yes", "No", "No", "Yes", "Yes")
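# stringsAsFactors = TRUE is required in R >= 4.0, where strings are no longer converted to factors by default
df <- data.frame(Country, Age, Purchased, stringsAsFactors = TRUE)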
df

# Output
  Country Age Purchased
1  France  34        No
2   Spain  27       Yes
3 Germany  30        No
4   Spain  32        No
5 Germany  42       Yes
6  France  30       Yes

Using the CatEncoders package (Encoders for Categorical Variables)

library(CatEncoders)

# Saving names of categorical variables
factors <- names(which(sapply(df, is.factor)))

# Label Encoder
for (i in factors){
  encode <- LabelEncoder.fit(df[, i])
  df[, i] <- transform(encode, df[, i])
}
df

# Output
  Country Age Purchased
1       1  34         1
2       3  27         2
3       2  30         1
4       3  32         1
5       2  42         2
6       1  30         2

Using base R: the factor function

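# Label Encoder
# Re-create df from the Data step first: the CatEncoders loop overwrote the factor columns.
# One shared lookup for all columns: France/No -> 1, Spain/Yes -> 2, Germany -> 3.
# This relies on factor() accepting duplicated labels (supported in recent R versions).
levels <- c("France", "Spain", "Germany", "No", "Yes")
labels <- c(1, 2, 3, 1, 2)
for (i in factors){
  # The result is an ordered factor with labels "1", "2", "3", not an integer vector
  df[, i] <- factor(df[, i], levels = levels, labels = labels, ordered = TRUE)
}
df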

# Output
  Country Age Purchased
1       1  34         1
2       2  27         2
3       3  30         1
4       2  32         1
5       3  42         2
6       1  30         2
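
If you want something closer to scikit-learn's fit / transform / inverse_transform workflow without an extra package, here is a minimal base R sketch. The helper names label_fit, label_transform and label_inverse are hypothetical (made up for this example, not from any package), and it assumes alphabetical ordering of the levels, as in the CatEncoders output above.

```r
# Hypothetical helpers, not part of any package: a fit / transform / inverse trio in base R.

# "Fit": learn the sorted set of distinct levels
label_fit <- function(x) sort(unique(as.character(x)))

# "Transform": position of each value in the learned levels; unseen values become NA
label_transform <- function(levels, x) match(as.character(x), levels)

# "Inverse transform": map integer codes back to the original strings
label_inverse <- function(levels, codes) levels[codes]

country <- c("France", "Spain", "Germany", "Spain", "Germany", "France")

lv    <- label_fit(country)            # "France" "Germany" "Spain"
codes <- label_transform(lv, country)  # 1 3 2 3 2 1

label_inverse(lv, codes)                  # recovers the original strings
label_transform(lv, c("Spain", "Italy"))  # 3 NA ("Italy" was never seen)
```

Keeping the learned levels in a separate object means the same mapping can be reused on new data, which a bare as.integer(factor(x)) on each data set would not guarantee.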
