R: How to split a string into values and map the resultant broken pieces as columns to the dataset? [duplicate]

断了今生、忘了曾经 submitted on 2019-12-13 09:18:05

Question


As shown in the picture above, I have a column, genres, that lists the genres each movie belongs to. There are 19 unique genres in total. I'd like to know whether I can append 19 columns to the data set, one per genre, and fill each cell with 0 or 1 to indicate whether the movie belongs to that genre.

It should look something like the picture below.


Answer 1:


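We can do this by splitting the 'genres' column and tabulating the pieces; mtabulate() from qdapTools returns one count column per unique genre:

library(qdapTools)
# Split each genre string on commas, then tabulate: one row per movie, one column per genre
d1 <- mtabulate(strsplit(as.character(df1$genres), ","))
# Use the title, minus everything from the first "(" onward, as the row names
row.names(d1) <- sub("\\s*\\(.*", "", df1$title)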
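If qdapTools is not available, a comparable indicator table can be sketched in base R with table(). This is only an illustration under stated assumptions: the sample df1 below is made up to mirror the titles and genres in this thread, and the genres are assumed to be separated by a bare comma with no surrounding spaces.

```r
# Hypothetical sample data; only the column names 'title' and 'genres' come from the thread
df1 <- data.frame(
  title  = c("Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"),
  genres = c("Adventure,Animation,Children,Comedy,Fantasy",
             "Adventure,Children,Fantasy",
             "Action,Crime,Thriller"),
  stringsAsFactors = FALSE
)

# One character vector of genres per movie
parts <- strsplit(as.character(df1$genres), ",", fixed = TRUE)

# Long format: one (row, genre) pair per membership.
# The factor keeps every movie as a level, even one with no genres at all.
long <- data.frame(
  row   = factor(rep(seq_along(parts), lengths(parts)), levels = seq_along(parts)),
  genre = unlist(parts)
)

# Cross-tabulate into a movies x genres table of counts
ind <- as.data.frame.matrix(table(long$row, long$genre))

# Cap the counts at 1 so a repeated genre still yields a 0/1 indicator
ind[] <- lapply(ind, function(x) as.integer(x > 0))

# Append the indicator columns to the original data
df2 <- cbind(df1, ind)
```

Because table() sorts its levels, the genre columns come out in alphabetical order rather than in order of first appearance.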

Another option is to create a zero matrix whose column names are the genres and then compare each split string against those names:

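# Zero matrix: one row per movie (title up to the first "("), one column per genre
m1 <- matrix(0, nrow = nrow(df1), ncol = 9,
             dimnames = list(sub("\\s*\\(.*", "", df1$title),
                             c("Adventure", "Animation", "Children", "Comedy", "Fantasy",
                               "Romance", "Action", "Crime", "Thriller")))
# Adding the logical membership matrix to the zeros yields 0/1 values
m1 + t(sapply(strsplit(as.character(df1$genres), ","), function(x) colnames(m1) %in% x))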
#         Adventure Animation Children Comedy Fantasy Romance Action Crime Thriller
#Toy Story         1         1        1      1       1       0      0     0        0
#Jumanji           1         0        1      0       1       0      0     0        0
#Heat              0         0        0      0       0       0      1     1        1
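The matrix above hard-codes nine genre names, while the question mentions 19. One way to avoid typing them out is to derive the names from the data itself and then bind the result onto the data set. The following is a minimal sketch, not the answerer's code: the sample df1 is hypothetical, and genre_names and m2 are names introduced only for illustration.

```r
# Hypothetical sample data; only the column names 'title' and 'genres' come from the thread
df1 <- data.frame(
  title  = c("Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"),
  genres = c("Adventure,Animation,Children,Comedy,Fantasy",
             "Adventure,Children,Fantasy",
             "Action,Crime,Thriller"),
  stringsAsFactors = FALSE
)

# Split once and reuse the pieces
parts <- strsplit(as.character(df1$genres), ",", fixed = TRUE)

# Collect every genre that occurs anywhere in the column
genre_names <- sort(unique(unlist(parts)))

# For each movie, test which genre names occur in its split string.
# vapply() returns a genres x movies matrix, so transpose it.
m2 <- t(vapply(parts,
               function(x) as.integer(genre_names %in% x),
               integer(length(genre_names))))
colnames(m2) <- genre_names

# Append the 0/1 columns to the original data set
df2 <- cbind(df1, m2)
```

With the full data, genre_names would hold all the unique genres, so no column count needs to be specified by hand.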


Source: https://stackoverflow.com/questions/42616425/r-how-to-split-a-string-into-values-and-map-the-resultant-broken-pieces-as-colu
