The 4th column is my desired column. Video, Webinar, Meeting, and Conference are the 4 types of activities that the different customers (names) can engage in. You can see, in a given row,
Here you go:
setcolorder(dt, c("Name", "Webinar", "Meeting", "Conference", "Video", "NextStep"))
dt[, NextStepNew:=apply(dt, 1, function(x) paste0(names(x)[x==0], collapse=","))][]
   Name Webinar Meeting Conference Video                   NextStep                NextStepNew
1: John       0       0          0     1 Webinar,Meeting,Conference Webinar,Meeting,Conference
2: John       1       0          0     1         Meeting,Conference         Meeting,Conference
3: John       1       1          0     1                 Conference                 Conference
4:  Tom       0       1          0     0   Webinar,Conference,Video   Webinar,Conference,Video
5:  Tom       0       1          1     0              Webinar,Video              Webinar,Video
6: Kyle       0       0          1     0      Webinar,Meeting,Video      Webinar,Meeting,Video
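Note that apply(dt, 1, ...) scans every column, including Name and NextStep. A minimal sketch of the same idea restricted to the activity columns is shown here; the vector acts and the rebuilt dt are my own assumptions, reconstructed from the output above, not part of the original answer:

```r
library(data.table)

# Sample data rebuilt from the printed output above (assumed, for illustration)
dt <- data.table(
  Name       = c("John", "John", "John", "Tom", "Tom", "Kyle"),
  Webinar    = c(0, 1, 1, 0, 0, 0),
  Meeting    = c(0, 0, 1, 1, 1, 0),
  Conference = c(0, 0, 0, 0, 1, 1),
  Video      = c(1, 1, 1, 0, 0, 0)
)

# Hypothetical helper vector: the activity columns, in the desired output order
acts <- c("Webinar", "Meeting", "Conference", "Video")

# apply() now only sees the numeric activity columns via .SD,
# so text columns can never take part in the x == 0 comparison
dt[, NextStepNew := apply(.SD, 1, function(x) paste0(names(x)[x == 0], collapse = ",")),
   .SDcols = acts]

print(dt)
```

Because the order of acts decides the order of the pasted names, this variant probably also makes the setcolorder() step unnecessary.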
A possible solution:
DT[, nextstep := paste0(names(.SD)[.SD==0], collapse = ','), 1:nrow(DT), .SDcols = 2:5][]
which gives:
   Name Video Webinar Meeting Conference                   nextstep
1: John     1       0       0          0 Webinar,Meeting,Conference
2: John     1       1       0          0         Meeting,Conference
3: John     1       1       1          0                 Conference
4:  Tom     0       0       1          0   Video,Webinar,Conference
5:  Tom     0       0       1          1              Video,Webinar
6: Kyle     0       0       0          1      Video,Webinar,Meeting
If you want the names ordered as you specified in the comments, you can do:
lvls <- c('Webinar', 'Meeting', 'Conference', 'Video')
DT[, nextstep := paste0(lvls[lvls %in% names(.SD)[.SD==0]], collapse = ','),
1:nrow(DT), .SDcols = 2:5][]
which gives:
   Name Video Webinar Meeting Conference                   nextstep
1: John     1       0       0          0 Webinar,Meeting,Conference
2: John     1       1       0          0         Meeting,Conference
3: John     1       1       1          0                 Conference
4:  Tom     0       0       1          0   Webinar,Conference,Video
5:  Tom     0       0       1          1              Webinar,Video
6: Kyle     0       0       0          1      Webinar,Meeting,Video
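The reordering relies on subsetting lvls by membership, so the result follows the order of lvls rather than the column order in the data. A small base R sketch of just that step, where zero_cols is a made-up stand-in for names(.SD)[.SD==0] in row 4:

```r
lvls <- c("Webinar", "Meeting", "Conference", "Video")

# Hypothetical zero-valued columns for one row, in data (column) order
zero_cols <- c("Video", "Webinar", "Conference")

# Keep only the levels that are present; the order comes from lvls
ordered_cols <- lvls[lvls %in% zero_cols]
ordered_cols
# [1] "Webinar"    "Conference" "Video"

paste0(ordered_cols, collapse = ",")
# [1] "Webinar,Conference,Video"
```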
Instead of using paste0 (with collapse = ','), you can also use toString. Note that toString separates the items with a comma followed by a space.
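To make the difference between the two concrete, here is a quick base R comparison; the vector pending is just an illustrative example of the zero-valued activities for one row:

```r
# Hypothetical zero-valued activities for a single row
pending <- c("Webinar", "Meeting", "Conference")

with_paste    <- paste0(pending, collapse = ",")
with_tostring <- toString(pending)

with_paste
# [1] "Webinar,Meeting,Conference"
with_tostring
# [1] "Webinar, Meeting, Conference"

# toString(x) gives the same result as paste(x, collapse = ", ")
identical(with_tostring, paste(pending, collapse = ", "))
# [1] TRUE
```

So if the output has to match an existing column exactly (no space after the comma), paste0 is probably the safer choice.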
Used data:
DT <- fread('Name Video Webinar Meeting Conference
John 1 0 0 0
John 1 1 0 0
John 1 1 1 0
Tom 0 0 1 0
Tom 0 0 1 1
Kyle 0 0 0 1')
If you are looking for a way to do this without simply reordering the columns (I see no reason not to reorder them, but anyway), you could try the following approach. It melts the data to long format and updates by reference in a join:
lvls <- c("Webinar", "Meeting", "Conference", "Video") # make sure order is correct
dt[, row := .I] # add a row-identifier
dtm <- melt(dt, id.vars = c("Name", "row"), measure.vars = lvls) # melt to long format
# summarise dtm by using factor, sorting it and converting to string; then join to dt
dt[dtm[value == 0, list(NextStep2 = toString(sort(factor(variable, levels = lvls)))),
by = row], NextStep2 := NextStep2, on = "row"][, row := NULL]
#    Name Video Webinar Meeting Conference                   NextStep                    NextStep2
# 1: John     1       0       0          0 Webinar,Meeting,Conference Webinar, Meeting, Conference
# 2: John     1       1       0          0         Meeting,Conference          Meeting, Conference
# 3: John     1       1       1          0                 Conference                   Conference
# 4:  Tom     0       0       1          0   Webinar,Conference,Video   Webinar, Conference, Video
# 5:  Tom     0       0       1          1              Webinar,Video               Webinar, Video
# 6: Kyle     0       0       0          1      Webinar,Meeting,Video      Webinar, Meeting, Video
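The update-by-reference join can be looked at in isolation. This is only a sketch with made-up tables (base_dt and summary_dt are hypothetical names), and it uses the i. prefix to make explicit that the value comes from the joined table, which the code above leaves implicit:

```r
library(data.table)

# Hypothetical target table with a row identifier
base_dt <- data.table(row = 1:3, Name = c("John", "Tom", "Kyle"))

# Hypothetical per-row summary; note there is no entry for row 2
summary_dt <- data.table(row = c(1L, 3L),
                         NextStep2 = c("Webinar, Meeting", "Video"))

# Join on "row" and add the column to base_dt by reference;
# i.NextStep2 refers to the NextStep2 column of summary_dt
base_dt[summary_dt, NextStep2 := i.NextStep2, on = "row"]

print(base_dt)
```

Rows without a match in the summary table (row 2 here) end up as NA. In the melt approach that would be a customer who has already done every activity, since such a row has no value == 0 entries.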
If you want to paste all column names in the order they appear in the data for rows with no activity at all, you can add the following line to your code:
dt[rowSums(dt[, mget(lvls)]) == 0, NextStep2 := toString(names(dt)[2:5])]