I have five columns (each column name represents a candidate, say can1 can2 can3 can4 can5), each column has binary data (TRUE or FALSE) and
You could also use a simple mapply for this:
df$new_colmn <- mapply(
  function(x, y) df[x, y],
  seq_len(nrow(df)),            # row number
  as.character(df$CANDIDATES)   # corresponding candidates column (as.character guards against factors)
)
Essentially, for each row (the x argument) you return the value in the corresponding candidates column (the y argument).
Output:
> df
can1 can2 can3 can4 can5 CANDIDATES new_colmn
1 TRUE TRUE FALSE TRUE FALSE can2 TRUE
2 FALSE TRUE FALSE FALSE FALSE can4 FALSE
3 FALSE TRUE TRUE FALSE FALSE can2 TRUE
4 TRUE TRUE FALSE FALSE TRUE can1 TRUE
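To try the mapply approach end to end, here is a minimal sketch. The data frame construction is an assumption, rebuilt from the printed output, since the original post does not show how df was created.

```r
# Example data rebuilt from the printed output (assumed construction)
df <- data.frame(
  can1 = c(TRUE, FALSE, FALSE, TRUE),
  can2 = c(TRUE, TRUE, TRUE, TRUE),
  can3 = c(FALSE, FALSE, TRUE, FALSE),
  can4 = c(TRUE, FALSE, FALSE, FALSE),
  can5 = c(FALSE, FALSE, FALSE, TRUE),
  CANDIDATES = c("can2", "can4", "can2", "can1"),
  stringsAsFactors = FALSE
)

# For each row, look up the value in the column named by CANDIDATES
df$new_colmn <- mapply(
  function(x, y) df[x, y],
  seq_len(nrow(df)),            # row number
  as.character(df$CANDIDATES)   # column name for that row
)

df$new_colmn
```

The as.character() call is a precaution: if CANDIDATES were stored as a factor, indexing would likely use the factor's integer codes rather than the column names.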
We can use matrix indexing to create the new column:
df$new_column <- df[-ncol(df)][cbind(1:nrow(df), match(df$CANDIDATES, names(df)))]
Explanation
The function call match(df$CANDIDATES, names(df)) finds, for each value in the CANDIDATES column, the position of the column with that name. And 1:nrow(df) simply outputs a sequence from 1 to the last row number. Together we get:
cbind(1:nrow(df), match(df$CANDIDATES, names(df)))
[,1] [,2]
[1,] 1 2
[2,] 2 4
[3,] 3 2
[4,] 4 1
This is a series of row, column combinations. One strength of R is the ability to subset a data frame with a two-column matrix. The first column will represent the row index, and the second column indicates the column index.
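As a small illustration of two-column matrix indexing on its own, here is a sketch using a plain matrix. The names m, idx and picked are made up for this example and do not come from the question.

```r
# A 2 x 3 matrix, filled column by column
m <- matrix(1:6, nrow = 2)

# Each row of idx is one (row, column) pair to extract
idx <- cbind(c(1, 2), c(3, 1))

# Picks m[1, 3] and m[2, 1] in a single call
picked <- m[idx]
picked
```

One value comes back per row of idx, which is what makes this a good fit for "one lookup per data frame row".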
Matrix subsetting coerces the data frame to a matrix, which is fine as long as all columns are of the same type. That is why we first subset the data frame to only the logical columns with df[-ncol(df)], which drops the last column, CANDIDATES. That way no type conversion will occur.
Result:
df
can1 can2 can3 can4 can5 CANDIDATES new_column
1 TRUE TRUE FALSE TRUE FALSE can2 TRUE
2 FALSE TRUE FALSE FALSE FALSE can4 FALSE
3 FALSE TRUE TRUE FALSE FALSE can2 TRUE
4 TRUE TRUE FALSE FALSE TRUE can1 TRUE
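To see why the character column is dropped before indexing, here is a short sketch comparing both variants. The data frame is again an assumed reconstruction from the printed result, and full and sub are names chosen just for this illustration.

```r
# Example data rebuilt from the printed result (assumed construction)
df <- data.frame(
  can1 = c(TRUE, FALSE, FALSE, TRUE),
  can2 = c(TRUE, TRUE, TRUE, TRUE),
  can3 = c(FALSE, FALSE, TRUE, FALSE),
  can4 = c(TRUE, FALSE, FALSE, FALSE),
  can5 = c(FALSE, FALSE, FALSE, TRUE),
  CANDIDATES = c("can2", "can4", "can2", "can1"),
  stringsAsFactors = FALSE
)

# (row, column) pairs: row i paired with the column named in CANDIDATES[i]
idx <- cbind(seq_len(nrow(df)), match(df$CANDIDATES, names(df)))

# Indexing the full data frame: the mixed logical/character columns
# have to share one type once converted to a matrix
full <- df[idx]

# Dropping the character column first keeps everything logical
sub <- df[-ncol(df)][idx]

class(full)
class(sub)
```

With the full data frame the extracted values should come back as character strings rather than logicals, so the subset version is the one to assign to the new column.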