Question
I have survey data that lists each recipient's name and whether or not they selected a specific county in the state. The survey output contains "Off" for every county not selected and "On" for the selected county. The state has about 100 counties, so there end up being a lot of columns that all correspond to the same question. What I want to do is replace any cell containing "On" with the county name and any cell containing "Off" with a blank. From there I can unite the many columns into one without much difficulty. Below is a brief example data set:
name <- c("Recipient", "AB", "BC", "DF", "EF", "WE")
Q1 <- c("County1", "Off", "On", "On", "Off", "Off")
Q2 <- c("County2", "On", "Off", "Off", "Off", "Off")
Q3 <- c("County3", "Off", "Off", "Off", "On", "On")
dt <- data.frame(name, Q1, Q2, Q3)
> dt
       name      Q1      Q2      Q3
1 Recipient County1 County2 County3
2        AB     Off      On     Off
3        BC      On     Off     Off
4        DF      On     Off     Off
5        EF     Off     Off      On
6        WE     Off     Off      On
The desired output is:
       name      Q1      Q2      Q3
1 Recipient County1 County2 County3
2        AB         County2
3        BC County1
4        DF County1
5        EF                 County3
6        WE                 County3
I am not sure how to go about this, or how to specify that the first row should supply the values used to fill the cells.
Thanks for any help.
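Answer 1:
Create a logical matrix that marks the 'On' cells, assign each column's first-row value to those cells, and set every other cell in the Q columns to an empty string (this also blanks the first row of those columns):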
i1 <- dt[-1] == 'On'
dt[-1][i1] <- unlist(dt[1, -1])[col(dt[-1])][i1]
dt[-1][!i1] <- ""
dt
#       name      Q1      Q2      Q3
#1 Recipient
#2        AB         County2
#3        BC County1
#4        DF County1
#5        EF                 County3
#6        WE                 County3
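The approach above blanks the Q columns in the first row as well, while the desired output in the question keeps the county names there. Here is a minimal sketch of one way to leave row 1 untouched, using a plain loop over the columns. It assumes the county labels always sit in row 1 and that the columns are character rather than factor; `res`, `q_cols`, `labels` and `rows` are illustrative names that do not appear in the original post.

```r
# Rebuild the example data as character columns (not factors)
dt <- data.frame(
  name = c("Recipient", "AB", "BC", "DF", "EF", "WE"),
  Q1 = c("County1", "Off", "On", "On", "Off", "Off"),
  Q2 = c("County2", "On", "Off", "Off", "Off", "Off"),
  Q3 = c("County3", "Off", "Off", "Off", "On", "On"),
  stringsAsFactors = FALSE
)

q_cols <- names(dt)[-1]           # the county columns (assumed: all but the first)
labels <- unlist(dt[1, q_cols])   # county names stored in row 1, named by column
rows   <- seq_len(nrow(dt))[-1]   # every row except the label row

res <- dt
for (j in q_cols) {
  # "On" becomes the column's county name, anything else becomes ""
  res[rows, j] <- ifelse(dt[rows, j] == "On", labels[[j]], "")
}
res
```

Because only `rows` is overwritten, the label row survives unchanged.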
Or with dplyr:
library(dplyr)
dt %>%
  mutate_at(vars(starts_with('Q')), ~ case_when(. == 'On' ~ first(.), TRUE ~ ''))
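Since dplyr 1.0.0, `mutate_at()` is superseded by `across()`. The same idea can be sketched as follows, assuming dplyr 1.0.0 or later is installed; `out` is an illustrative name. As with the `mutate_at()` version, the first row of the Q columns is blanked too, because its values are not 'On'.

```r
library(dplyr)

# Example data as character columns (not factors)
dt <- data.frame(
  name = c("Recipient", "AB", "BC", "DF", "EF", "WE"),
  Q1 = c("County1", "Off", "On", "On", "Off", "Off"),
  Q2 = c("County2", "On", "Off", "Off", "Off", "Off"),
  Q3 = c("County3", "Off", "Off", "Off", "On", "On"),
  stringsAsFactors = FALSE
)

out <- dt %>%
  mutate(across(
    starts_with("Q"),
    # first(.x) is the county name held in row 1 of the current column
    ~ case_when(.x == "On" ~ first(.x), TRUE ~ "")
  ))
out
```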
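Data (built with stringsAsFactors = FALSE so the columns are character rather than factor; this is already the default from R 4.0.0 onward):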
dt <- data.frame(name, Q1, Q2, Q3, stringsAsFactors = FALSE)
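The question mentions uniting the county columns into one afterwards. If that single column is the real goal, the intermediate fill step could possibly be skipped. The sketch below is one hedged way to do this in base R, assuming the labels are in row 1 and the columns are character; `on`, `labels`, `county` and `result` are illustrative names, not part of the original answer.

```r
# Example data as character columns (not factors)
dt <- data.frame(
  name = c("Recipient", "AB", "BC", "DF", "EF", "WE"),
  Q1 = c("County1", "Off", "On", "On", "Off", "Off"),
  Q2 = c("County2", "On", "Off", "Off", "Off", "Off"),
  Q3 = c("County3", "Off", "Off", "Off", "On", "On"),
  stringsAsFactors = FALSE
)

labels <- unlist(dt[1, -1])     # county names from the first row
on     <- dt[-1, -1] == "On"    # logical matrix for the respondent rows only

# For each respondent, keep the labels of the columns marked "On";
# several selections in one row would be joined with ", "
county <- apply(on, 1, function(r) paste(labels[r], collapse = ", "))

result <- data.frame(
  name = dt$name[-1],
  county = unname(county),
  stringsAsFactors = FALSE
)
result
```

A respondent with no county marked 'On' would get an empty string in `county`.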
Source: https://stackoverflow.com/questions/59669075/replace-values-in-a-column-with-specific-row-value-from-same-column-using-loop