Converting multiple columns from wide to long using pivot_longer

我怕爱的太早我们不能终老 submitted on 2021-02-08 11:47:19

Question


I get an error message when I try to convert multiple columns from wide to long with pivot_longer.

I have code that converts from wide to long with gather, but it has to handle one group of columns at a time. I would like to use pivot_longer to reshape several groups of columns in one step instead.

This is some input data:

df <- structure(list(id = c("81", "83", "85", "88", "1", "2"), look_work = c("yes", 
"yes", "yes", "yes", "yes", "yes"), current_work = c("no", "yes", 
"no", "no", "no", "no"), before_work = c("no", "NULL", "yes", 
"yes", "yes", "yes"), keen_move = c("yes", "yes", "no", "no", 
"no", "no"), city_size = c("village", "more than 500k inhabitants", 
"more than 500k inhabitants", "village", "city up to 20k inhabitants", 
"100k - 199k inhabitants"), gender = c("male", "female", "female", 
"male", "female", "male"), age = c("18 - 24 years", "18 - 24 years", 
"more than 50 years", "18 - 24 years", "31 - 40 years", "more than 50 years"
), education = c("secondary", "vocational", "secondary", "secondary", 
"secondary", "secondary"), hf1 = c("", "", "", "1", "1", "1"), 
    hf2 = c("", "1", "1", "", "", ""), hf3 = c("", "", "", "", 
    "", ""), hf4 = c("", "", "", "", "", ""), hf5 = c("", "", 
    "", "", "", ""), hf6 = c("", "", "", "", "", ""), ac1 = c("", 
    "", "", "", "", "1"), ac2 = c("", "1", "1", "", "1", ""), 
    ac3 = c("", "", "", "", "1", ""), ac4 = c("", "", "", "", 
    "", ""), ac5 = c("", "", "", "", "", ""), ac6 = c("", "", 
    "", "", "", ""), cs1 = c("", "", "", "", "", ""), cs2 = c("", 
    "1", "1", "", "1", ""), cs3 = c("", "", "", "", "", "1"), 
    cs4 = c("", "", "", "1", "", ""), cs5 = c("", "", "", "", 
    "", ""), cs6 = c("", "", "", "", "", ""), cs7 = c("", "", 
    "", "", "", ""), cs8 = c("", "", "", "", "", ""), se1 = c("", 
    "", "1", "1", "", ""), se2 = c("", "", "", "", "1", ""), 
    se3 = c("", "1", "", "", "1", "1"), se4 = c("", "", "", "", 
    "", ""), se5 = c("", "", "", "", "", ""), se6 = c("", "", 
    "", "", "", ""), se7 = c("", "", "", "", "", ""), se8 = c("", 
    "", "", "1", "", "")), row.names = c(NA, 6L), class = "data.frame")

The code using gather is:

library(dplyr)
library(tidyr)

df1 <- df %>%
  gather(key = "hf_com", value = "hf_com_freq", hf1:hf6) %>%
  gather(key = "ac_com", value = "ac_com_freq", ac1:ac6) %>%
  # keep only rows where the hf and ac item numbers match
  filter(substring(hf_com, 3) == substring(ac_com, 3))

df1 <- df1 %>%
  gather(key = "curr_sal", value = "curr_sal_freq", cs1:cs8) %>%
  gather(key = "exp_sal", value = "exp_sal_freq", se1:se8) %>%
  filter(substring(curr_sal, 3) == substring(exp_sal, 3))

The code using pivot_longer is:

df_longer <- df %>% 
  pivot_longer(
    cols = starts_with("hf"), 
    names_to = "hf_com", 
    values_to = "hf_freq",
    names_prefix = "hf",
    na.rm = TRUE)

The expected result, which I get with gather, is:

structure(list(id = c("81", "83", "85", "88", "1", "2"), look_work = c("yes", 
"yes", "yes", "yes", "yes", "yes"), current_work = c("no", "yes", 
"no", "no", "no", "no"), before_work = c("no", "NULL", "yes", 
"yes", "yes", "yes"), keen_move = c("yes", "yes", "no", "no", 
"no", "no"), city_size = c("village", "more than 500k inhabitants", 
"more than 500k inhabitants", "village", "city up to 20k inhabitants", 
"100k - 199k inhabitants"), gender = c("male", "female", "female", 
"male", "female", "male"), age = c("18 - 24 years", "18 - 24 years", 
"more than 50 years", "18 - 24 years", "31 - 40 years", "more than 50 years"
), education = c("secondary", "vocational", "secondary", "secondary", 
"secondary", "secondary"), hf_com = c("hf1", "hf1", "hf1", "hf1", 
"hf1", "hf1"), hf_com_freq = c("", "", "", "1", "1", "1"), ac_com = c("ac1", 
"ac1", "ac1", "ac1", "ac1", "ac1"), ac_com_freq = c("", "", "", 
"", "", "1"), curr_sal = c("cs1", "cs1", "cs1", "cs1", "cs1", 
"cs1"), curr_sal_freq = c("", "", "", "", "", ""), exp_sal = c("se1", 
"se1", "se1", "se1", "se1", "se1"), exp_sal_freq = c("", "", 
"1", "1", "", "")), row.names = c(NA, 6L), class = "data.frame")

With pivot_longer, I get the following error message:

Error in pivot_longer(., cols = starts_with("hf"), names_to = "hf_com",  : 
  unused argument (na.rm = TRUE)

Also, if there is no solution with pivot_longer, then a solution with data.table would be appreciated.
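
On the data.table side, melt() can take several column groups at once through patterns(). The following is only a minimal sketch, assuming a small toy table (dt_small is a hypothetical stand-in, not the full data above) and covering just the hf and ac groups, which have the same number of columns. The output column names are borrowed from the gather code; the "item" column is an illustrative name.

```r
library(data.table)

# Toy data modelled on the question: "" marks an unanswered item.
dt_small <- data.table(
  id  = c("81", "88"),
  hf1 = c("", "1"),
  hf2 = c("1", ""),
  ac1 = c("", ""),
  ac2 = c("1", "")
)

# Each regular expression in patterns() defines one group of columns;
# each group is stacked into its own value column, and "item" records
# the position of the column within its group (1, 2, ...).
dt_long <- melt(
  dt_small,
  id.vars = "id",
  measure.vars = patterns("^hf", "^ac"),
  variable.name = "item",
  value.name = c("hf_com_freq", "ac_com_freq")
)

print(dt_long)
```

This gives one row per id and item number with the hf and ac values side by side, which is not the same shape as the crossed gather result shown above.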


Answer 1:


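I have solved the problem. pivot_longer has no na.rm argument (that is gather's name for it); the equivalent argument in pivot_longer is values_drop_na.

The call needs to be changed from: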

df_longer <- df %>% 
  pivot_longer(
    cols = starts_with("hf"), 
    names_to = "hf_com", 
    values_to = "hf_freq",
    names_prefix = "hf",
    na.rm = TRUE)

to:

df_longer <- df %>% 
  pivot_longer(
    cols = starts_with("hf"), 
    names_to = "hf_com", 
    values_to = "hf_freq",
    names_prefix = "hf",
    values_drop_na = TRUE)
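
Two things may be worth adding, both as a hedged sketch rather than a definitive solution. First, values_drop_na only drops NA, and the sample data stores missing answers as "", so those would need converting first (for example with dplyr::na_if). Second, several prefix groups can be pivoted in one call with the special ".value" entry in names_to. The toy data frame df_small and the column name "item" below are assumptions for illustration, not part of the original data.

```r
library(dplyr)
library(tidyr)

# Toy data modelled on the question: "" marks an unanswered item.
df_small <- data.frame(
  id  = c("81", "88"),
  hf1 = c("", "1"),
  hf2 = c("1", ""),
  ac1 = c("", ""),
  ac2 = c("1", ""),
  stringsAsFactors = FALSE
)

df_long <- df_small %>%
  # values_drop_na only removes NA, so turn "" into NA first
  mutate(across(-id, ~ na_if(.x, ""))) %>%
  pivot_longer(
    cols = -id,
    # ".value" turns the prefix (hf, ac) into output column names;
    # the trailing digits go into the new "item" column
    names_to = c(".value", "item"),
    names_pattern = "([a-z]+)(\\d+)",
    # drops a row only when every value column in it is NA
    values_drop_na = TRUE
  )

print(df_long)
```

The result has one row per id and item number, with hf and ac as separate columns. That is a different shape from the gather pipeline in the question, which crosses the groups and then filters on matching suffixes, so it may or may not fit the intended analysis.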


Source: https://stackoverflow.com/questions/58115721/converting-multiple-columns-from-wide-to-long-using-pivot-longer
