purrr pmap to read max column name by column name number

北战南征 submitted on 2019-12-13 02:49:35

Question


I have this dataset:

library(dplyr)
Problem <- tibble(name = c("Angela", "Claire", "Justin", "Bob", "Gil"),
                   status_1 = c("Registered", "No Action", "Completed", "Denied", "No Action"),
                   status_2 = c("Withdrawn", "No Action", "Registered", "No Action", "Exempt"),
                   status_3 = c("No Action", "Registered", "Withdrawn", "No Action", "No Action"))

I want to make a column that has everyone's current status. If the person has ever completed the course, they are completed. If they were ever exempt, they are excluded. If they are anything else other than registered (or completed or exempt), they are "Not Taken." What's hard is that I want my code to say they were registered ONLY if their last action was being registered. So, it should look like this:

library(dplyr)
solution <- tibble(name = c("Angela", "Claire", "Justin", "Bob", "Gil"),
                   status_1 = c("Registered", "No Action", "Completed", "Denied", "No Action"),
                   status_2 = c("Withdrawn", "No Action", "Registered", "No Action", "Exempt"),
                   status_3 = c("No Action", "Registered", "Withdrawn", "No Action", "No Action"),
                   current = c("Not Taken", "Registered", "Completed", "Not Taken", "Exempt"))

I have this code, and the part that won't work is the which.max() line:

library(dplyr)
library(purrr)
library(stringr)
Problem %>% 
  mutate(
    current =
      pmap_chr(select(., contains("status")), ~
        case_when(
          any(str_detect(c(...), "(?i)Completed")) ~ "Completed",
          any(str_detect(c(...), "(?i)Exempt")) | any(str_detect(c(...), "(?i)Incomplete")) ~ "Exclude",
          which.max(parse_number(colnames(.)) == "Registered") ~ "Registered",
          any(str_detect(c(...), "(?i)No Show")) | any(str_detect(c(...), "(?i)Denied")) | any(str_detect(c(...), "(?i)Cancelled")) | any(str_detect(c(...), "(?i)Waitlist Expired")) || any(str_detect(c(...), "(?i)Withdrawn")) ~ "Not Taken",
          TRUE ~ "NA"
        )
      )
  )

I've tried every way for R to read the numbers of status, but I can't figure it out. It's important that I keep the rest of the code, especially the str_detect() portion because, while my sample data is clean, the real dataset has many rows of status and many entries that look like "COMPLETED" and "completed."

Why can't I use parse_number() on the column names inside pmap to find the highest-numbered status column that says "Registered"?

Thank you!


Answer 1:


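Keeping everything else as it is and dealing only with your which.max() issue, we can check whether the position of "Registered" among the row's values equals the number of status columns, i.e. whether "Registered" sits in the last column: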

library(tidyverse)

Problem %>% 
    mutate(
       current =
         pmap_chr(select(., contains("status")), ~
             case_when(
               any(str_detect(c(...), "(?i)Completed")) ~ "Completed",
               any(str_detect(c(...), "(?i)Exempt")) | any(str_detect(c(...), "(?i)Incomplete")) ~ "Exclude",
               which.max(c(...) == "Registered") == length(c(...)) ~ "Registered",
               any(str_detect(c(...), "(?i)No Show")) | any(str_detect(c(...), "(?i)Denied")) | any(str_detect(c(...), "(?i)Cancelled")) | any(str_detect(c(...), "(?i)Waitlist Expired")) || any(str_detect(c(...), "(?i)Withdrawn")) ~ "Not Taken",
               TRUE ~ "NA"
             )
            )
       )

# name   status_1   status_2   status_3   current   
#  <chr>  <chr>      <chr>      <chr>      <chr>     
#1 Angela Registered Withdrawn  No Action  Not Taken 
#2 Claire No Action  No Action  Registered Registered
#3 Justin Completed  Registered Withdrawn  Completed 
#4 Bob    Denied     No Action  No Action  Not Taken 
#5 Gil    No Action  Exempt     No Action  Exclude   
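
One caveat with the check above: which.max() returns the position of the first TRUE and the result is compared with the last column, so a row such as Registered / No Action / No Action is not treated as registered, and the == comparison is case-sensitive. If "last action" is meant to be the last status that is not "No Action", a minimal sketch could look like the following. The helper name last_action_is_registered is hypothetical, the combined regex patterns are an assumed simplification of the original conditions, and the data is assumed to contain no missing values.

```r
library(dplyr)
library(purrr)
library(stringr)

Problem <- tibble(
  name     = c("Angela", "Claire", "Justin", "Bob", "Gil"),
  status_1 = c("Registered", "No Action", "Completed", "Denied", "No Action"),
  status_2 = c("Withdrawn", "No Action", "Registered", "No Action", "Exempt"),
  status_3 = c("No Action", "Registered", "Withdrawn", "No Action", "No Action")
)

# Hypothetical helper (not part of the original answer): TRUE when the last
# status that is not "No Action" is "Registered", ignoring case.
# Assumes `statuses` is a character vector without NA values.
last_action_is_registered <- function(statuses) {
  acted <- statuses[!str_detect(statuses, "(?i)^No Action$")]
  length(acted) > 0 && str_detect(acted[length(acted)], "(?i)^Registered$")
}

result <- Problem %>%
  mutate(
    current = pmap_chr(select(., contains("status")), function(...) {
      statuses <- c(...)  # one row's status values, in column order
      case_when(
        any(str_detect(statuses, "(?i)Completed")) ~ "Completed",
        any(str_detect(statuses, "(?i)Exempt|Incomplete")) ~ "Exclude",
        last_action_is_registered(statuses) ~ "Registered",
        any(str_detect(statuses, "(?i)No Show|Denied|Cancelled|Waitlist Expired|Withdrawn")) ~ "Not Taken",
        TRUE ~ "NA"
      )
    })
  )

result
```

On the sample data this gives the same current column as the answer above. The two approaches differ only on rows where "Registered" is followed solely by "No Action" entries, or where "Registered" appears more than once, so whether the helper is preferable depends on how "last action" should be read in the real dataset.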


Source: https://stackoverflow.com/questions/56632993/purrr-pmap-to-read-max-column-name-by-column-name-number
