plyr

Is there a way to recode multiple variables at once?

若如初见. Submitted on 2020-12-12 17:45:06
Question: I have a dataset of students' report card marks that range from D- to A+. I'd like to recode them onto a scale of 1-12 (i.e. D- = 1, D = 2 ... A = 11, A+ = 12). Right now I'm using the revalue function in plyr. I have several columns that I'd like to recode. Is there a shorter way to do this than running revalue on each column? Some data: student <- c("StudentA","StudentB","StudentC","StudentD","StudentE","StudentF","StudentG","StudentH","StudentI","StudentJ") read <- c("A", "A+", "B", "B-",
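One way to avoid repeating the recode per column is a named lookup vector applied to all mark columns at once. The sketch below is base R only and is not the original answer; the `grades` data frame, its `write` column, and the intermediate grades between D and A are assumptions, since the question's own data is cut off.

```r
# Hypothetical data: the question's vectors are truncated in the source.
grades <- data.frame(
  student = c("StudentA", "StudentB", "StudentC"),
  read    = c("A", "A+", "B"),
  write   = c("B-", "D-", "A"),
  stringsAsFactors = FALSE
)

# Lookup from letter grade to 1..12, with D- lowest and A+ highest.
# The ordering between D and A is assumed to be the usual one.
letters_ordered <- c("D-", "D", "D+", "C-", "C", "C+",
                     "B-", "B", "B+", "A-", "A", "A+")
grade_scale <- setNames(seq_along(letters_ordered), letters_ordered)

# Recode every mark column in one pass; as.character() guards against factors.
mark_cols <- c("read", "write")
grades[mark_cols] <- lapply(
  grades[mark_cols],
  function(col) unname(grade_scale[as.character(col)])
)
```

Any grade missing from the lookup becomes `NA`, which makes unexpected values easy to spot.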

Is there a way to recode multiple variables at once?

无人久伴 Submitted on 2020-12-12 17:42:59
Question: I have a dataset of students' report card marks that range from D- to A+. I'd like to recode them onto a scale of 1-12 (i.e. D- = 1, D = 2 ... A = 11, A+ = 12). Right now I'm using the revalue function in plyr. I have several columns that I'd like to recode. Is there a shorter way to do this than running revalue on each column? Some data: student <- c("StudentA","StudentB","StudentC","StudentD","StudentE","StudentF","StudentG","StudentH","StudentI","StudentJ") read <- c("A", "A+", "B", "B-",

With min() in R return NA instead of Inf

你。 Submitted on 2020-08-27 20:56:05
Question: Please consider the following: I recently 'discovered' the awesome plyr and dplyr packages and use them to analyse patient data that is available to me in a data frame. Such a data frame could look like this: df <- data.frame(id = c(1, 1, 1, 2, 2), # patient ID diag = c(rep("dia1", 3), rep("dia2", 2)), # diagnosis age = c(7.8, NA, 7.9, NA, NA)) # patient age I would like to summarise the minimum patient age of all patients with a median and mean. I did the following: min.age <- df %>%
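The title refers to the fact that `min(x, na.rm = TRUE)` returns `Inf` (with a warning) when every value in `x` is `NA`. A minimal base R sketch of one workaround follows; `safe_min` is a hypothetical helper name, not something from the question or from plyr/dplyr.

```r
# Data frame as given in the question.
df <- data.frame(
  id   = c(1, 1, 1, 2, 2),                    # patient ID
  diag = c(rep("dia1", 3), rep("dia2", 2)),   # diagnosis
  age  = c(7.8, NA, 7.9, NA, NA)              # patient age
)

# Hypothetical helper: minimum of the non-missing values,
# or NA when nothing is left (instead of Inf).
safe_min <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else min(x)
}

# Minimum age per patient, as a named numeric vector.
min_age <- sapply(split(df$age, df$id), safe_min)

# Patients with no recorded age are then dropped by na.rm = TRUE.
mean_min_age   <- mean(min_age, na.rm = TRUE)
median_min_age <- median(min_age, na.rm = TRUE)
```

The same helper could be used inside a grouped `summarise()` call in place of `min(age, na.rm = TRUE)`.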

Is there an alternative to “revalue” function from plyr when using dplyr?

跟風遠走 Submitted on 2020-08-22 08:24:07
Question: I'm a fan of the revalue function in plyr for substituting strings. It's simple and easy to remember. However, I've migrated new code to dplyr, which doesn't appear to have a revalue function. What is the accepted idiom in dplyr for doing things previously done with revalue? Answer 1: There is a recode function available starting with dplyr version 0.5.0 which looks very similar to revalue from plyr. Example built from the recode documentation Examples section: set.seed(16) x = sample(c("a"
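Since the answer's own example is cut off, here is a small separate sketch of `dplyr::recode()` on a character vector. The input vector and the replacement strings are made up for illustration and are not the values from the original answer.

```r
library(dplyr)

# Made-up input; not the sampled vector from the truncated answer.
x <- c("a", "b", "c", "a")

# recode() takes old = "new" pairs as named arguments.
# For character input, values with no match ("c" here) are left unchanged.
recoded <- recode(x, a = "Apple", b = "Banana")
```

Note the difference in call shape: `plyr::revalue()` takes a single named vector, as in `revalue(x, c(a = "Apple", b = "Banana"))`, while `recode()` takes the pairs as separate arguments.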

Grouping functions (tapply, by, aggregate) and the *apply family

自闭症网瘾萝莉.ら Submitted on 2020-08-11 20:13:47
Question: Whenever I want to do something "map"py in R, I usually try to use a function in the apply family. However, I've never quite understood the differences between them: how { sapply , lapply , etc.} apply the function to the input/grouped input, what the output will look like, or even what the input can be. So I often just go through them all until I get what I want. Can someone explain how to use which one when? My current (probably
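As a rough illustration of the differences the question asks about, the sketch below runs three of the functions on tiny inputs. It is not the original answer, and the data are invented for the example.

```r
x <- list(a = 1:3, b = 4:6)

# lapply: always returns a list, one element per input element.
as_list <- lapply(x, mean)

# sapply: like lapply, but simplifies to a vector or matrix when it can.
as_vector <- sapply(x, mean)

# tapply: applies a function to subsets of a vector defined by a grouping.
values <- c(1, 2, 3, 4)
groups <- c("g1", "g1", "g2", "g2")
by_group <- tapply(values, groups, sum)
```

Here `as_list` is a list holding 2 and 5, `as_vector` is the same result as a named numeric vector, and `by_group` holds the per-group sums 3 and 7.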

How do you aggregate rows to a factor variable with three levels?

老子叫甜甜 Submitted on 2020-07-22 21:33:11
Question: I have a dataset where some participants have multiple rows, and I need to aggregate the data so that every participant has only one row. The dataset contains different variable types (e.g., factors, date, age etc.). I have written code that works and looks like this: example4 <- SMARTdata_50j_diagc_2016 %>% group_by( Patient_Id ) %>% summarise( Groep = first( Groep ), Ziekenhuis_Nr = first( Ziekenhuis_Nr ), Ziekenhuistype = first( Ziekenhuistype ), aantalDBC = n(), aantalVervolg = sum( as

ddply multiple quantiles by group

人盡茶涼 Submitted on 2020-06-07 12:11:14
Question: How can I do this calculation: library(plyr) quantile(baseball$ab) 0% 25% 50% 75% 100% 0 25 131 435 705 by groups, say by "team"? I want a data.frame with row names "team" and column names "0% 25% 50% 75% 100%", i.e. one quantile call per group. Doing ddply(baseball,"team",quantile(ab)) is not the correct solution. My problem is that the OUTPUT of each grouped operation is a vector of length 5 here. In other words, what's a neat solution to this (never mind the header): m=data.frame() for (i
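The attempted call fails because `ddply()` expects a function as its third argument, not the result of calling `quantile(ab)`. A hedged sketch of one possible fix is to pass an anonymous function that receives each group's data frame; this is an illustration rather than a quoted answer.

```r
library(plyr)  # provides ddply() and the baseball dataset

# Each group's rows arrive as a data frame `d`; quantile() returns a named
# vector of length 5, which ddply() turns into one row per team.
team_quantiles <- ddply(baseball, "team", function(d) quantile(d$ab))
```

The result should have a `team` column followed by the columns `0%`, `25%`, `50%`, `75%` and `100%`. Note that `team` ends up as a regular column rather than as row names.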