dplyr

R calculating grouped frequency table with percentage [duplicate]

天大地大妈咪最大 posted on 2021-02-11 12:29:29
Question: This question already has answers here: Calculate Percentage for each time series observations per Group in R (2 answers). Closed 4 years ago. Given the following data.frame, I would like to calculate the occurrences of each value of VAR and the percentage of these occurrences by the grouping variable GROUP:
GROUP <- c("G1","G2","G1","G2","G3","G3","G1")
VAR <- c("A","B","B","A","B","B","A")
d <- data.frame(GROUP, VAR)
With table(), I get a nice frequency table, counting the occurrences of all
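One possible way to get counts and per-group percentages for the data above is sketched here with dplyr. It assumes "percentage by GROUP" means each count divided by the total of its GROUP; the column names n and pct are illustrative.

```r
library(dplyr)

GROUP <- c("G1", "G2", "G1", "G2", "G3", "G3", "G1")
VAR   <- c("A", "B", "B", "A", "B", "B", "A")
d <- data.frame(GROUP, VAR)

# Count each GROUP/VAR combination, then express every count as a
# percentage of its GROUP total (assumed interpretation of the question).
freq <- d %>%
  count(GROUP, VAR, name = "n") %>%
  group_by(GROUP) %>%
  mutate(pct = 100 * n / sum(n)) %>%
  ungroup()

print(freq)
```

Combinations that never occur (such as G3/A) do not appear in the result; tidyr::complete() could add them as zero rows if needed.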

R user-defined/dynamic summary function within dplyr::summarise

|▌冷眼眸甩不掉的悲伤 posted on 2021-02-11 12:14:29
Question: Somewhat hard to define this question without sounding like lots of similar questions! I have a function for which I want one of the parameters to be a function name that will be passed to dplyr::summarise, e.g. "mean" or "sum":
data(mtcars)
f <- function(x = mtcars, groupcol = "cyl", zCol = "disp", zFun = "mean") {
  zColquo = quo_name(zCol)
  cellSummaries <- x %>%
    group_by(gear, !!sym(groupcol)) %>% # 1 preset grouper, 1 user-defined
    summarise(Count = n(), # 1 preset summary, 1 user defined
      !
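A minimal sketch of one way to pass a summary function by name follows. It is not the asker's full function (the excerpt is cut off): summarise_by() and the output column Z are hypothetical names, and match.fun() is one of several ways to resolve a string such as "mean" to a function.

```r
library(dplyr)

# Hypothetical helper; argument names mirror the excerpt above.
summarise_by <- function(x = mtcars, groupcol = "cyl",
                         zCol = "disp", zFun = "mean") {
  fun <- match.fun(zFun)  # resolve "mean", "sum", ... to the function itself
  x %>%
    group_by(gear, !!rlang::sym(groupcol)) %>%  # 1 preset grouper, 1 user-defined
    summarise(
      Count = n(),             # preset summary
      Z = fun(.data[[zCol]]),  # user-defined summary of the chosen column
      .groups = "drop"
    )
}

res <- summarise_by(mtcars, zFun = "sum")
print(res)
```

Because match.fun() also accepts a function object, summarise_by(zFun = median) should work as well as summarise_by(zFun = "median").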

Remove prefix letter from column variables

女生的网名这么多〃 posted on 2021-02-11 07:53:31
Question: All of my column names start with 'm', for example mIncome and mAge. I want to remove the prefix. So far, I have tried the following:
df %>% rename_all(~stringr::str_replace_all(.,"m",""))
This removes every 'm' in the column names. I just need it removed from the start. Any suggestions?
Answer 1: We need to specify the location. The ^ matches the start of the string (or here the column name). So, if we use ^m, it will only match 'm' at the beginning or start of the
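The anchored pattern from the answer can be sketched as follows. The sample data frame is made up for illustration, and rename_with() with base sub() is used here instead of the rename_all()/stringr call from the question.

```r
library(dplyr)

# Hypothetical sample data; mTimestamp contains a second 'm' that must survive.
df <- data.frame(mIncome = c(100, 200), mAge = c(30, 40), mTimestamp = c(1, 2))

# "^m" only matches an 'm' at the very start of each column name.
renamed <- df %>% rename_with(~ sub("^m", "", .x))

print(names(renamed))
```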

Rolling window slider::slide() with grouped data

风格不统一 posted on 2021-02-11 07:44:12
Question: In the following example I try to compute the first coefficient from a linear model for time t = 1 until t. It's an expanding rolling window. It works well with ungrouped data, but when the data is grouped by case, I get the error "Error: Column coef1 must be length 10 (the group size) or one, not 30". How can I handle grouped data?
library(dplyr)
library(slider)
get_coef1 <- function(data) {
  coef1 <- lm(data = data, r1 ~ r2 + r3) %>% coef() %>% .["r2"] %>% unname()
  return(coef1)
}
data <- tibble(t = rep
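One way to keep the sliding window inside each group is sketched below. The simulated data is an assumption (the original tibble is cut off), and group_modify() is only one option; the idea is that slide_dbl() must receive the rows of the current group rather than the whole data frame.

```r
library(dplyr)
library(slider)

# Same idea as get_coef1 in the question, written without the pipe.
get_coef1 <- function(data) {
  unname(coef(lm(r1 ~ r2 + r3, data = data))["r2"])
}

# Hypothetical data: 3 cases with 10 time points each.
set.seed(1)
data <- tibble(
  t    = rep(1:10, times = 3),
  case = rep(c("a", "b", "c"), each = 10),
  r1   = rnorm(30),
  r2   = rnorm(30),
  r3   = rnorm(30)
)

# group_modify() hands each group's rows to the lambda as .x, so the
# expanding window (.before = Inf) never reaches into another case.
res <- data %>%
  group_by(case) %>%
  group_modify(~ mutate(.x, coef1 = slide_dbl(.x, get_coef1, .before = Inf))) %>%
  ungroup()

print(res)
```

The earliest windows in each group contain too few rows to estimate every coefficient, so lm() reports NA for those rows.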

How to correct list of misspellings at once in R

旧时模样 posted on 2021-02-11 05:29:32
Question: I have a whole list of misspellings and I would like to change them all in one go. Is there an easy way to do so without writing a massive ifelse statement?
vegas <- c("North Las Vegas","N Las Vegas", "LAS VEGAS", "Las vegas","N. Las Vegas", "las vegas", "Las Vegas", "Las Vegas ", "South Las Vegas", "La Vegas", "Las Vegas, NV", "LasVegas", "110 Las Vegas", "C Las Vegas", "Henderson and Las vegas", "las Vegas", "Las Vegas & Henderson", "Las Vegas East", "Las Vegas Nevada", "Las Vegas NV", "Las
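A lookup table is one way to avoid a long ifelse chain. The sketch below uses a handful of the values from the question; the fixes vector is hypothetical, and deciding which spellings should map to "Las Vegas" remains the asker's call.

```r
library(dplyr)

# A small subset of the values from the question.
vegas <- c("LAS VEGAS", "Las vegas", "Las Vegas ", "La Vegas",
           "LasVegas", "North Las Vegas")

# Hypothetical lookup: names are normalised misspellings, values are corrections.
fixes <- c("las vegas" = "Las Vegas",
           "la vegas"  = "Las Vegas",
           "lasvegas"  = "Las Vegas")

# Normalise case and surrounding whitespace before looking each value up.
key <- tolower(trimws(vegas))

# Entries without a match come back as NA, so fall back to the original value.
cleaned <- coalesce(unname(fixes[key]), vegas)

print(cleaned)
```

Normalising first keeps the table short, because "LAS VEGAS", "Las vegas" and "Las Vegas " all collapse to the same key.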

How to reshape a wider data.frame to longer data.frame in R? [duplicate]

半世苍凉 posted on 2021-02-10 22:15:51
Question: This question already has answers here: Transpose and Merge columns in R [duplicate] (3 answers); Reshaping data.frame from wide to long format (9 answers). Closed 7 months ago. I was playing with pivot_longer and pivot_wider but am probably missing something. I have a data.frame like D_Wider and would like to convert it to something like D_Longer. Is there a way forward?
library(tidyverse)
D_Wider <- data.frame(A = 15, S = 10, D = 25, Z = 16)
Desired output:
D_Longer <- data.frame(Stations = c("A",
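A minimal sketch with tidyr::pivot_longer() follows. The desired output is cut off after the Stations column, so the name of the value column (Value) is an assumption.

```r
library(tidyr)

D_Wider <- data.frame(A = 15, S = 10, D = 25, Z = 16)

# Turn every column into a (Stations, Value) row; "Value" is an assumed name.
D_Longer <- pivot_longer(
  D_Wider,
  cols      = everything(),
  names_to  = "Stations",
  values_to = "Value"
)

print(D_Longer)
```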

How to combine the across() function with mutate() and case_when() to mutate values in multiple columns according to a condition?

て烟熏妆下的殇ゞ posted on 2021-02-10 20:51:26
Question: I have a demographic data set, which includes the ages of the people in a household. The data is collected via a survey, and participants are allowed to refuse to provide their age. The result is a data set with one household per row (each with a household ID code) and various household characteristics, such as age, in the columns. Refused responses are coded as "R", and you can re-create a sample using the code below:
df <- list(Household_ID = c("1A", "1B", "1C", "1D", "1E"), AGE1 = c("25", "47", "39",
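The combination asked about in the title can be sketched as below. The sample data is made up (the original is cut off), and recoding "R" to NA is only an assumed goal; the same pattern applies to any other condition and replacement.

```r
library(dplyr)

# Hypothetical sample in the shape described above.
df <- tibble(
  Household_ID = c("1A", "1B", "1C"),
  AGE1 = c("25", "R", "39"),
  AGE2 = c("R", "47", "12")
)

# Apply the same case_when() rule to every column whose name starts with "AGE".
cleaned <- df %>%
  mutate(across(
    starts_with("AGE"),
    ~ case_when(
      .x == "R" ~ NA_character_,  # refused response (assumed to become NA)
      TRUE      ~ .x              # keep every other value unchanged
    )
  ))

print(cleaned)
```

Household_ID is left alone because starts_with("AGE") does not select it.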
