dplyr | 易学教程

R calculating grouped frequency table with percentage [duplicate]

Question: This question already has answers here: Calculate Percentage for each time series observations per Group in R (2 answers). Closed 4 years ago.

Given the following data.frame, I would like to calculate the occurrence of each value of VAR and the percentage of these occurrences within the grouping variable GROUP:

GROUP <- c("G1","G2","G1","G2","G3","G3","G1")
VAR <- c("A","B","B","A","B","B","A")
d <- data.frame(GROUP, VAR)

With table(), I get a nice frequency table, counting the occurrences of all

R user-defined/dynamic summary function within dplyr::summarise

Question: It is somewhat hard to define this question without sounding like lots of similar questions! I have a function in which I want one of the parameters to be a function name that will be passed to dplyr::summarise, e.g. "mean" or "sum":

data(mtcars)
f <- function(x = mtcars, groupcol = "cyl", zCol = "disp", zFun = "mean") {
  zColquo = quo_name(zCol)
  cellSummaries <- x %>%
    group_by(gear, !!sym(groupcol)) %>%  # 1 preset grouper, 1 user-defined
    summarise(Count = n(),  # 1 preset summary, 1 user defined
              !

Remove prefix letter from column variables

Question: All of my column names start with 'm', for example mIncome and mAge. I want to remove that prefix. So far, I have tried the following: df %>% rename_all(~stringr::str_replace_all(.,"m","")) This removes every 'm' in the column names, wherever it occurs. I just need it removed from the start. Any suggestions?

Answer 1: We need to specify the location. The ^ matches the start of the string (here, the column name). So, if we use ^m, it will only match 'm' at the beginning or start of the
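As a minimal sketch of the anchored pattern described in the answer, the base R snippet below strips a leading "m" from column names and leaves any other "m" alone. The data frame and its column names are hypothetical, invented only for illustration; the original answer's own code is not shown in the excerpt.

```r
# Hypothetical data frame: column names are made up for illustration.
df <- data.frame(
  mIncome = c(100, 200),
  mAge    = c(30, 40),
  mmHg    = c(120, 110),
  Name    = c("a", "b")
)

# "^m" can only match an "m" at the very start of a name, so the "m"
# inside "Name" and the second "m" of "mmHg" are left untouched.
names(df) <- sub("^m", "", names(df))

print(names(df))
```

Inside a dplyr pipeline the same regular expression should work with the original approach, for example by replacing "m" with "^m" in the str_replace_all() call.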

Rolling window slider::slide() with grouped data

Question: In the following example I try to compute the first coefficient from a linear model for time t = 1 until t. It's an expanding rolling window. It works well with ungrouped data, but when the data are grouped by case, I get the error "Error: Column coef1 must be length 10 (the group size) or one, not 30". How can I handle grouped data?

library(dplyr)
library(slider)
get_coef1 <- function(data) {
  coef1 <- lm(data = data, r1 ~ r2 + r3) %>% coef() %>% .["r2"] %>% unname()
  return(coef1)
}
data <- tibble(t = rep

How to correct list of misspellings at once in R

Question: I have a whole list of misspellings and I would like to change them all in one go. Is there an easy way to do so without writing a massive ifelse statement? vegas <- c("North Las Vegas","N Las Vegas", "LAS VEGAS", "Las vegas","N. Las Vegas", "las vegas", "Las Vegas", "Las Vegas ", "South Las Vegas", "La Vegas", "Las Vegas, NV", "LasVegas", "110 Las Vegas", "C Las Vegas", "Henderson and Las vegas", "las Vegas", "Las Vegas & Henderson", "Las Vegas East", "Las Vegas Nevada", "Las Vegas NV", "Las

How to reshape a wider data.frame to longer data.frame in R? [duplicate]

Question: This question already has answers here: Transpose and Merge columns in R [duplicate] (3 answers); Reshaping data.frame from wide to long format (9 answers). Closed 7 months ago.

I was playing with pivot_longer and pivot_wider but am probably missing something. I have a data.frame like D_Wider and would like to convert it to something like D_Longer. Is there any way forward?

library(tidyverse)
D_Wider <- data.frame(A = 15, S = 10, D = 25, Z = 16)

Desired output:

D_Longer <- data.frame(Stations = c("A",

How to combine the across() function with mutate() and case_when() to mutate values in multiple columns according to a condition?

Question: I have a demographic data set, which includes the ages of the people in each household. The data were collected via a survey, and participants were allowed to refuse to provide their age. The result is a data set with one household per row (each with a household ID code) and various household characteristics, such as age, in the columns. Refused responses are coded as "R", and you can re-create a sample using the code below: df <- list(Household_ID = c("1A", "1B", "1C", "1D", "1E"), AGE1 = c("25", "47", "39",
