dplyr | 易学教程

Merge 2 dataframes using conditions on “hour” and “min” of df1 in datetimes of df2

Question: I have a data frame df.sample like this:

    id <- c("A","A","A","A","A","A","A","A","A","A","A")
    date <- c("2018-11-12","2018-11-12","2018-11-12","2018-11-12","2018-11-12",
              "2018-11-12","2018-11-12","2018-11-14","2018-11-14","2018-11-14",
              "2018-11-12")
    hour <- c(8,8,9,9,13,13,16,6,7,19,7)
    min <- c(47,59,6,18,22,36,12,32,12,21,47)
    value <- c(70,70,86,86,86,74,81,77,79,83,91)
    df.sample <- data.frame(id,date,hour,min,value,stringsAsFactors = F)
    df.sample$date <- as.Date(df.sample$date,format="%Y-%m-

Using switch statement within dplyr's mutate

Question: I would like to use a switch statement within dplyr's mutate. I have a simple function that performs some operations and assigns alternative values via switch, for example:

    convert_am <- function(x) {
      x <- as.character(x)
      switch(x, "0" = FALSE, "1" = TRUE, NA)
    }

This works as desired when applied to scalars:

    > convert_am(1)
    [1] TRUE
    > convert_am(2)
    [1] NA
    > convert_am(0)
    [1] FALSE

I would like to arrive at equivalent results via a mutate call:

    mtcars %>% mutate(am = convert_am(am))

This
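The excerpt is cut off before any answer, so what follows is only a sketch of two common ways to get the scalar behaviour across a whole column. It assumes dplyr is installed; the result names `res_vapply` and `res_case` are made up for illustration. `switch()` accepts a single value, so the helper either has to be applied element by element or replaced with a vectorised expression.

```r
library(dplyr)

# Scalar helper from the question: switch() only accepts one value at a time.
convert_am <- function(x) {
  x <- as.character(x)
  switch(x, "0" = FALSE, "1" = TRUE, NA)
}

# Option 1: apply the scalar helper to each element of the column.
res_vapply <- mtcars %>%
  mutate(am = vapply(am, convert_am, logical(1)))

# Option 2: a vectorised equivalent written with case_when().
res_case <- mtcars %>%
  mutate(am = case_when(am == 0 ~ FALSE, am == 1 ~ TRUE, TRUE ~ NA))
```

Option 1 keeps the original function untouched; option 2 avoids the per-element function call, which tends to matter on large data frames.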

How to mutate for loop in dplyr

Question: I want to create multiple lag variables for a column in a data frame for a range of values. I have code below that successfully does what I want but is not scalable for what I need (hundreds of iterations):

    Lake_Lag <- Lake_Champlain_long.term_monitoring_1992_2016 %>%
      group_by(StationID,Test) %>%
      arrange(StationID,Test,VisitDate) %>%
      mutate(lag.Result1 = dplyr::lag(Result, n = 1, default
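One way to avoid writing each lag by hand is to loop over the lag sizes and build the column name programmatically. The sketch below is an illustration on invented toy data, not the asker's dataset: `toy`, `lags`, and `lagged` are hypothetical names, the grouping is simplified to `StationID` only, and the `default` argument is left at its default because the original call is truncated.

```r
library(dplyr)

# Toy stand-in for the monitoring data (values are invented).
toy <- data.frame(
  StationID = rep(c("S1", "S2"), each = 4),
  VisitDate = rep(as.Date("2016-01-01") + 0:3, times = 2),
  Result    = c(1, 2, 3, 4, 10, 20, 30, 40)
)

lags <- 1:3  # extend this vector to cover as many lags as needed

lagged <- toy %>%
  group_by(StationID) %>%
  arrange(VisitDate, .by_group = TRUE)

# Add one lag column per iteration; !! and := let the column name be computed.
for (k in lags) {
  lagged <- lagged %>%
    mutate(!!paste0("lag.Result", k) := dplyr::lag(Result, n = k))
}
```

Because the data frame stays grouped, each lag is computed within a station and never carries values across stations.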

Convert percentage columns with % into numeric in R

Question: I have a small dataset as follows:

       id  price  month_pct  year_pct
    0   1   1.85     -2.63%    -5.13%
    1   2   2.42      0.00%     0.83%
    2   3   1.81      0.00%    -0.55%
    3   4   4.37     -2.89%    -5.62%
    4   5   1.86      0.00%    -7.92%
    5   6   1.78     -1.11%   -15.24%

I would like to convert month_pct and year_pct (which are factor type) into numeric then multiply by 100. How could I do that in R? Thanks.

       id  price  month_pct  year_pct
    0   1   1.85      -2.63     -5.13
    1   2   2.42       0.00      0.83
    2   3   1.81       0.00     -0.55
    3   4   4.37      -2.89     -5.62
    4   5   1.86       0.00     -7.92
    5   6   1.78      -1.11    -15.24

Code
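A minimal sketch of one way to reach the second table, assuming dplyr 1.0 or later for `across()`. The data frame name `prices` and the result name `converted` are placeholders. The factor is converted to character first, because calling `as.numeric()` directly on a factor returns its internal level codes rather than the printed values.

```r
library(dplyr)

# The sample data from the question, with the percentage columns as factors.
prices <- data.frame(
  id        = 1:6,
  price     = c(1.85, 2.42, 1.81, 4.37, 1.86, 1.78),
  month_pct = factor(c("-2.63%", "0.00%", "0.00%", "-2.89%", "0.00%", "-1.11%")),
  year_pct  = factor(c("-5.13%", "0.83%", "-0.55%", "-5.62%", "-7.92%", "-15.24%"))
)

# Drop the trailing "%" and parse what is left as a number.
converted <- prices %>%
  mutate(across(
    c(month_pct, year_pct),
    ~ as.numeric(sub("%", "", as.character(.x), fixed = TRUE))
  ))
```

This reproduces the values shown in the desired output, which are the percentages with the sign removed and no further scaling.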

add column to my data frame listing columns with the highest row value

Question: I am trying to tell R to read through the rows of my data frame and add the name of the column with the highest value in each row to a new column called "MOST_COMMON_CANCER". I tried the following code but got an error.

    BASE_DF2 <- BASE_DF2%>%mutate(MOST_COMMON_CANCER=colnames(BASE_DF2[8:26])[max.col(BASE_DF2[8:26],ties.method="first")],.keep="all",.after=c_INCS_RATE)

    Error: Problem with `mutate()` input `MOST_COMMON_CANCER`.
    x Input `MOST_COMMON_CANCER` can't be recycled to size 1.
    i Input `MOST
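The error message is cut off, so its cause is not confirmed here. As an illustration only, the sketch below shows the same `max.col()` idea on a small invented data frame. It assumes, without evidence from the excerpt, that the original data frame may have been grouped or rowwise, which is why `ungroup()` is called first. The names `toy`, `rate_cols`, and `result` are hypothetical.

```r
library(dplyr)

# Invented data: one row per region, one column per cancer type.
toy <- data.frame(
  region = c("north", "south", "east"),
  lung   = c(5, 1, 3),
  skin   = c(2, 8, 3),
  colon  = c(4, 6, 1)
)

rate_cols <- c("lung", "skin", "colon")

# max.col() returns, for each row, the position of the largest value;
# ties.method = "first" picks the leftmost column when values tie.
result <- toy %>%
  ungroup() %>%
  mutate(MOST_COMMON_CANCER = rate_cols[max.col(toy[rate_cols], ties.method = "first")])
```

Selecting the columns by name rather than by position (`8:26` in the question) makes the call more robust if columns are later added or reordered.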

Double left join in dplyr to recover values

Question: I've checked this issue but couldn't find a matching entry. Say you have 2 DFs:

    df1: mode    df2: sex
         1            1
         2            2
         3

And a DF3 where most of the combinations are not present, e.g.

    mode | sex | cases
    1    | 1   | 9
    1    | 1   | 2
    2    | 2   | 7
    3    | 1   | 2
    1    | 2   | 5

and you want to summarise it with dplyr, obtaining all combinations (with nonexistent ones = 0):

    mode | sex | cases
    1    | 1   | 11
    1    | 2   | 5
    2    | 1   | 0
    2    | 2   | 7
    3    | 1   | 2
    3    | 2   | 0

If you do a single left_join (left_join(df1, df3)) you recover the modes not in df3, but 'Sex' appears as 'NA', and the same if
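One possible approach, sketched here under the assumption that the goal is exactly the second table, is to build every mode and sex combination first and then join the summarised counts onto it once. `totals`, `all_combos`, and `result` are names invented for the example.

```r
library(dplyr)

df1 <- data.frame(mode = c(1, 2, 3))
df2 <- data.frame(sex = c(1, 2))
df3 <- data.frame(
  mode  = c(1, 1, 2, 3, 1),
  sex   = c(1, 1, 2, 1, 2),
  cases = c(9, 2, 7, 2, 5)
)

# Sum the cases for the combinations that actually occur.
totals <- df3 %>%
  group_by(mode, sex) %>%
  summarise(cases = sum(cases), .groups = "drop")

# df1 and df2 share no columns, so merge() returns every mode/sex pair.
all_combos <- merge(df1, df2)

# Join the totals onto the full grid and turn missing combinations into 0.
result <- all_combos %>%
  left_join(totals, by = c("mode", "sex")) %>%
  mutate(cases = ifelse(is.na(cases), 0, cases)) %>%
  arrange(mode, sex)
```

Joining against the full grid means a single `left_join()` is enough, because both key columns are already present on the left-hand side.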

How do I get mean functions to work when I use piping?

Question: This is probably a simple question, but I'm having trouble getting the mean function to work using dplyr. Using the mtcars dataset as an example, if I type:

    data(mtcars)
    mtcars %>% select (mpg) %>% mean()

I get the "Warning message: In mean.default(.) : argument is not numeric or logical: returning NA" error message. For some reason though if I repeat the same code but just ask for a "summary", or "range" or several other statistical calculations, they work fine:

    data(mtcars)
    mtcars %>%
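The warning comes from `mean()` receiving a one-column data frame, since `select()` always returns a data frame rather than a vector. A minimal sketch of two ways around it, with `mpg_mean` and `mpg_summary` as illustrative names:

```r
library(dplyr)

# pull() extracts the column as a plain numeric vector, which mean() accepts.
mpg_mean <- mtcars %>%
  pull(mpg) %>%
  mean()

# summarise() computes the mean while keeping the result in a data frame.
mpg_summary <- mtcars %>%
  summarise(mean_mpg = mean(mpg))
```

`pull()` suits a quick one-off value; `summarise()` fits better when several statistics or groups are needed at once.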

R filter rows based on multiple partial strings applied to multiple columns

Question: Sample of dataset:

    diag01 <- as.factor(c("S7211","J47","J47","K729","M2445","Z509","Z488","R13","L893","N318","L0311","S510","A047","D649"))
    diag02 <- as.factor(c("K590","D761","J961","T501","M8580","R268","T831","G8240","B9688","G550","E162","T8902","E86","I849"))
    diag03 <- as.factor(c("F058","M0820","E877","E86","G712","R32","A408","E888","G8220","C794","T68","L0310","M1094","D469"))
    diag04 <- as.factor(c("E86","C845","R790","I420","G4732","R600","L893","R509","T913","C795","M8412","G8212",
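The excerpt ends before the search strings are stated, so the sketch below is only an illustration of the general technique on a few of the sample codes. The prefixes in `prefixes` are hypothetical, as are the names `toy`, `pattern`, and `matched`. It assumes dplyr 1.0.4 or later for `if_any()`.

```r
library(dplyr)

# A few rows in the shape of the sample data (subset for brevity).
toy <- data.frame(
  diag01 = factor(c("S7211", "J47", "K729", "Z509")),
  diag02 = factor(c("K590", "D761", "T501", "R268")),
  diag03 = factor(c("F058", "M0820", "E86", "R32"))
)

# Hypothetical partial strings: keep rows where any code starts with one of these.
prefixes <- c("J47", "E8")
pattern  <- paste0("^(", paste(prefixes, collapse = "|"), ")")

# if_any() keeps a row when the condition holds in at least one diag column.
matched <- toy %>%
  filter(if_any(starts_with("diag"), ~ grepl(pattern, .x)))
```

Swapping `if_any()` for `if_all()` would instead require every diagnosis column to match.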