dplyr

Merge 2 dataframes using conditions on “hour” and “min” of df1 in datetimes of df2

折月煮酒 submitted on 2021-02-16 20:07:05
Question: I have a dataframe df.sample like this:

    id <- c("A","A","A","A","A","A","A","A","A","A","A")
    date <- c("2018-11-12","2018-11-12","2018-11-12","2018-11-12","2018-11-12",
              "2018-11-12","2018-11-12","2018-11-14","2018-11-14","2018-11-14",
              "2018-11-12")
    hour <- c(8,8,9,9,13,13,16,6,7,19,7)
    min <- c(47,59,6,18,22,36,12,32,12,21,47)
    value <- c(70,70,86,86,86,74,81,77,79,83,91)
    df.sample <- data.frame(id,date,hour,min,value,stringsAsFactors = FALSE)
    df.sample$date <- as.Date(df.sample$date,format="%Y-%m-
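
The snippet is cut off before the second data frame appears, so the merge itself cannot be reproduced here. As a possible first step toward matching on time, the following minimal sketch combines `date`, `hour` and `min` into a single POSIXct column that could then be compared with datetimes in another frame. The UTC time zone and the column name `datetime` are assumptions, not part of the question.

```r
# First three rows of df.sample from the question
df.sample <- data.frame(
  id = c("A", "A", "A"),
  date = as.Date(c("2018-11-12", "2018-11-12", "2018-11-12")),
  hour = c(8, 8, 9),
  min = c(47, 59, 6),
  value = c(70, 70, 86),
  stringsAsFactors = FALSE
)

# Build "YYYY-MM-DD HH:MM" strings, then parse them.
# The time zone (UTC) and the column name `datetime` are assumptions.
stamp <- sprintf("%s %02d:%02d", format(df.sample$date),
                 as.integer(df.sample$hour), as.integer(df.sample$min))
df.sample$datetime <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M", tz = "UTC")
```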

Using switch statement within dplyr's mutate

五迷三道 submitted on 2021-02-16 20:04:13
Question: I would like to use a switch statement within dplyr's mutate. I have a simple function that performs some operations and assigns alternative values via switch, for example:

    convert_am <- function(x) {
      x <- as.character(x)
      switch(x, "0" = FALSE, "1" = TRUE, NA)
    }

This works as desired when applied to scalars:

    > convert_am(1)
    [1] TRUE
    > convert_am(2)
    [1] NA
    > convert_am(0)
    [1] FALSE

I would like to arrive at equivalent results via a mutate call:

    mtcars %>% mutate(am = convert_am(am))

This
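
`switch()` expects a single value, while `mutate()` hands the function the whole column at once. A minimal sketch of two possible workarounds follows: wrapping the scalar function with `vapply()`, or replacing `switch()` with dplyr's vectorised `case_when()`. The helper name `convert_am_vec` is hypothetical.

```r
library(dplyr)

convert_am <- function(x) {
  x <- as.character(x)
  switch(x, "0" = FALSE, "1" = TRUE, NA)
}

# Option 1: apply the scalar function element by element.
# convert_am_vec is a hypothetical helper name.
convert_am_vec <- function(x) {
  vapply(x, convert_am, logical(1), USE.NAMES = FALSE)
}
res1 <- mtcars %>% mutate(am = convert_am_vec(am))

# Option 2: a vectorised replacement for switch()
res2 <- mtcars %>%
  mutate(am = case_when(am == 0 ~ FALSE, am == 1 ~ TRUE, TRUE ~ NA))
```

The `vapply()` wrapper keeps the original function untouched; `case_when()` is usually the more idiomatic choice inside a dplyr pipeline.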

How to mutate for loop in dplyr

隐身守侯 submitted on 2021-02-16 19:21:17
Question: I want to create multiple lag variables for a column in a data frame for a range of values. I have code below that successfully does what I want but is not scalable for what I need (hundreds of iterations):

    Lake_Lag <- Lake_Champlain_long.term_monitoring_1992_2016 %>%
      group_by(StationID,Test) %>%
      arrange(StationID,Test,VisitDate) %>%
      mutate(lag.Result1 = dplyr::lag(Result, n = 1, default
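
One way to avoid writing each lag by hand is to generate the `lag()` calls programmatically and splice them into a single `mutate()`. The sketch below is an illustration under assumptions: the toy data frame stands in for the Lake Champlain table, which is not available here, and the truncated `default` argument is left out, so missing lags are `NA`.

```r
library(dplyr)

# Hypothetical toy data standing in for the Lake Champlain table
toy <- data.frame(
  StationID = c(1, 1, 1, 2, 2),
  Test = "TP",
  VisitDate = as.Date("2020-01-01") + c(0, 1, 2, 0, 1),
  Result = c(10, 20, 30, 40, 50)
)

lags <- 1:2  # extend the range as needed

# One unevaluated lag() call per lag, named like the original columns
lag_exprs <- lapply(lags, function(k) rlang::expr(dplyr::lag(Result, n = !!k)))
names(lag_exprs) <- paste0("lag.Result", lags)

Lake_Lag <- toy %>%
  group_by(StationID, Test) %>%
  arrange(VisitDate, .by_group = TRUE) %>%
  mutate(!!!lag_exprs) %>%   # splice all lag columns in one call
  ungroup()
```

Changing `lags` to a longer range adds the corresponding columns without further edits to the pipeline.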

Convert percentage columns with % into numeric in R

為{幸葍}努か submitted on 2021-02-16 15:29:12
Question: I have a small dataset as follows:

       id  price  month_pct  year_pct
    0   1   1.85     -2.63%    -5.13%
    1   2   2.42      0.00%     0.83%
    2   3   1.81      0.00%    -0.55%
    3   4   4.37     -2.89%    -5.62%
    4   5   1.86      0.00%    -7.92%
    5   6   1.78     -1.11%   -15.24%

I would like to convert month_pct and year_pct (which are factor type) into numeric then multiply by 100. How could I do that in R? Thanks.

       id  price  month_pct  year_pct
    0   1   1.85      -2.63     -5.13
    1   2   2.42       0.00      0.83
    2   3   1.81       0.00     -0.55
    3   4   4.37      -2.89     -5.62
    4   5   1.86       0.00     -7.92
    5   6   1.78      -1.11    -15.24

Code
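
A minimal sketch of one possible approach: convert each factor to character, strip the `%` sign, and parse the remainder as a number. The helper name `pct_to_num` is hypothetical. The result matches the desired table shown in the question, where the values stay on the percent scale.

```r
library(dplyr)

df <- data.frame(
  id = 1:6,
  price = c(1.85, 2.42, 1.81, 4.37, 1.86, 1.78),
  month_pct = factor(c("-2.63%", "0.00%", "0.00%", "-2.89%", "0.00%", "-1.11%")),
  year_pct = factor(c("-5.13%", "0.83%", "-0.55%", "-5.62%", "-7.92%", "-15.24%"))
)

# as.character() first: calling as.numeric() directly on a factor
# would return the internal level codes, not the printed values.
pct_to_num <- function(x) as.numeric(sub("%", "", as.character(x), fixed = TRUE))

df <- df %>% mutate(across(c(month_pct, year_pct), pct_to_num))
```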

add column to my data frame listing columns with the highest row value

前提是你 submitted on 2021-02-16 14:54:12
Question: I am trying to tell R to read through the rows of my dataframe and add the name of the column with the highest value in each row to a new column in the dataframe called "MOST_COMMON_CANCER". I tried the following code but got an error.

    BASE_DF2 <- BASE_DF2 %>%
      mutate(MOST_COMMON_CANCER = colnames(BASE_DF2[8:26])[max.col(BASE_DF2[8:26], ties.method = "first")],
             .keep = "all", .after = c_INCS_RATE)

    Error: Problem with `mutate()` input `MOST_COMMON_CANCER`.
    x Input `MOST_COMMON_CANCER` can't be recycled to size 1.
    i Input `MOST
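
The "recycled to size 1" message suggests that `BASE_DF2` may be grouped or row-wise, so `mutate()` expects one value per group while the expression returns one value per row of the whole frame. That is an assumption, since the data is not shown. A minimal sketch that sidesteps the issue computes the column outside `mutate()`; the toy columns below are hypothetical stand-ins for columns 8 to 26.

```r
# Hypothetical toy data; the real BASE_DF2 is not shown in the question
toy <- data.frame(
  region = c("a", "b", "c", "d"),
  lung   = c(5, 1, 2, 4),
  breast = c(3, 7, 2, 4),
  colon  = c(1, 2, 9, 1)
)

rate_cols <- c("lung", "breast", "colon")  # stands in for columns 8:26

# max.col() returns, for each row, the index of the largest value;
# ties.method = "first" picks the leftmost column on ties (row "d").
toy$MOST_COMMON_CANCER <- rate_cols[max.col(toy[rate_cols], ties.method = "first")]
```

If the frame is in fact grouped, calling `ungroup()` before the original `mutate()` is another option worth trying.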

Double left join in dplyr to recover values

微笑、不失礼 submitted on 2021-02-16 14:29:40
Question: I've checked this issue but couldn't find a matching entry. Say you have 2 DFs:

    df1: mode    df2: sex
    1            1
    2            2
    3

And a DF3 where most of the combinations are not present, e.g.

    mode | sex | cases
    1    | 1   | 9
    1    | 1   | 2
    2    | 2   | 7
    3    | 1   | 2
    1    | 2   | 5

and you want to summarise it with dplyr, obtaining all combinations (with non-existent ones = 0):

    mode | sex | cases
    1    | 1   | 11
    1    | 2   | 5
    2    | 1   | 0
    2    | 2   | 7
    3    | 1   | 2
    3    | 2   | 0

If you do a single left_join (left_join(df1,df3)), you recover the modes not in df3, but 'Sex' appears as 'NA', and the same if
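
Instead of two successive left joins, one possible approach is to build the full grid of `mode` and `sex` combinations first, then join the summarised counts onto it and fill the gaps with 0. A minimal sketch using the data from the question (the intermediate names `grid` and `totals` are hypothetical):

```r
library(dplyr)

df1 <- data.frame(mode = c(1, 2, 3))
df2 <- data.frame(sex = c(1, 2))
df3 <- data.frame(
  mode  = c(1, 1, 2, 3, 1),
  sex   = c(1, 1, 2, 1, 2),
  cases = c(9, 2, 7, 2, 5)
)

# All mode x sex combinations (by = NULL gives the Cartesian product)
grid <- merge(df1, df2, by = NULL)

# Total cases per combination that actually occurs in df3
totals <- df3 %>%
  group_by(mode, sex) %>%
  summarise(cases = sum(cases), .groups = "drop")

out <- grid %>%
  left_join(totals, by = c("mode", "sex")) %>%
  mutate(cases = coalesce(cases, 0)) %>%   # missing combinations become 0
  arrange(mode, sex)
```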

How do I get mean functions to work when I use piping?

两盒软妹~` submitted on 2021-02-16 14:19:27
Question: This is probably a simple question, but I'm having trouble getting the mean function to work using dplyr. Using the mtcars dataset as an example, if I type:

    data(mtcars)
    mtcars %>% select(mpg) %>% mean()

I get the "Warning message: In mean.default(.) : argument is not numeric or logical: returning NA" warning. For some reason, though, if I repeat the same code but just ask for a "summary", or "range", or several other statistical calculations, they work fine:

    data(mtcars)
    mtcars %>%
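
`select(mpg)` returns a one-column data frame rather than a numeric vector, and `mean()` has no method for data frames, hence the warning. A minimal sketch of two common alternatives follows; the variable names `m1` and `m2` are only for illustration.

```r
library(dplyr)

# pull() extracts the column as a plain numeric vector
m1 <- mtcars %>% pull(mpg) %>% mean()

# summarise() computes the mean inside the data frame context
m2 <- mtcars %>% summarise(mean_mpg = mean(mpg))
```

`m1` is a single number, while `m2` is a one-row data frame with a `mean_mpg` column.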

R filter rows based on multiple partial strings applied to multiple columns

女生的网名这么多〃 submitted on 2021-02-16 14:11:15
Question: Sample of dataset:

    diag01 <- as.factor(c("S7211","J47","J47","K729","M2445","Z509","Z488","R13","L893","N318","L0311","S510","A047","D649"))
    diag02 <- as.factor(c("K590","D761","J961","T501","M8580","R268","T831","G8240","B9688","G550","E162","T8902","E86","I849"))
    diag03 <- as.factor(c("F058","M0820","E877","E86","G712","R32","A408","E888","G8220","C794","T68","L0310","M1094","D469"))
    diag04 <- as.factor(c("E86","C845","R790","I420","G4732","R600","L893","R509","T913","C795","M8412","G8212",
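
The question is cut off before the target strings are given, so the sketch below is only an illustration of the general technique named in the title: keeping rows where any of several columns starts with any of several code prefixes. It assumes a dplyr version that provides `if_any()`; the prefixes `J47` and `T50` and the names `df`, `pat` and `hits` are hypothetical, and only the first five values of `diag01` and `diag02` are used.

```r
library(dplyr)

df <- data.frame(
  diag01 = as.factor(c("S7211", "J47", "J47", "K729", "M2445")),
  diag02 = as.factor(c("K590", "D761", "J961", "T501", "M8580"))
)

# Hypothetical prefixes, combined into one regular expression
pat <- "^(J47|T50)"

# Keep rows where at least one diag column matches the pattern
hits <- df %>%
  filter(if_any(c(diag01, diag02), ~ grepl(pat, as.character(.x))))
```

With more columns, a selection helper such as `starts_with("diag")` could replace the explicit column list.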