mutate

R dplyr: Conditional Mutate based on Groups

落爺英雄遲暮 submitted on 2021-01-29 06:21:05
Question: Currently, I am working on the following problem: I am trying to split my dataset into groups and create a new variable that captures, for a specific time frame, the group mean of all cases that do not belong to the group. Here is a replica of my code using the mpg dataset. cars <- mpg cars$other_cty_yearly_mean <- 0 for(i in cars$cyl){ cars <- cars %>% group_by(year) %>% mutate(other_cty_yearly_mean = if_else( cyl == i, mean(cty[cyl != i]), other_cty_yearly_mean )) %>% ungroup() %>%
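The preview above is cut off, so the following is only a sketch of one loop-free way to get the same kind of column, not the asker's final code or an accepted answer. It assumes that "time frame" means the `year` column and that the mean should cover every row of the same year whose `cyl` differs. The helper columns `year_sum` and `year_n` are hypothetical names introduced here.

```r
library(dplyr)
library(ggplot2)  # only needed for the mpg dataset

cars <- mpg %>%
  # Per-year totals across all cylinder classes.
  group_by(year) %>%
  mutate(year_sum = sum(cty), year_n = n()) %>%
  # Subtract the current class's own total and count, which leaves the
  # mean of every *other* class in the same year.
  group_by(year, cyl) %>%
  mutate(other_cty_yearly_mean = (year_sum - sum(cty)) / (year_n - n())) %>%
  ungroup() %>%
  select(-year_sum, -year_n)
```

If a year contains only one cylinder class, the division is 0 / 0 and yields `NaN`, the same value `mean()` returns for an empty numeric vector.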

Creating new variables with purrr (how does one go about that?)

蹲街弑〆低调 submitted on 2021-01-28 18:19:42
Question: I have a large data set with a bunch of columns that I want to run the same function on, based on either prefix or suffix, to create a new variable. What I would like to be able to do is provide a list to map and create new variables. dataframe <- data_frame(x_1 = c(1,2,3,4,5,6), x_2 = c(1,1,1,2,2,2), y_1 = c(200,400,120,300,100,100), y_2 = c(250,500,150,240,140,400)) newframe <- dataframe %>% mutate(x_ratio = x_1/x_2, y_ratio = y_1/y_2) In the past, I have written code in a string
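One possible way to drive this from a list of prefixes is sketched below. It is an illustration rather than the answer from the original thread. It assumes every prefix has matching `_1` and `_2` columns; the `prefixes` vector and the `<prefix>_ratio` naming are taken from the example in the question, and `tibble()` is used in place of the deprecated `data_frame()`.

```r
library(dplyr)
library(purrr)

dataframe <- tibble(
  x_1 = c(1, 2, 3, 4, 5, 6),
  x_2 = c(1, 1, 1, 2, 2, 2),
  y_1 = c(200, 400, 120, 300, 100, 100),
  y_2 = c(250, 500, 150, 240, 140, 400)
)

# Prefixes to process; each one needs <prefix>_1 and <prefix>_2 columns.
prefixes <- c("x", "y")

# Build one ratio vector per prefix, named "<prefix>_ratio".
ratios <- prefixes %>%
  set_names(paste0(prefixes, "_ratio")) %>%
  map(~ dataframe[[paste0(.x, "_1")]] / dataframe[[paste0(.x, "_2")]])

# Attach the new columns to the original data.
newframe <- bind_cols(dataframe, as_tibble(ratios))
```

Adding another pair of columns then only requires extending `prefixes`.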

Using dplyr to filter rows which contain partial string of column

余生长醉 submitted on 2021-01-27 06:06:19
Question: Assuming I have a data frame like

term        cnt
apple       10
apples      5
a apple on  3
blue pears  3
pears       1

How could I filter out all partially matched strings within this column, e.g. getting as a result

term   cnt
apple  10
pears  1

without specifying which terms to filter on (apple|pears), but in a self-referencing manner (i.e. each term is checked against the whole column, and terms that are a partial match are removed)? The number of tokens is not limited, nor is the consistency of the strings (i.e. "mapples
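A minimal sketch of the self-referencing filter follows, under one reading of the question: a row is dropped when its `term` contains any other term in the column as a literal substring. The question text is truncated, so this reading is an assumption, and `contains_other` is a name introduced here.

```r
library(dplyr)
library(purrr)
library(stringr)

df <- tibble(
  term = c("apple", "apples", "a apple on", "blue pears", "pears"),
  cnt  = c(10, 5, 3, 3, 1)
)

# For each row, check whether its term contains any of the other terms.
# fixed() treats the other terms as literal text rather than regex patterns.
contains_other <- map_lgl(seq_along(df$term), function(i) {
  any(str_detect(df$term[i], fixed(df$term[-i])))
})

result <- df %>% filter(!contains_other)
```

With the sample data this keeps only `apple` and `pears`. Exact duplicates would remove each other under this rule, so the terms may need de-duplicating first.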

Using dplyr to filter rows which contain partial string of column

有些话、适合烂在心里 submitted on 2021-01-27 06:05:02
Question: Assuming I have a data frame like

term        cnt
apple       10
apples      5
a apple on  3
blue pears  3
pears       1

How could I filter out all partially matched strings within this column, e.g. getting as a result

term   cnt
apple  10
pears  1

without specifying which terms to filter on (apple|pears), but in a self-referencing manner (i.e. each term is checked against the whole column, and terms that are a partial match are removed)? The number of tokens is not limited, nor is the consistency of the strings (i.e. "mapples

Subset data based on presence/absence on unique samples and sample groups in R

若如初见. submitted on 2021-01-26 02:08:50
Question: I'd like to know which observations are present (>=1) in all samples (columns) and which are unique to each subset of samples ("Continent" or "Country"). For example: df_all would contain Obs5 (>=1 in all samples), df_Europe would contain Obs1 (>=1 in Europe and =0 in Africa), df_Italy would contain Obs2 (>=1 in Italy and 0 in the rest), etc. For the first, one can use: row_sub = apply(df, 1, function(row) all(row !=0 )) dff <- df[row_sub,] BUT is there a way of coding it as a loop,
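The question's data is not shown in the preview, so the sketch below uses an invented count matrix and an invented sample-to-group table (`counts`, `meta`, and the sample names `S1` to `S4` are all hypothetical). It assumes "unique to a group" means at least 1 in every sample of the group and 0 in every other sample, and it replaces the hand-written loop with `lapply()` over the group levels.

```r
# Hypothetical data: rows are observations, columns are samples.
counts <- matrix(
  c(2, 1, 0, 0,
    3, 0, 0, 0,
    0, 0, 4, 1,
    0, 0, 0, 2,
    1, 2, 3, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("Obs", 1:5), c("S1", "S2", "S3", "S4"))
)

# Hypothetical metadata mapping each sample to a country and continent.
meta <- data.frame(
  sample    = c("S1", "S2", "S3", "S4"),
  Country   = c("Italy", "Spain", "Kenya", "Ghana"),
  Continent = c("Europe", "Europe", "Africa", "Africa")
)

# Rows present (>= 1) in every sample of the group and 0 everywhere else.
unique_to <- function(counts, in_group) {
  present_in <- rowSums(counts[, in_group, drop = FALSE] >= 1) == sum(in_group)
  absent_out <- rowSums(counts[, !in_group, drop = FALSE] != 0) == 0
  counts[present_in & absent_out, , drop = FALSE]
}

# Observations present in all samples.
df_all <- counts[rowSums(counts >= 1) == ncol(counts), , drop = FALSE]

# One result per group level, without writing each subset by hand.
by_continent <- lapply(split(meta$sample, meta$Continent),
                       function(s) unique_to(counts, colnames(counts) %in% s))
by_country <- lapply(split(meta$sample, meta$Country),
                     function(s) unique_to(counts, colnames(counts) %in% s))
```

With this toy data, `by_continent$Europe` holds `Obs1`, `by_country$Italy` holds `Obs2`, and `df_all` holds `Obs5`, mirroring the examples in the question.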

Subset data based on presence/absence on unique samples and sample groups in R

白昼怎懂夜的黑 submitted on 2021-01-26 02:08:16
Question: I'd like to know which observations are present (>=1) in all samples (columns) and which are unique to each subset of samples ("Continent" or "Country"). For example: df_all would contain Obs5 (>=1 in all samples), df_Europe would contain Obs1 (>=1 in Europe and =0 in Africa), df_Italy would contain Obs2 (>=1 in Italy and 0 in the rest), etc. For the first, one can use: row_sub = apply(df, 1, function(row) all(row !=0 )) dff <- df[row_sub,] BUT is there a way of coding it as a loop,

Subset data based on presence/absence on unique samples and sample groups in R

你说的曾经没有我的故事 submitted on 2021-01-26 02:07:30
Question: I'd like to know which observations are present (>=1) in all samples (columns) and which are unique to each subset of samples ("Continent" or "Country"). For example: df_all would contain Obs5 (>=1 in all samples), df_Europe would contain Obs1 (>=1 in Europe and =0 in Africa), df_Italy would contain Obs2 (>=1 in Italy and 0 in the rest), etc. For the first, one can use: row_sub = apply(df, 1, function(row) all(row !=0 )) dff <- df[row_sub,] BUT is there a way of coding it as a loop,

recode using dplyr::mutate across not working in a function

不羁的心 submitted on 2021-01-05 07:16:10
Question: I'm trying to use dplyr::mutate(across()) to recode specified columns in a tbl. Using these on their own works fine, but I can't get them to work in a function: library(dplyr) library(tidyr) df1 <- tibble(Q7_1=1:5, Q7_1_TEXT=c("let's","see","grogu","this","week"), Q8_1=6:10, Q8_1_TEXT=rep("grogu",5), Q8_2=11:15, Q8_2_TEXT=c("grogu","is","the","absolute","best")) # this works df2 <- df1 %>% mutate(across(starts_with("Q8") & ends_with("TEXT"), ~recode(., "grogu"="mando"))) # runs without error
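The asker's function is not visible in the truncated preview, so the following is only a sketch of one way to wrap the working call in a function. The function name `recode_text_cols` and its arguments are hypothetical. It also swaps `recode()` for `if_else()`, because `recode()` takes the old value as an argument name, which is awkward to pass in as a variable.

```r
library(dplyr)

df1 <- tibble(
  Q7_1 = 1:5, Q7_1_TEXT = c("let's", "see", "grogu", "this", "week"),
  Q8_1 = 6:10, Q8_1_TEXT = rep("grogu", 5),
  Q8_2 = 11:15, Q8_2_TEXT = c("grogu", "is", "the", "absolute", "best")
)

# Replace `old` with `new` in every column whose name starts with `prefix`
# and ends with `suffix`. The selection helpers accept ordinary character
# variables, so no extra quoting or unquoting is needed here.
recode_text_cols <- function(data, prefix, suffix, old, new) {
  data %>%
    mutate(across(
      starts_with(prefix) & ends_with(suffix),
      ~ if_else(.x == old, new, .x)
    ))
}

df2 <- recode_text_cols(df1, "Q8", "TEXT", old = "grogu", new = "mando")
```

Columns outside the selection, such as `Q7_1_TEXT`, are returned unchanged.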

Change Date Format from %y-%m-%d %h:%m:%s to %Y%M%D with lubridate and mutate

天大地大妈咪最大 submitted on 2021-01-03 04:09:41
Question: I've got a tbl_df with two columns, StartTime and StopTime. Both are dttm. I want to change their format from "%y-%m-%d %h:%m:%s" to "%y%m%d". I've tried data <- mutate(data, StartTime = ymd(StartTime), StopTime = ymd(StopTime)) But it returns Warning messages: 1: All formats failed to parse. No formats found. 2: All formats failed to parse. No formats found. How can I do it? Please don't point me to other questions that don't use the lubridate package. Thanks. Answer 1: I think this should work library
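The quoted answer is cut off after `library`, so what follows is a separate sketch rather than a reconstruction of it. The warning most likely arises because `ymd()` expects date-only input, while these columns carry a time component. One option that stays within lubridate is to drop the time with `as_date()` and then format the result; the sample timestamps are invented for illustration, and the target is assumed to be the two-digit-year pattern `"%y%m%d"` quoted in the question body.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data with two date-time (dttm) columns.
data <- tibble(
  StartTime = ymd_hms(c("2021-01-03 04:09:41", "2021-02-14 23:59:59")),
  StopTime  = ymd_hms(c("2021-01-03 06:00:00", "2021-02-15 01:30:00"))
)

# as_date() drops the time of day; format() then renders the date as text.
data <- data %>%
  mutate(across(c(StartTime, StopTime), ~ format(as_date(.x), "%y%m%d")))
```

The resulting columns are character strings, not dates. If they still need to behave as dates, stopping at `as_date(.x)` keeps the `Date` class.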

Change Date Format from %y-%m-%d %h:%m:%s to %Y%M%D with lubridate and mutate

冷暖自知 submitted on 2021-01-03 04:06:29
Question: I've got a tbl_df with two columns, StartTime and StopTime. Both are dttm. I want to change their format from "%y-%m-%d %h:%m:%s" to "%y%m%d". I've tried data <- mutate(data, StartTime = ymd(StartTime), StopTime = ymd(StopTime)) But it returns Warning messages: 1: All formats failed to parse. No formats found. 2: All formats failed to parse. No formats found. How can I do it? Please don't point me to other questions that don't use the lubridate package. Thanks. Answer 1: I think this should work library