mutate

Calculate all the absolute differences between 6 columns of a table using mutate? [duplicate]

眉间皱痕 submitted on 2021-02-07 04:13:28
Question: This question already has answers here: Pairwise subtraction in a dataframe R (2 answers). Closed 7 months ago.

I have a table with 6 columns, Z1 to Z6, and I want to calculate the absolute value of the difference between each pair of these columns. So far, I enumerate all the differences in a mutate command:

    FactArray <- FactArray %>% mutate(diff12 = abs(Z1-Z2), diff13 = abs(Z1-Z3), diff14 = abs(Z1-Z4), diff15 = abs(Z1-Z5), diff16 = abs(Z1-Z6), diff23 = abs(Z2-Z3), diff24 = abs(Z2-Z4), diff25 =
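One way to avoid typing out all 15 differences is to generate the column pairs with `combn()` and compute each difference in a loop. The sketch below is only an illustration: the contents of `FactArray` are not shown in the question, so the two-row data frame here is made up, and the `diffXY` naming simply follows the pattern used in the question.

```r
library(dplyr)

# Hypothetical example data; the real FactArray is not shown in the question.
FactArray <- data.frame(Z1 = c(1, 5), Z2 = c(4, 2), Z3 = c(0, 9),
                        Z4 = c(3, 3), Z5 = c(7, 1), Z6 = c(2, 8))

# All 15 unordered pairs of column names, one pair per matrix column.
pairs <- combn(paste0("Z", 1:6), 2)

# Absolute difference for each pair of columns.
diffs <- lapply(seq_len(ncol(pairs)), function(i) {
  abs(FactArray[[pairs[1, i]]] - FactArray[[pairs[2, i]]])
})

# Name the results diff12, diff13, ..., diff56, as in the question.
names(diffs) <- paste0("diff", sub("Z", "", pairs[1, ]), sub("Z", "", pairs[2, ]))

# Append the new columns to the original table.
FactArray <- bind_cols(FactArray, as.data.frame(diffs))
```

Because the pairs are derived from the column names, the same code should extend to more columns by changing `1:6`.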

Add column to data frame based on long list and values in another column is too slow

半城伤御伤魂 submitted on 2021-02-05 11:39:25
Question: I am adding a new column to a dataframe using apply() and mutate. It works, but it is very slow. I have 24M rows, and I am adding the column based on values in a long list (58 items). It was bearable with a smaller list, but not anymore. Here is my example:

    large_df <- data.frame(A = (1:4), B = c('a','b','c','d'), C = c('e','f','g','h'))
    long_list = c('e','f','g')
    large_df = mutate(large_df, new_C = apply(large_df[,2:3], 1, function(r) any(r %in% long_list)))

The new column (new_C) will read True or
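A row-wise `apply()` calls the anonymous function once per row, which is likely the bottleneck at 24M rows. A possible alternative, sketched here on the example data from the question, is to test each column against the list with the vectorized `%in%` and combine the results with `|`. This assumes only columns B and C need to be checked, as in the example.

```r
library(dplyr)

large_df <- data.frame(A = 1:4,
                       B = c("a", "b", "c", "d"),
                       C = c("e", "f", "g", "h"))
long_list <- c("e", "f", "g")

# %in% is vectorized over the whole column, so no per-row function call is needed.
large_df <- large_df %>%
  mutate(new_C = B %in% long_list | C %in% long_list)
```

On this example the result matches the `apply()` version: the first three rows are TRUE and the last is FALSE.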

R mutate multiple columns with if statement

二次信任 submitted on 2021-02-05 09:11:57
Question: I have data like this:

    cols <- c("X01_01","X01_01_p", "X01_02","X01_02_p", "X01_03","X01_03_p", "X01_04", "X01_05","X01_06")
    set.seed(111)
    values <- replicate(9, sample(1:5, 4, replace = TRUE))
    df <- as.data.frame(values)

So my df looks like this:

      X01_01 X01_01_p X01_02 X01_02_p X01_03 X01_03_p X01_04 X01_05 X01_06
    1      3        2      3        1      1        3      5      4      3
    2      4        3      1        1      5        2      2      3      3
    3      2        1      3        1      2        2      4      1      2
    4      3        3      3        3      4        2      2      3      4

I have some columns to use for mutation (not all) and the names of the new columns.

    cols_to_mutate <

using mutate_at with the in operator %in%

安稳与你 submitted on 2021-02-05 09:01:35
Question: I have a data frame with a few variables to reverse code. I have a separate vector that lists all the variables to reverse code. I'd like to use mutate_at(), or some other tidy way, to reverse code them all in one line of code. Here's the dataset and the vector of items to reverse:

    library(tidyverse)
    mock_data <- tibble(id = 1:5, item_1 = c(1, 5, 3, 5, 5), item_2 = c(4, 4, 4, 1, 1), item_3 = c(5, 5, 5, 5, 1))
    reverse <- c("item_2", "item_3")

Here's what I want it to look like with only items 2
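One tidy way to do this, sketched below, is `mutate()` with `across()` and `all_of()`, which selects exactly the columns named in the character vector. The reversal formula `6 - x` is an assumption: it presumes the items are on a 1 to 5 scale, which matches the values shown but is not stated in the question.

```r
library(dplyr)

mock_data <- tibble(id = 1:5,
                    item_1 = c(1, 5, 3, 5, 5),
                    item_2 = c(4, 4, 4, 1, 1),
                    item_3 = c(5, 5, 5, 5, 1))
reverse <- c("item_2", "item_3")

# Assumption: a 1-5 scale, so reverse coding maps x to 6 - x.
reversed <- mock_data %>%
  mutate(across(all_of(reverse), ~ 6 - .x))
```

With the older scoped verbs, `mutate_at(vars(all_of(reverse)), ~ 6 - .)` should give the same result; `id` and `item_1` are left untouched in both cases.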

using mutate_at with the in operator %in%

梦想与她 submitted on 2021-02-05 09:01:34
Question: I have a data frame with a few variables to reverse code. I have a separate vector that lists all the variables to reverse code. I'd like to use mutate_at(), or some other tidy way, to reverse code them all in one line of code. Here's the dataset and the vector of items to reverse:

    library(tidyverse)
    mock_data <- tibble(id = 1:5, item_1 = c(1, 5, 3, 5, 5), item_2 = c(4, 4, 4, 1, 1), item_3 = c(5, 5, 5, 5, 1))
    reverse <- c("item_2", "item_3")

Here's what I want it to look like with only items 2

How to get value of last non-NA column [duplicate]

≡放荡痞女 submitted on 2021-02-05 08:38:13
Question: This question already has answers here: Extract last non-missing value in row with data.table (5 answers). Closed 7 months ago.

A bit difficult to explain, but I have a dataframe with values that look like a staircase: for every date, there are different columns that have NA for some dates. I want to create a new column that has the last non-NA column value in it. Hopefully it makes more sense with this example. Sample dataframe:

    test <- data.frame("date" = c(as.Date("2020-01-01"), as.Date(
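A possible approach uses `dplyr::coalesce()`, which returns the first non-NA value across its arguments. Passing the columns in reverse order therefore yields the last non-NA value in each row. The sample data in the question is cut off, so the data frame and the column names `v1` to `v3` below are hypothetical stand-ins.

```r
library(dplyr)

# Hypothetical staircase-shaped data; the question's own sample is truncated.
test <- data.frame(
  date = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03")),
  v1 = c(4, NA, NA),
  v2 = c(3, 5, NA),
  v3 = c(NA, 2, 9)
)

# coalesce() takes the first non-NA value, so list the columns right to left.
test <- test %>%
  mutate(last_value = coalesce(v3, v2, v1))
```

Rows where every value column is NA would get NA in `last_value`.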

How to get value of last non-NA column [duplicate]

徘徊边缘 submitted on 2021-02-05 08:37:29
Question: This question already has answers here: Extract last non-missing value in row with data.table (5 answers). Closed 7 months ago.

A bit difficult to explain, but I have a dataframe with values that look like a staircase: for every date, there are different columns that have NA for some dates. I want to create a new column that has the last non-NA column value in it. Hopefully it makes more sense with this example. Sample dataframe:

    test <- data.frame("date" = c(as.Date("2020-01-01"), as.Date(

Conditionally replace values of multiple columns, from values of other multiple columns

有些话、适合烂在心里 submitted on 2021-02-04 06:57:47
Question: Suppose I have this dataset:

    set.seed(1234); data.frame(cbind(a=rep(c("si","no"),30),b=rnorm(60)), c=rep(c("d","e","f"),20)) %>% head()

Then I want to add many columns (in this example I only added two) to identify distinct cases within each group (in this case, column "a").

    set.seed(1234); data.frame(cbind(a=rep(c("si","no"),30),b=rnorm(60)),c=rep(c("d","e","f"),20)) %>% group_by(a) %>% dplyr::mutate_at(vars(c(b,c)), .funs= list(dups_hash_ing= ~n_distinct(.)))

This code leaves the

Conditionally replace values of multiple columns, from values of other multiple columns

心已入冬 submitted on 2021-02-04 06:52:59
Question: Suppose I have this dataset:

    set.seed(1234); data.frame(cbind(a=rep(c("si","no"),30),b=rnorm(60)), c=rep(c("d","e","f"),20)) %>% head()

Then I want to add many columns (in this example I only added two) to identify distinct cases within each group (in this case, column "a").

    set.seed(1234); data.frame(cbind(a=rep(c("si","no"),30),b=rnorm(60)),c=rep(c("d","e","f"),20)) %>% group_by(a) %>% dplyr::mutate_at(vars(c(b,c)), .funs= list(dups_hash_ing= ~n_distinct(.)))

This code leaves the

Mutate column as input to sample

旧城冷巷雨未停 submitted on 2021-01-29 12:14:22
Question: I want to create a distribution using the sample() function, with the probability of each value defined by a data.frame() column. When I try the code below, however, it produces the error:

    Error in sample.int(length(x), size, replace, prob) : incorrect number of probabilities

I'm guessing this is because mutate is passing the entire column and not just one integer. Can anyone help me with this?

    library(dplyr)
    input = data.frame(input = 1:100)
    output = input %>% mutate(output = sum( sample(0:1, 10,
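The guess in the question is consistent with how `mutate()` works: an expression sees the whole column as a vector, so `prob` ends up with 100 entries instead of 2. One way around this, sketched below, is to loop over the column with `vapply()` so that `sample()` is called once per row. The question's code is cut off before the `prob` argument, so treating `input` as the percentage chance of drawing a 1 is purely an assumption here.

```r
library(dplyr)

set.seed(1)  # only for reproducibility of this sketch
input <- data.frame(input = 1:100)

output <- input %>%
  mutate(output = vapply(
    input,  # inside mutate() this refers to the column, not the data frame
    function(p) {
      # Assumption: p is the percent chance of a 1, so weights are (100 - p, p).
      sum(sample(0:1, 10, replace = TRUE, prob = c(100 - p, p)))
    },
    integer(1)
  ))
```

Each row then holds the number of 1s out of 10 draws made with that row's own probability.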