mutate | 易学教程

Calculate all the absolute differences between 6 columns of a table using mutate? [duplicate]

Question: This question already has answers here: Pairwise subtraction in a dataframe R (2 answers). Closed 7 months ago. I have a table with 6 columns, Z1 to Z6, and I want to calculate the absolute value of the difference between each pair of these columns. So far, I enumerate all the differences in a mutate command: FactArray <- FactArray %>% mutate(diff12 = abs(Z1-Z2), diff13 = abs(Z1-Z3), diff14 = abs(Z1-Z4), diff15 = abs(Z1-Z5), diff16 = abs(Z1-Z6), diff23 = abs(Z2-Z3), diff24 = abs(Z2-Z4), diff25 =
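One way to avoid typing out all 15 pairs is to generate them with `combn()`. The sketch below uses base R only; the sample values in `FactArray` are made up for illustration, since the original data is not shown, and the `diffXY` naming simply mirrors the names used in the question.

```r
# Hypothetical example data; the real FactArray is not shown in the question.
FactArray <- data.frame(Z1 = c(1, 5), Z2 = c(4, 2), Z3 = c(0, 9),
                        Z4 = c(3, 3), Z5 = c(7, 1), Z6 = c(2, 8))

z_cols <- paste0("Z", 1:6)

# All unordered pairs of column names: (Z1, Z2), (Z1, Z3), ..., (Z5, Z6).
pairs <- combn(z_cols, 2, simplify = FALSE)

# Absolute difference for each pair, computed column-wise.
diffs <- lapply(pairs, function(p) abs(FactArray[[p[1]]] - FactArray[[p[2]]]))

# Name each result diff12, diff13, ... to match the question's convention.
names(diffs) <- vapply(
  pairs,
  function(p) paste0("diff", sub("Z", "", p[1]), sub("Z", "", p[2])),
  character(1)
)

# Append the 15 new columns to the original table.
FactArray <- cbind(FactArray, as.data.frame(diffs))
```

The result keeps the six original columns and adds one column per pair, so the same code should carry over to a different number of Z columns by changing `z_cols`.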

Add column to data frame based on long list and values in another column is too slow

Question: I am adding a new column to a data frame using apply() and mutate. It works, but unfortunately it is very slow. I have 24M rows, and I am adding the column based on values in a long list (58 items). It was bearable with a smaller list, but not anymore. Here is my example: large_df <-data.frame(A=(1:4), B= c('a','b','c','d'), C= c('e','f','g','h')) long_list = c('e','f','g') large_df =mutate (large_df, new_C = apply(large_df[,2:3], 1, function(r) any(r %in% long_list))) The new column (new_C) will read True or
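The row-wise `apply()` call is the likely bottleneck, because it invokes an R function once per row. A possible alternative, sketched here in base R on the question's own example data, is to test each column against the list in one vectorized call and combine the results with `|`. Whether this is fast enough for 24M rows is not something the example can show.

```r
# Example data from the question.
large_df <- data.frame(A = 1:4,
                       B = c("a", "b", "c", "d"),
                       C = c("e", "f", "g", "h"))
long_list <- c("e", "f", "g")

# Vectorized membership test per column, combined with element-wise OR.
# The same right-hand side could be used inside dplyr::mutate().
large_df$new_C <- large_df$B %in% long_list | large_df$C %in% long_list

# Equivalent form that scales to any number of columns.
new_C_general <- Reduce(`|`, lapply(large_df[2:3], `%in%`, long_list))
```

Both forms run one `%in%` lookup per column rather than one per row, which is where the saving would come from.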

R mutate multiple columns with if statement

Question: I have data like this: cols <- c("X01_01","X01_01_p", "X01_02","X01_02_p", "X01_03","X01_03_p", "X01_04", "X01_05","X01_06") set.seed(111) values <- replicate(9, sample(1:5, 4, replace = TRUE)) df <- as.data.frame(values) So my df looks like this:

  X01_01 X01_01_p X01_02 X01_02_p X01_03 X01_03_p X01_04 X01_05 X01_06
1      3        2      3        1      1        3      5      4      3
2      4        3      1        1      5        2      2      3      3
3      2        1      3        1      2        2      4      1      2
4      3        3      3        3      4        2      2      3      4

I have some columns to use for mutation (not all) and the names of the new columns. cols_to_mutate <

using mutate_at with the in operator %in%

Question: I have a data frame with a few variables to reverse code, and a separate vector that names all the variables to reverse code. I'd like to use mutate_at(), or some other tidy way, to reverse code them all in one line of code. Here's the dataset and the vector of items to reverse: library(tidyverse) mock_data <- tibble(id = 1:5, item_1 = c(1, 5, 3, 5, 5), item_2 = c(4, 4, 4, 1, 1), item_3 = c(5, 5, 5, 5, 1)) reverse <- c("item_2", "item_3") Here's what I want it to look like with only items 2
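A minimal sketch of reverse coding only the columns named in `reverse` follows. It assumes the items are scored on a 1 to 5 scale (the question does not state the range, and its expected output is cut off), so reversing is taken to mean `6 - x`. It uses a base `data.frame` instead of a tibble to stay dependency-free; `scale_max` is a name introduced here for illustration.

```r
# Data from the question, as a base data.frame rather than a tibble.
mock_data <- data.frame(id = 1:5,
                        item_1 = c(1, 5, 3, 5, 5),
                        item_2 = c(4, 4, 4, 1, 1),
                        item_3 = c(5, 5, 5, 5, 1))
reverse <- c("item_2", "item_3")

# Assumption: items are scored 1-5, so the reversed score is 6 - x.
scale_max <- 5

# Apply the reversal only to the columns listed in `reverse`.
mock_data[reverse] <- lapply(mock_data[reverse],
                             function(x) (scale_max + 1) - x)
```

With dplyr loaded, `mutate_at(mock_data, reverse, ~ 6 - .)` or `mutate(mock_data, across(all_of(reverse), ~ 6 - .x))` should give the same result under the same scale assumption.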

How to get value of last non-NA column [duplicate]

Question: This question already has answers here: Extract last non-missing value in row with data.table (5 answers). Closed 7 months ago. A bit difficult to explain, but I have a data frame with values that look like a staircase: for every date, different columns are NA. I want to create a new column that holds the value of the last non-NA column. Hopefully it makes more sense with this example. Sample data frame: test <- data.frame("date" = c(as.Date("2020-01-01"), as.Date(

Conditionally replace values of multiple columns, from values of other multiple columns

Question: Suppose I have this dataset: set.seed (1234); data.frame(cbind(a=rep(c("si","no"),30),b=rnorm(60)), c=rep(c("d","e","f"),20)) %>% head() Then I want to add many columns (in this example I only added two) to identify distinct cases within each group (in this case, grouped by column "a"). set.seed(1234); data.frame(cbind(a=rep(c("si","no"),30),b=rnorm(60)),c=rep(c("d","e","f"),20)) %>% group_by(a) %>% dplyr::mutate_at(vars(c(b,c)), .funs= list(dups_hash_ing= ~n_distinct(.))) This code leaves the

Mutate column as input to sample

Question: I want to create a distribution using the sample() function, with the probability of each value defined by a data.frame() column. When I try the code below, however, it produces the error: Error in sample.int(length(x), size, replace, prob) : incorrect number of probabilities I'm guessing this is because mutate is passing the entire column and not just one integer. Can anyone help me with this? library(dplyr) input = data.frame(input = 1:100) output = input %>% mutate(output = sum( sample(0:1, 10,