tidyverse

How can I mutate multiple variables using dplyr?

旧城冷巷雨未停 submitted on 2021-01-28 08:15:52
Question: Given a tbl_df object df containing multiple variables (i.e. Var.50, Var.100, Var.150 and Var.200), each measured twice (i.e. P1 and P2), I want to mutate a new set of the same variables from the repeated measurements (for example, average P1 and P2, creating P3 for each corresponding variable). Similar questions have been asked before, but there do not seem to be clear answers using dplyr. Example data: df <- structure(list(P1.Var.50 = c(134.242050170898, 52.375, 177.126017252604 ), P1.Var.100 =

In R, use nonstandard evaluation to select specific variables from data.frames

非 Y 不嫁゛ submitted on 2021-01-28 06:21:51
Question: I've got several large-ish data.frames set up like a relational database, and I'd like to write a single function that looks up whatever variable I need, grabs it from the data.frame that holds it, and adds it to the data.frame I'm currently working on. I've got a way to do this that works, but it requires temporarily making a list of all the data.frames, which seems inefficient. I suspect that nonstandard evaluation would solve this problem for me, but I'm not sure how to do it. Here's what works

Cumulative aggregates within tidyverse

◇◆丶佛笑我妖孽 submitted on 2021-01-28 05:51:51
Question: Say I have a tibble (or data.table) which consists of two columns: a <- tibble(id = rep(c("A", "B"), each = 6), val = c(1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1)) Furthermore, I have a function called myfun which takes a numeric vector of arbitrary length as input and returns a single number. For example, you can think of myfun as being the standard deviation. Now I would like to add a third column to my tibble (called result) which contains the outputs of myfun applied to val cumulated and grouped
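The excerpt above is cut off, so the following is only a minimal sketch under stated assumptions: that the grouping variable is id, that "cumulated" means applying myfun to the first 1, 2, ..., n values of val within each group, and that mean is an acceptable stand-in for myfun (the name out is also hypothetical).

```r
library(dplyr)
library(purrr)

a <- tibble(id  = rep(c("A", "B"), each = 6),
            val = c(1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1))

# Stand-in for the asker's myfun (assumption): any function that maps a
# numeric vector to a single number would fit here.
myfun <- function(x) mean(x)

out <- a %>%
  group_by(id) %>%
  # For row i within a group, apply myfun to the first i values of val.
  mutate(result = map_dbl(seq_along(val),
                          function(i) myfun(val[seq_len(i)]))) %>%
  ungroup()
```

This recomputes myfun on a growing prefix for every row, so it is quadratic in group size; for functions with a known running form (such as the mean), a dedicated cumulative function would likely be cheaper.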

tidyverse - delete a column within a nested column/list

社会主义新天地 submitted on 2021-01-28 05:24:50
Question: I have the following data: (Note: I'm using the current GitHub version of dplyr within tidyverse, which offers some new experimental functions such as condense, which I'm using below, but I think that's not relevant for my problem/question). library(tidyverse) library(corrr) dat <- data.frame(grp = rep(1:4, each = 25), Q1 = sample(c(1:5, NA), 100, replace = TRUE), Q2 = sample(c(1:5, NA), 100, replace = TRUE), Q3 = sample(c(1:5, NA), 100, replace = TRUE), Q4 = sample(c(1:5, NA), 100, replace =

R How to Pass a function as a String Inside another Function

落花浮王杯 submitted on 2021-01-28 02:57:46
Question: Any assistance on this little conundrum would be mightily appreciated, thanks. I am trying to pass an argument to the tq_transmute function from the tidyquant package; the value for the argument is a function, but I would like to pass it as a string (outside the scope of the example below, I'll be passing it via a Shiny selectInput). I have tried every way I can think of to turn the string 'apply.quarterly' into the object apply.quarterly accepted by the mutate_fun argument. The commented

Using mutate over multiples columns with a for loop to recode values

末鹿安然 submitted on 2021-01-28 01:54:12
Question: I need to recode values over multiple columns of a data frame based on another (side) table. The values are geographic identifiers that I must replace with place names. So I decided to write a loop, but what works outside the loop no longer works inside it: I can't use mutate in a for loop. My real data contains 274 columns, 38 of which need recoding. These columns have many different names (they aren't called "places")

How to do cumulative filtering with `purrr::accumulate`?

|▌冷眼眸甩不掉的悲伤 submitted on 2021-01-28 01:51:09
Question: I'm looking for an approach to do something like this # this doesn't work # accumulate(1:8, ~filter(mtcars, carb >= .x)) so that I can examine some summary statistics at different cutoff values. I could simply do # this works but redundant filtering is done map2(list(mtcars), 1:8, ~filter(.x, carb >= .y)) But since my data is rather large, it doesn't make sense to filter out values that were already filtered out in the step just before. In essence, this just duplicates the original dataframe a
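One way the cumulative filtering described above might be written is to seed purrr::accumulate() with the full data through its .init argument, so that each step filters the previous, already reduced result. This is a sketch rather than the accepted answer; the names cutoffs and filtered are hypothetical.

```r
library(dplyr)
library(purrr)

cutoffs <- 1:8

# .init seeds the accumulation with the full data frame. At each step,
# .x is the previously filtered data frame and .y is the next cutoff.
filtered <- accumulate(cutoffs, ~ filter(.x, carb >= .y), .init = mtcars)

# Drop the unfiltered seed so that filtered[[i]] corresponds to cutoffs[i].
filtered <- filtered[-1]

# Number of rows remaining at each cutoff.
rows_left <- map_int(filtered, nrow)
```

This only gives the same result as filtering the original data at each cutoff because the cutoffs are increasing, so every later condition implies the earlier ones.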

Unclear warning when defining custom pipe operator

爱⌒轻易说出口 submitted on 2021-01-27 21:41:12
Question: In my process I need to perform many dplyr::inner_join() calls. I thought I might define a custom pipe operator for it, as explained here: library(tidyverse) library(rlang) df1 <- tibble(a = 1:10, b = 11:20) df2 <- tibble(a = 1:10, c = 21:30) `%J>%` <- function(lhs, rhs){ inner_join(lhs, rhs) } df1 %J>% df2 This works as expected and I get: Joining, by = "a" # A tibble: 10 x 3 a b c <int> <int> <int> 1 1 11 21 2 2 12 22 3 3 13 23 4 4 14 24 5 5 15 25 6 6 16 26 7 7 17 27 8 8 18 28 9 9 19 29 10 10 20 30

(R) Cleaner way to use map() with list-columns

纵饮孤独 submitted on 2021-01-27 18:31:08
Question: I am trying to move away from rowwise() for list-columns, as I have heard that the tidyverse team is in the process of axing it. However, I am not used to the purrr functions, so I feel like there must be a better way of doing the following: I create a list-column containing a tibble for each species. I then want to go into each tibble and take the mean of certain variables. The first case uses map and the second is the rowwise solution, which I personally feel is cleaner. Does anyone know a

Reorder factors by increasing frequency

亡梦爱人 submitted on 2021-01-27 11:51:37
Question: How do I reorder factor-valued columns by frequency, in increasing order? While the forcats package provides an explicit way to reorder a factor based on its frequency (fct_infreq()), it does so in decreasing frequency order. I need the reverse order of the factor frequency/counts. E.g. library(forcats) set.seed(555) df <- data.frame(x=factor(sample(as.character(1:10), 100, replace=TRUE))) table(df$x) 1 10 2 3 4 5 6 7 8 9 9 10 12 14 10 10 5 12 8 10 levels(fct_infreq(df$x)) [1] "3" "2" "7"
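A possible approach to the increasing-frequency ordering asked about above is to reverse the levels that fct_infreq() produces using forcats::fct_rev(). This is a minimal sketch built on the question's example data; the names x_increasing and counts are hypothetical.

```r
library(forcats)

set.seed(555)
df <- data.frame(x = factor(sample(as.character(1:10), 100, replace = TRUE)))

# fct_infreq() orders levels by decreasing frequency;
# fct_rev() reverses the level order, giving increasing frequency.
x_increasing <- fct_rev(fct_infreq(df$x))

# Counts tabulated in the new level order, from rarest to most common.
counts <- table(x_increasing)
```

Reversing the levels also reverses the order in which tied levels appear, which may or may not matter for a given plot or table.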