tidyverse

R pivot_wider to keep one id per row [duplicate]

流过昼夜 submitted on 2021-02-08 12:01:49
Question: This question already has answers here: Transpose / reshape dataframe without “timevar” from long to wide format (8 answers). Closed 7 months ago.

I have a dataset with IDs and values, where one ID can take multiple values. Currently, the same ID is repeated across rows when it has multiple values, but I would like to keep one ID per row, adding more columns when necessary. Here's a reproducible example:

    df <- data.frame(id = c(1,1,1,2,3,3), val = c(10:15))

What I want is:

    df2 <- data.frame(id = c(1:3
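One possible approach, as a minimal sketch: number the values within each id and spread that counter into columns with tidyr's pivot_wider(). The desired df2 is cut off in the excerpt, so the helper column k and the output names val1, val2, ... are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(id = c(1, 1, 1, 2, 3, 3), val = 10:15)

wide <- df %>%
  group_by(id) %>%
  mutate(k = row_number()) %>%  # assumed helper: position of each value within its id
  ungroup() %>%
  pivot_wider(names_from = k, values_from = val, names_prefix = "val")

print(wide)
```

IDs with fewer values than the widest ID are padded with NA in the extra columns.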

Remove duplicate rows based on multiple columns using dplyr / tidyverse?

折月煮酒 submitted on 2021-02-08 11:39:43
Question: I would like to remove duplicate rows based on more than one column using dplyr / tidyverse. Example:

    library(dplyr)
    df <- data.frame(a=c(1,1,1,2,2,2), b=c(1,2,1,2,1,2), stringsAsFactors = F)

I thought this would return rows 3 and 6, but it returns 0 rows.

    df %>% filter(duplicated(a, b))
    # [1] a b
    # <0 rows> (or 0-length row.names)

Conversely, I thought this would return rows 1, 2, 4 and 5, but it returns all rows.

    df %>% filter(!duplicated(a, b))
    #   a b
    # 1 1 1
    # 2 1 2
    # 3 1 1
    # 4 2 2
    # 5 2 1
    # 6 2 2

What
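A likely explanation, sketched under the assumption that base R's duplicated() is the function being called: its second positional argument is incomparables, so duplicated(a, b) treats the values of b as values that can never count as duplicates rather than as a second key column. Passing both columns as a single data frame, or using dplyr's distinct(), compares (a, b) pairs instead.

```r
library(dplyr)

df <- data.frame(a = c(1, 1, 1, 2, 2, 2), b = c(1, 2, 1, 2, 1, 2))

# Compare rows on the (a, b) pair by handing duplicated() one data frame.
dups <- df %>% filter(duplicated(data.frame(a, b)))

# Keep the first occurrence of each (a, b) pair.
uniq <- df %>% distinct(a, b)

print(dups)
print(uniq)
```

With this data, dups holds the repeated pairs from rows 3 and 6, and uniq holds the four distinct pairs.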

How can I modify this dplyr code for multiple linear regression by combination of all variables in R

坚强是说给别人听的谎言 submitted on 2021-02-08 10:21:40
Question: Let's say I have the following data:

    ind1 <- rnorm(99)
    ind2 <- rnorm(99)
    ind3 <- rnorm(99)
    ind4 <- rnorm(99)
    ind5 <- rnorm(99)
    dep <- rnorm(99, mean=ind1)
    group <- rep(c("A", "B", "C"), each=33)
    df <- data.frame(dep,group, ind1, ind2, ind3, ind4, ind5)

The following code calculates a multiple linear regression between the dependent variable and 2 independent variables by group, which is exactly what I want to do. But I want to regress the dep variable against every pairwise combination of the independent variables
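The asker's own code is cut off in the excerpt, so the following is only a sketch of one way to do this, not a modification of it: enumerate the predictor pairs with combn(), build each formula with reformulate(), and fit one model per group with group_modify() and broom's tidy(). The names pairs, results, and predictors are introduced here for illustration.

```r
library(dplyr)
library(broom)

set.seed(1)  # added for reproducibility; not in the original data setup
ind1 <- rnorm(99); ind2 <- rnorm(99); ind3 <- rnorm(99)
ind4 <- rnorm(99); ind5 <- rnorm(99)
dep <- rnorm(99, mean = ind1)
group <- rep(c("A", "B", "C"), each = 33)
df <- data.frame(dep, group, ind1, ind2, ind3, ind4, ind5)

# All 10 unordered pairs of the five independent variables.
pairs <- combn(paste0("ind", 1:5), 2, simplify = FALSE)

results <- bind_rows(lapply(pairs, function(p) {
  f <- reformulate(p, response = "dep")  # e.g. dep ~ ind1 + ind2
  df %>%
    group_by(group) %>%
    group_modify(~ tidy(lm(f, data = .x))) %>%  # one fit per group
    ungroup() %>%
    mutate(predictors = paste(p, collapse = " + "))
}))

print(head(results))
```

Each model contributes three coefficient rows (intercept plus two slopes), labelled by group and by the predictor pair.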

Tidyverse: Replacing entire strings based on partial matches

為{幸葍}努か submitted on 2021-02-08 08:33:27
Question: I'm looking to replace entire string entries within data based on partial matches using functions in the stringr package. The only method I've tried is replacing exact matches with str_replace_all(), but this becomes tedious and unwieldy when there are dozens of variations to correct for. In my reprex below, I replace variants of "Spaniard" and "Colombian" by direct specification. However, I would love to perform those replacements based
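One way this could look, as a minimal sketch: test each entry with str_detect() and overwrite the whole entry when a partial pattern matches, using dplyr's case_when() to chain the rules. The reprex is cut off in the excerpt, so the vector nationality and the patterns below are hypothetical stand-ins.

```r
library(dplyr)
library(stringr)

# Hypothetical input; the original reprex data is not shown in the excerpt.
nationality <- c("Spaniard", "spanish", "Spain ", "Colombian", "colombia", "French")

cleaned <- case_when(
  str_detect(nationality, regex("span|spain", ignore_case = TRUE)) ~ "Spaniard",
  str_detect(nationality, regex("colomb", ignore_case = TRUE)) ~ "Colombian",
  TRUE ~ nationality  # leave unmatched entries untouched
)

print(cleaned)
```

Rules are evaluated in order, so the first matching pattern decides the replacement for an entry.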

seasonal ggplot in R?

强颜欢笑 submitted on 2021-02-08 08:17:00
Question: I am looking at data from Nov to April and would like to have a plot that runs from Nov to April. Below is my sample code to filter the months of interest.

    library(tidyverse)
    library(lubridate)  # year(), month(), day() come from lubridate
    mydata = data.frame(seq(as.Date("2010-01-01"), to=as.Date("2011-12-31"),by="days"), A = runif(730,10,50))
    colnames(mydata) = c("Date", "A")
    DF = mydata %>%
      mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
      filter(Month == 11 | Month == 12 | Month == 01 | Month == 02 | Month == 03 | Month == 04)

I tried
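A sketch of one way to get a Nov-to-April axis, assuming a per-month summary plot is acceptable (the excerpt is cut off before the asker's own attempt): keep the season's months with %in% and turn the month into a factor whose levels follow the season rather than the calendar. The names season_months and MonthLabel, and the choice of a boxplot, are assumptions for illustration.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

set.seed(1)  # added for reproducibility
mydata <- data.frame(Date = seq(as.Date("2010-01-01"), as.Date("2011-12-31"), by = "day"))
mydata$A <- runif(nrow(mydata), 10, 50)

season_months <- c(11, 12, 1, 2, 3, 4)  # Nov through Apr, in plotting order

DF <- mydata %>%
  mutate(Month = month(Date)) %>%
  filter(Month %in% season_months) %>%
  # Factor levels follow the season, so the axis starts at Nov and ends at Apr.
  mutate(MonthLabel = factor(month.abb[Month], levels = month.abb[season_months]))

p <- ggplot(DF, aes(x = MonthLabel, y = A)) +
  geom_boxplot()
```

ggplot2 orders a discrete axis by factor levels, which is what moves Nov and Dec ahead of Jan.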

Combine dfs by common column importing selected columns in R

烈酒焚心 submitted on 2021-02-08 05:31:37
Question: I would like to merge data.frames by the common "names" column, but selecting only the "PA" columns.

    df <- data.frame(names=c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"), `S1`=c(1,2,2,0,1), `S2`=c(2,50,40,30,22), `S3`=c( 0,100,135,256,303), `S4`=c(0,10,17,73,74),check.names=FALSE)
    df2<- data.frame(names=c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"), AB=c(0,30,30,40,2), PA=c(2,4,5,6,7))
    df3<- data.frame(names=c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"), AB=c(100,300,300,400,200), PA=c(3,5,7,8,7))
    df4<- data
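A minimal sketch of one approach: select only names and PA from each secondary data frame before joining, renaming PA on the way so the columns stay distinguishable. Only df, df2, and df3 are used because df4 is cut off in the excerpt, and the output names PA_df2 and PA_df3 are assumptions.

```r
library(dplyr)

obs <- c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5")
df  <- data.frame(names = obs, S1 = c(1, 2, 2, 0, 1), S2 = c(2, 50, 40, 30, 22),
                  S3 = c(0, 100, 135, 256, 303), S4 = c(0, 10, 17, 73, 74),
                  check.names = FALSE)
df2 <- data.frame(names = obs, AB = c(0, 30, 30, 40, 2), PA = c(2, 4, 5, 6, 7))
df3 <- data.frame(names = obs, AB = c(100, 300, 300, 400, 200), PA = c(3, 5, 7, 8, 7))

merged <- df %>%
  left_join(select(df2, names, PA_df2 = PA), by = "names") %>%  # AB is dropped
  left_join(select(df3, names, PA_df3 = PA), by = "names")

print(merged)
```

select() can rename while selecting, which avoids the .x/.y suffixes a plain join on two PA columns would produce.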

Many regressions using tidyverse and broom: Same dependent variable, different independent variables

风格不统一 submitted on 2021-02-08 04:50:44
Question: This link shows how to answer my question in the case where we have the same independent variables but potentially many different dependent variables: Use broom and tidyverse to run regressions on different dependent variables. My question is how to apply the same approach (e.g., tidyverse and broom) to run many regressions in the reverse situation: the same dependent variable but different independent variables. In line with the code in the previous link, something like: mod
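The linked code and the asker's attempt are not visible in the excerpt, so this is only a sketch of the general pattern: loop over predictor names, build each formula with reformulate(), and stack the tidy() output. The built-in mtcars data, the response mpg, and the predictor list are hypothetical choices for illustration.

```r
library(dplyr)
library(broom)

# Hypothetical setup: one response (mpg), several candidate predictors.
predictors <- c("wt", "hp", "disp")

fits <- bind_rows(lapply(predictors, function(x) {
  model <- lm(reformulate(x, response = "mpg"), data = mtcars)  # mpg ~ x
  tidy(model) %>% mutate(predictor = x)  # tag rows with the predictor used
}))

print(fits)
```

Each single-predictor model yields two rows (intercept and slope), and the predictor column records which model a row came from.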
