tidyverse | 易学教程

R pivot_wider to keep one id per row [duplicate]

Question: This question already has answers here: Transpose / reshape dataframe without "timevar" from long to wide format (8 answers). Closed 7 months ago.
I have a dataset with IDs and values, where one ID can have multiple values. Currently, an ID is repeated across rows when it has multiple values, but I want to keep one ID per row, adding more columns when necessary. Here's a reproducible example:
    df <- data.frame(id = c(1,1,1,2,3,3), val = c(10:15))
What I want is:
    df2 <- data.frame(id = c(1:3

Remove duplicate rows based on multiple columns using dplyr / tidyverse?

Question: I would like to remove duplicate rows based on more than one column using dplyr / tidyverse. Example:
    library(dplyr)
    df <- data.frame(a=c(1,1,1,2,2,2), b=c(1,2,1,2,1,2), stringsAsFactors = F)
I thought this would return rows 3 and 6, but it returns 0 rows.
    df %>% filter(duplicated(a, b))
    # [1] a b
    # <0 rows> (or 0-length row.names)
Conversely, I thought this would return rows 1, 2, 4 and 5, but it returns all rows.
    df %>% filter(!duplicated(a, b))
    #   a b
    # 1 1 1
    # 2 1 2
    # 3 1 1
    # 4 2 2
    # 5 2 1
    # 6 2 2
What

How can I modify these dplyr code for multiple linear regression by combination of all variables in R

Question: Let's say I have the following data:
    ind1 <- rnorm(99)
    ind2 <- rnorm(99)
    ind3 <- rnorm(99)
    ind4 <- rnorm(99)
    ind5 <- rnorm(99)
    dep <- rnorm(99, mean=ind1)
    group <- rep(c("A", "B", "C"), each=33)
    df <- data.frame(dep, group, ind1, ind2, ind3, ind4, ind5)
The following code calculates a multiple linear regression between the dependent variable and 2 independent variables by group, which is exactly what I want to do. But I want to regress the dep variable against all pairwise combinations of independent variables

Tidyverse: Replacing entire strings based on partial matches

Question: I'm looking to replace entire string entries within data based on partial matches, using functions in the stringr package. The only method I've tried is replacing exact matches with str_replace_all(), but this becomes tedious and unwieldy when there are dozens of variations to correct. In my reprex below, I replace variants of "Spaniard" and "Colombian" by direct specification. However, I would love to perform those replacements based

seasonal ggplot in R?

Question: I am looking at data from Nov to April and would like a plot that runs from Nov to April. Below is my sample code to filter for the months of interest.
    library(tidyverse)
    library(lubridate)  # year(), month(), day(); attached by library(tidyverse) only from tidyverse 2.0.0
    mydata = data.frame(seq(as.Date("2010-01-01"), to=as.Date("2011-12-31"), by="days"), A = runif(730,10,50))
    colnames(mydata) = c("Date", "A")
    DF = mydata %>%
      mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
      filter(Month == 11 | Month == 12 | Month == 01 | Month == 02 | Month == 03 | Month == 04)
I tried

Combine dfs by common column importing selected columns in R

Question: I would like to merge data.frames by their common "names" column, keeping only the "PA" columns.
    df <- data.frame(names=c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"), `S1`=c(1,2,2,0,1), `S2`=c(2,50,40,30,22), `S3`=c(0,100,135,256,303), `S4`=c(0,10,17,73,74), check.names=FALSE)
    df2 <- data.frame(names=c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"), AB=c(0,30,30,40,2), PA=c(2,4,5,6,7))
    df3 <- data.frame(names=c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"), AB=c(100,300,300,400,200), PA=c(3,5,7,8,7))
    df4<- data

Many regressions using tidyverse and broom: Same dependent variable, different independent variables

Question: This link shows how to answer my question in the case where we have the same independent variables but potentially many different dependent variables: Use broom and tidyverse to run regressions on different dependent variables. My question is how to apply the same approach (i.e., tidyverse and broom) to run many regressions in the reverse situation: the same dependent variable but different independent variables. In line with the code in the previous link, something like: mod
