combine two data frames and aggregate

☆樱花仙子☆ submitted on 2021-02-05 06:35:07

Question


I have two data frames in the following format:

dt1

id     col1    col2    col3    col4 
___    ____    ____    _____   _____
 1      2       3       1       2
 2      3       4       1       1
 3      1       1       1       1
 4      1       2       1       2
 5      1       1       1       1
 6      1       2       1       2

dt2 

id     col1    col2    col3    col4 
___    ____    ____    _____   _____
 1      1       3       1       2
 2      3       4       1       0
 4      1       1       1       1
 6      1       2       1       2
 9      2       1       1       1
12      1       2       1       2

I want to combine these two data frames and aggregate them by id (summing each column per id), so that the resulting data frame looks like this:

dt3

id     col1    col2    col3    col4
___    ____    ____    _____   _____
 1      3       6       2       4
 2      6       8       2       1
 3      1       1       1       1
 4      2       3       2       3
 5      1       1       1       1
 6      2       4       2       4
 9      2       1       1       1
12      1       2       1       2

I tried dt3 = merge(dt1, dt2, all = TRUE), but it did not work. I also tried dt3 = merge(dt1, dt2, by = id), which did not work either. Any help is appreciated.


Answer 1:


We can use rbindlist from data.table to stack the two data frames and then take the sum of each column after grouping by 'id':

library(data.table)
rbindlist(mget(paste0('dt', 1:2)))[, lapply(.SD, sum), by = id]
#    id col1 col2 col3 col4
#1:  1    3    6    2    4
#2:  2    6    8    2    1
#3:  3    1    1    1    1
#4:  4    2    3    2    3
#5:  5    1    1    1    1
#6:  6    2    4    2    4
#7:  9    2    1    1    1
#8: 12    1    2    1    2

Or use bind_rows with group_by and summarise_each from the tidyverse:

library(dplyr)
bind_rows(dt1, dt2) %>%
  group_by(id) %>%
  summarise_each(funs(sum))
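In recent dplyr releases, summarise_each() and funs() are deprecated. As a rough sketch of the same idea, assuming dplyr 1.0.0 or later is installed, the sum per id can be written with across(). The data frames are rebuilt here from the tables in the question so that the snippet runs on its own:

```r
library(dplyr)

# Sample data copied from the tables in the question
dt1 <- data.frame(
  id   = c(1L, 2L, 3L, 4L, 5L, 6L),
  col1 = c(2L, 3L, 1L, 1L, 1L, 1L),
  col2 = c(3L, 4L, 1L, 2L, 1L, 2L),
  col3 = c(1L, 1L, 1L, 1L, 1L, 1L),
  col4 = c(2L, 1L, 1L, 2L, 1L, 2L)
)
dt2 <- data.frame(
  id   = c(1L, 2L, 4L, 6L, 9L, 12L),
  col1 = c(1L, 3L, 1L, 1L, 2L, 1L),
  col2 = c(3L, 4L, 1L, 2L, 1L, 2L),
  col3 = c(1L, 1L, 1L, 1L, 1L, 1L),
  col4 = c(2L, 0L, 1L, 2L, 1L, 2L)
)

# Stack the rows, then sum every non-grouping column within each id
dt3 <- bind_rows(dt1, dt2) %>%
  group_by(id) %>%
  summarise(across(everything(), sum))

print(dt3)
```

Inside a grouped summarise(), across(everything(), ...) skips the grouping column, so id is not summed.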



Answer 2:


The magic word you're looking for is rbind: dt3 = rbind(dt1, dt2)




Answer 3:


Since they have the same format and the columns match up, stack them row by row.

dt3 <- data.frame(dt1)

dt3 <- rbind(dt3, dt2) # rbind lines up your observations row by row.

You could probably put that all in one line:

dt3 <- data.frame(rbind(dt1, dt2))
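Note that rbind on its own only stacks the rows, so an id that appears in both data frames still occupies two rows. One possible way to finish the job in base R, offered as a sketch rather than as part of the original answer, is to pass the stacked result to aggregate() with a formula. The variable name stacked is only illustrative, and the data are copied from the tables in the question:

```r
# Sample data copied from the tables in the question
dt1 <- data.frame(
  id   = c(1, 2, 3, 4, 5, 6),
  col1 = c(2, 3, 1, 1, 1, 1),
  col2 = c(3, 4, 1, 2, 1, 2),
  col3 = c(1, 1, 1, 1, 1, 1),
  col4 = c(2, 1, 1, 2, 1, 2)
)
dt2 <- data.frame(
  id   = c(1, 2, 4, 6, 9, 12),
  col1 = c(1, 3, 1, 1, 2, 1),
  col2 = c(3, 4, 1, 2, 1, 2),
  col3 = c(1, 1, 1, 1, 1, 1),
  col4 = c(2, 0, 1, 2, 1, 2)
)

# Step 1: stack the rows of both data frames (12 rows in total)
stacked <- rbind(dt1, dt2)

# Step 2: sum every other column within each id;
# ". ~ id" means "all remaining columns, grouped by id"
dt3 <- aggregate(. ~ id, data = stacked, FUN = sum)

print(dt3)
```

This needs no extra packages. Be aware that the formula interface of aggregate() drops rows containing NA by default, which does not matter for the sample data but may for real data.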




Answer 4:


Here is a dplyr solution:

library(dplyr)
bind_rows(dt1, dt2) %>% group_by(id) %>% 
  summarise_all(sum)

Data

dt1  <- structure(
  list(id = 1:6, col1 = c(2L, 3L, 1L, 1L, 1L, 1L), 
       col2 = c(3L, 4L, 1L, 2L, 1L, 2L), 
       col3 = c(1L, 1L, 1L, 1L, 1L, 1L), 
       col4 = c(2L, 1L, 1L, 2L, 1L, 2L)), 
  .Names = c("id", "col1", "col2", "col3",  "col4"), 
  class = "data.frame", row.names = c(NA, -6L))


dt2 <- structure(
  list(id = c(1L, 2L, 4L, 6L, 9L, 12L), 
       col1 = c(1L, 3L, 1L, 1L, 2L, 1L), 
       col2 = c(3L, 4L, 1L, 2L, 1L, 2L), 
       col3 = c(1L, 1L, 1L, 1L, 1L, 1L), 
       col4 = c(2L, 0L, 1L, 2L, 1L, 2L)), 
  .Names = c("id", "col1", "col2", "col3", "col4"), 
  class = "data.frame", row.names = c(NA, -6L))


Source: https://stackoverflow.com/questions/42966535/combine-two-data-frames-and-aggregate
