Merging data frames without duplicating rows

夕颜 2021-02-08 13:14

I would like to merge two data frames, but do not want to duplicate rows if there is more than one match. Instead I would like to sum the observations on that day.

1 Answer
  •  野的像风
    2021-02-08 13:24

    I'd suggest you merge them and then aggregate them (essentially perform a SUM for each unique Date).

    df <- merge(z.days, obs.days, by = "Date", all.x = TRUE)
            Date Count
    1 2012-01-01    NA
    2 2012-01-02     1
    3 2012-01-03     1
    4 2012-01-03     1
    5 2012-01-04    NA
    

    Now, to sum the counts for each date, you could use aggregate:

    df2 <- aggregate(df$Count,list(df$Date),sum)
         Group.1  x
    1 2012-01-01 NA
    2 2012-01-02  1
    3 2012-01-03  2
    4 2012-01-04 NA
    names(df2)<-names(df)
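
    The question does not show z.days or obs.days, so here is a self-contained sketch of the merge-then-aggregate steps with made-up data (the values below are assumptions chosen to reproduce the output above). It also shows a variant with na.rm = TRUE, in case days without observations should count as 0 rather than NA; whether that is wanted depends on your data.

    ```r
    # Hypothetical sample data; the real z.days / obs.days are not shown in the question.
    z.days <- data.frame(Date = as.Date("2012-01-01") + 0:3)
    obs.days <- data.frame(
      Date  = as.Date(c("2012-01-02", "2012-01-03", "2012-01-03")),
      Count = c(1, 1, 1)
    )

    # Left join: every date in z.days is kept; 2012-01-03 matches twice,
    # so it appears as two rows in df.
    df <- merge(z.days, obs.days, by = "Date", all.x = TRUE)

    # Collapse to one row per date by summing Count.
    # Dates with no observation stay NA because sum(NA) is NA.
    df2 <- aggregate(df$Count, by = list(Date = df$Date), FUN = sum)
    names(df2) <- c("Date", "Count")

    # Variant: treat dates with no observation as 0 instead of NA.
    df3 <- aggregate(df$Count, by = list(Date = df$Date), FUN = sum, na.rm = TRUE)
    names(df3) <- c("Date", "Count")
    ```

    Here the extra arguments after FUN are passed on to sum, which is how na.rm reaches it.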
    

    But I'd recommend the plyr package, which is awesome, and in particular its ddply function.

    library(plyr)
    ddply(df,.(Date),function(x) data.frame(Date=x$Date[1],Count=sum(x$Count)))
            Date Count
    1 2012-01-01    NA
    2 2012-01-02     1
    3 2012-01-03     2
    4 2012-01-04    NA
    

    The command ddply(df,.(Date),FUN) essentially does:

    for each date in unique(df$Date):
        add to output dataframe FUN( df[df$Date==date,] )
    

    So the function I've provided creates a one-row data frame with columns Date and Count, where Count is the sum of all counts for that date.
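
    To make the pseudocode above concrete, here is a rough base R sketch of the same split-apply-combine idea. It is not plyr's actual implementation, only an approximation of what ddply does for this case; the data frame is a hypothetical copy of the merged df shown earlier, and FUN is the same per-date function passed to ddply.

    ```r
    # Hypothetical data with the same shape as the merged df above.
    df <- data.frame(
      Date  = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03",
                        "2012-01-03", "2012-01-04")),
      Count = c(NA, 1, 1, 1, NA)
    )

    # Per-group function: returns one row with the date and the summed Count.
    FUN <- function(x) data.frame(Date = x$Date[1], Count = sum(x$Count))

    # Split: one sub-data-frame per unique date.
    # Apply: call FUN on each piece.
    pieces <- lapply(split(df, df$Date), FUN)

    # Combine: stack the one-row results into a single data frame.
    out <- do.call(rbind, pieces)
    rownames(out) <- NULL
    ```

    The result out has one row per date, with the two rows for 2012-01-03 collapsed into a single row.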
