Aggregate data frame while keeping original order, in a simple manner

囚心锁ツ 2021-02-15 12:38

I'm having some trouble aggregating a data frame while keeping the groups in their original order (order based on first appearance in the data frame). I've managed to get it right

4 answers
  •  甜味超标
    2021-02-15 13:11

    I'm not sure how this solution performs in terms of speed and memory on large datasets, but I think it is a fairly easy way to solve the problem.

    # Create dataframe
    x <- c("C", "C", "A", "A", "A","B", "B")
    y <- c(5, 6, 3, 2, 7, 8, 9)
    df <- data.frame(x, y)
    df
    

    Original dataframe:

      x y
    1 C 5
    2 C 6
    3 A 3
    4 A 2
    5 A 7
    6 B 8
    7 B 9
    

    Solution:

    # Add column with the original order
    df$order <- seq_len(nrow(df))
    
    # Aggregate
    # use sum for column y (the variable you want to aggregate by x)
    df1 <- aggregate(y ~ x, data = df, FUN = sum)
    # use mean for column 'order'
    df2 <- aggregate(order ~ x, data = df, FUN = mean)
    
    # Add the mean of order values to the dataframe
    df <- df1
    df$order <- df2$order
    
    # Order the dataframe according to the mean of order values
    df <- df[order(df$order),]
    df
    

    Aggregated dataframe with same order:

      x  y order
    3 C 11   1.5
    1 A 12   4.0
    2 B 17   6.5
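
    One caveat: the mean row position only matches first-appearance order when each group's rows are contiguous, as they are in this example. With x = A, B, B, A, A the mean positions are about 3.33 for A and 2.5 for B, so B would be sorted first even though A appears first. A minimal sketch of an alternative that avoids the helper column, relying on unique() returning values in order of first appearance; the helper name aggregate_in_order and its arguments are hypothetical and not part of the original answer:

    ```r
    # Hypothetical helper (not from the original answer): aggregate `value`
    # by `group` and return the groups in order of first appearance.
    aggregate_in_order <- function(data, group, value, FUN = sum) {
      # aggregate() returns one row per group, sorted by the grouping column
      agg <- aggregate(data[[value]], by = list(data[[group]]), FUN = FUN)
      names(agg) <- c(group, value)
      # unique() keeps first-appearance order; match() locates each of those
      # groups among the sorted rows of the aggregated result
      agg <- agg[match(unique(data[[group]]), agg[[group]]), ]
      rownames(agg) <- NULL
      agg
    }

    # Same data as in the example above: groups appear as C, A, B
    df <- data.frame(x = c("C", "C", "A", "A", "A", "B", "B"),
                     y = c(5, 6, 3, 2, 7, 8, 9))
    res <- aggregate_in_order(df, "x", "y")

    # Interleaved groups: A appears first, but its mean row position is larger
    df_mixed <- data.frame(x = c("A", "B", "B", "A", "A"),
                           y = c(1, 2, 3, 4, 5))
    res_mixed <- aggregate_in_order(df_mixed, "x", "y")
    ```

    Here res lists the groups as C, A, B with sums 11, 12, 17, and res_mixed keeps A before B. If you prefer to stay with the order column from the answer, using FUN = min instead of FUN = mean on it should give the same first-appearance ordering.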
    
