R data.frame: rowSums of selected columns by grouping vector

不羁岁月 提交于 2019-12-24 03:33:12

问题


I have a data frame with a sequence of numeric columns, surrounded on both sides by (irrelevant) columns of characters. I want to obtain a new data frame that keeps the position of the irrelevant columns, and adds the numeric columns to eachother by a certain grouping vector (or applies some other row-wise function to the data frame, by group). Example:

sample = data.frame(cha1 = c("A","B"),num1=1:2,num2=3:4,num3=11:12,num4=13:14,cha2=c("C","D"))
> sample
  cha1 num1 num2 num3 num4 cha2
1    A    1    3   11   13    C
2    B    2    4   12   14    D

with the goal to obtain

> goal
  cha1 X1 X2 cha2 
1    A  4 24    C
2    B  6 26    D

i.e. I've summed the 4 numeric columns according to the grouping vector gl(2,2,4) = (1,1,2,2) [levels: 1,2]

For a purely numeric data frame I've found the following method:

sample_num = sample[,2:5] #select numeric columns
data.frame(t(apply(sample_num,1,function(row) tapply(row, INDEX=gl(2,2,4),sum))))

I could combine this with re-inserting the character columns to give the intended result, but I'm really looking for a more elegant way. I'm particularly interested in a plyr method if there is one, as I'm trying to migrate to plyr for all my data frame manipulations. I imagine the first step would be to cast the data frame into long format, but I have no idea how to proceed from there.

One 'absolute' requirement is that I cannot do without the gl(n,k,l) method of grouping, as I need this to be applicable to a wide range of data frames and grouping factors.

EDIT: for simplicity assume that I know which columns are the relevant numeric columns. I'm not concerned with how to select them, I'm concerned with how to do my grouped sum without messing up the original data frame structure.

Thanks!


回答1:


Grpindex<-gl(2,2,4)    
goal<-cbind.data.frame(sample["cha1"],(t(rowsum(t(sample[,2:5]), paste0("X",Grpindex)))),sample["cha2"])

Output:

  cha1 X1 X2 cha2
1    A  4 24    C
2    B  6 26    D


来源:https://stackoverflow.com/questions/30777678/r-data-frame-rowsums-of-selected-columns-by-grouping-vector

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!