Hi experienced R users,
It's kind of a simple thing. I want to sum x by Group.1 depending on one controllable variable.
I'd like
If you want to sum only a subset of your data:
my_data <- data.frame(c("TRUE","FALSE","FALSE","FALSE","TRUE"), c(1,2,3,4,5))
names(my_data)[1] <- "DESCRIPTION" #Change Column Name
names(my_data)[2] <- "NUMBER" #Change Column Name
sum(subset(my_data, my_data$DESCRIPTION=="TRUE")$NUMBER)
You should get 6.
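If you control how the data is built, the same filter can be written with a real logical column instead of the strings "TRUE" and "FALSE". This is only a sketch; the name my_data2 is hypothetical and not part of the original data.

```r
# Hypothetical variant of my_data that stores logicals rather than strings
my_data2 <- data.frame(DESCRIPTION = c(TRUE, FALSE, FALSE, FALSE, TRUE),
                       NUMBER = c(1, 2, 3, 4, 5))

# A logical column can be used directly as the element filter
total <- sum(my_data2$NUMBER[my_data2$DESCRIPTION])
total  # 6
```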
Not sure why Eggs are important here ;)
df1 <- data.frame(Gr=seq(4),
x=c(230299, 263066, 266504, 177196)
)
Now with n=2, i.e. the first two rows:
n <- 2
sum(df1[, "x"][df1[, "Gr"]<=n])
The expression [df1[, "Gr"]<=n] creates a logical vector to subset the elements in df1[, "x"] before summing them.
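To see what that logical vector looks like, the steps can be split up. A small sketch using the same df1; the intermediate name keep is only for illustration.

```r
df1 <- data.frame(Gr = seq(4),
                  x = c(230299, 263066, 266504, 177196))
n <- 2

# Step 1: the comparison yields one TRUE/FALSE per row
keep <- df1[, "Gr"] <= n
keep                    # TRUE TRUE FALSE FALSE

# Step 2: the logical vector picks the matching elements of x
picked <- df1[, "x"][keep]
picked                  # 230299 263066

# Step 3: sum what was picked
sum(picked)             # 493365
```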
Also, it appears your Group.1 is the same as the row number. If so, this may be simpler:
sum(df1[, "x"][1:n])
Or, to get all the cumulative sums at once:
cumsum(df1[, "x"])
Assuming your data is in mydata:
with(mydata, sum(x[Group.1 <= 2]))
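Since the cutoff is meant to be controllable, the with() call could be wrapped in a small function. This is a sketch only: sum_up_to is a hypothetical helper, and the mydata below is an assumed stand-in for the question's data.

```r
# Assumed stand-in for the question's data
mydata <- data.frame(Group.1 = 1:4,
                     x = c(230299, 263066, 266504, 177196))

# Hypothetical helper: sum x over all rows whose Group.1 is at most n
sum_up_to <- function(data, n) {
  with(data, sum(x[Group.1 <= n]))
}

sum_up_to(mydata, 2)  # 493365
sum_up_to(mydata, 3)  # 759869
```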
If the sums you want are always cumulative, there's a function for that, cumsum. It works like this.
> cumsum(c(1,2,3))
[1] 1 3 6
In this case you might want something like
> mysum <- cumsum(yourdata$x)
> mysum[2] # the sum of the first two rows
> mysum[3] # the sum of the first three rows
> mysum[number] # the sum of the first "number" rows
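One thing to watch: cumsum follows row order, so "the first two rows" only means "groups 1 and 2" if the rows are sorted by group. A minimal sketch, assuming a made-up yourdata whose rows are out of order:

```r
# Made-up data with rows deliberately out of order
yourdata <- data.frame(Group.1 = c(3, 1, 4, 2),
                       x = c(266504, 230299, 177196, 263066))

# Sort by the grouping column first, then accumulate
sorted <- yourdata[order(yourdata$Group.1), ]
mysum <- cumsum(sorted$x)

mysum[2]  # 493365, the sum for Group.1 <= 2
```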
You could use the by function.
For instance, given the following data.frame:
d <- data.frame(Group.1=c(1,1,2,1,3,3,1,3),Group.2=c('Eggs'),x=1:8)
> d
  Group.1 Group.2 x
1       1    Eggs 1
2       1    Eggs 2
3       2    Eggs 3
4       1    Eggs 4
5       3    Eggs 5
6       3    Eggs 6
7       1    Eggs 7
8       3    Eggs 8
You can do this:
num <- 3 # sum only the first 3 rows
# The aggregation function:
# it is called for each group receiving the
# data.frame subset as input and returns the aggregated row
innerFunc <- function(subDf){
  # we create the aggregated row by taking the first row of the subset
  row <- head(subDf,1)
  # we set the x column in the result row to the sum of the first "num"
  # elements of the subset
  row$x <- sum(head(subDf$x,num))
  return(row)
}
# Here we call the "by" function:
# it returns an object of class "by" that is a list of the resulting
# aggregated rows; we want to convert it to a data.frame, so we call
# rbind repeatedly by using "do.call(rbind, ... )"
d2 <- do.call(rbind,by(data=d,INDICES=d$Group.1,FUN=innerFunc))
> d2
  Group.1 Group.2  x
1       1    Eggs  7
2       2    Eggs  3
3       3    Eggs 19
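If you only need the sums and not the other columns, tapply might be a shorter route to the same numbers. A sketch on the same d, not a replacement for the by approach above; the result is a named vector rather than a data.frame.

```r
d <- data.frame(Group.1 = c(1, 1, 2, 1, 3, 3, 1, 3),
                Group.2 = "Eggs",
                x = 1:8)
num <- 3  # sum only the first 3 rows of each group

# Split x by Group.1 and sum the first "num" values of each piece
sums <- tapply(d$x, d$Group.1, function(v) sum(head(v, num)))
sums  # groups 1, 2, 3 give 7, 3, 19
```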