Question
I'm new to R and need to summarize the mean and SD of my data in a new table.
The raw data look like this:
ID Day pH
1 1 7
1 1 7.2
1 1 7.1
2 1 7.3
2 1 7.4
2 1 7.2
3 1 7
3 1 7.1
3 1 7.5
4 1 7.3
4 1 7.2
4 1 7.6
1 2 7
1 2 7.2
1 2 7.1
2 2 7.1
2 2 7.4
2 2 7.2
3 2 7.5
3 2 7.1
3 2 7.5
4 2 7.2
4 2 7.2
4 2 7.3
1 3 7.4
1 3 7.2
1 3 7.1
2 3 7.2
2 3 7.4
2 3 7.2
3 3 7.4
3 3 7.2
3 3 7.5
4 3 7.4
4 3 7.2
4 3 7.7
And the table I want should look like:
ID Day pHmean pHsd
1 1 7.1 0.10
2 1 7.3 0.10
3 1 7.2 0.26
4 1 7.4 0.21
1 2 7.1 0.10
2 2 7.2 0.15
3 2 7.4 0.23
4 2 7.2 0.06
1 3 7.2 0.15
2 3 7.3 0.12
3 3 7.4 0.15
4 3 7.4 0.25
Then I want to create a barplot with error bars, showing the pH value on the y-axis and the ID on the x-axis, with the days as differently coloured bars.
Hope someone can help me!
Answer 1:
Posted as an answer because there was some discussion about whether this works (the difference could be down to the R version or something similar):
aggregate(pH~ID+Day, dat, function(x) round(c(mean=mean(x), sd=sd(x)), 2))
## > aggregate(pH~ID+Day, dat, function(x) round(c(mean=mean(x), sd=sd(x)), 2))
## ID Day pH.mean pH.sd
## 1 1 1 7.10 0.10
## 2 2 1 7.30 0.10
## 3 3 1 7.20 0.26
## 4 4 1 7.37 0.21
## 5 1 2 7.10 0.10
## 6 2 2 7.23 0.15
## 7 3 2 7.37 0.23
## 8 4 2 7.23 0.06
## 9 1 3 7.23 0.15
## 10 2 3 7.27 0.12
## 11 3 3 7.37 0.15
## 12 4 3 7.43 0.25
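One thing to be aware of with this approach: because the function returns a vector of two values, aggregate() stores them as a single matrix column called pH, so the result has three columns rather than four, even though it prints as pH.mean and pH.sd. A minimal sketch of flattening it into ordinary columns follows; the data frame is rebuilt inline from the question's values, and the names res and flat are just illustrative.

```r
# Rebuild the question's data (12 ID/Day groups, 3 readings each)
dat <- data.frame(
  ID  = rep(rep(1:4, each = 3), times = 3),
  Day = rep(1:3, each = 12),
  pH  = c(7.0, 7.2, 7.1, 7.3, 7.4, 7.2, 7.0, 7.1, 7.5, 7.3, 7.2, 7.6,
          7.0, 7.2, 7.1, 7.1, 7.4, 7.2, 7.5, 7.1, 7.5, 7.2, 7.2, 7.3,
          7.4, 7.2, 7.1, 7.2, 7.4, 7.2, 7.4, 7.2, 7.5, 7.4, 7.2, 7.7)
)

res <- aggregate(pH ~ ID + Day, data = dat,
                 FUN = function(x) round(c(mean = mean(x), sd = sd(x)), 2))

# res$pH is a matrix with columns "mean" and "sd", so res has 3 columns
ncol(res)

# Expand the matrix column into separate data frame columns
flat <- do.call(data.frame, res)
names(flat)  # "ID" "Day" "pH.mean" "pH.sd"
```

After flattening, flat$pH.mean and flat$pH.sd can be used like any other column, for example when plotting.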
Answer 2:
I suggest using the aggregate function like so...
pHmean <- aggregate( pH ~ Day + ID , data = dat , FUN = mean )[,3]
dat <- cbind( aggregate( pH ~ Day + ID , data = dat , FUN = sd ) , pHmean )
dat
Day ID pH pHmean
1 1 1 0.10000000 7.100000
2 2 1 0.10000000 7.100000
3 3 1 0.15275252 7.233333
4 1 2 0.10000000 7.300000
5 2 2 0.15275252 7.233333
6 3 2 0.11547005 7.266667
7 1 3 0.26457513 7.200000
8 2 3 0.23094011 7.366667
9 3 3 0.15275252 7.366667
10 1 4 0.20816660 7.366667
11 2 4 0.05773503 7.233333
12 3 4 NA 7.400000
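A variant of the same idea, as a rough sketch: compute the two summaries separately, give each an explicit column name, and join them with merge() so the original data frame is not overwritten. The data frame is rebuilt inline from the question's values; the names m, s and summ are only illustrative.

```r
# Rebuild the question's data (12 ID/Day groups, 3 readings each)
dat <- data.frame(
  ID  = rep(rep(1:4, each = 3), times = 3),
  Day = rep(1:3, each = 12),
  pH  = c(7.0, 7.2, 7.1, 7.3, 7.4, 7.2, 7.0, 7.1, 7.5, 7.3, 7.2, 7.6,
          7.0, 7.2, 7.1, 7.1, 7.4, 7.2, 7.5, 7.1, 7.5, 7.2, 7.2, 7.3,
          7.4, 7.2, 7.1, 7.2, 7.4, 7.2, 7.4, 7.2, 7.5, 7.4, 7.2, 7.7)
)

# One summary per statistic, each with a descriptive column name
m <- aggregate(pH ~ ID + Day, data = dat, FUN = mean)
s <- aggregate(pH ~ ID + Day, data = dat, FUN = sd)
names(m)[3] <- "pHmean"
names(s)[3] <- "pHsd"

# Join on the grouping columns and sort by Day, then ID
summ <- merge(m, s, by = c("ID", "Day"))
summ <- summ[order(summ$Day, summ$ID), ]
summ
```

Joining on ID and Day, rather than binding columns side by side, means the two summaries do not have to be in the same row order.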
Answer 3:
For the values you can use the package plyr:
x
ID Day pH
1 1 7
1 1 7.2
1 1 7.1
2 1 7.3
2 1 7.4
2 1 7.2
3 1 7
3 1 7.1
3 1 7.5
4 1 7.3
4 1 7.2
4 1 7.6
1 2 7
1 2 7.2
1 2 7.1
2 2 7.1
2 2 7.4
2 2 7.2
3 2 7.5
3 2 7.1
3 2 7.5
4 2 7.2
4 2 7.2
4 2 7.3
1 3 7.4
1 3 7.2
1 3 7.1
2 3 7.2
2 3 7.4
2 3 7.2
3 3 7.4
3 3 7.2
3 3 7.5
4 3 7.4
4 3 7.2
4 3 7.7
library(plyr)
# Mean and SD of pH for each ID/Day combination
d1 <- ddply(x, .(ID, Day), summarize, phMean=mean(pH), pHsd=sd(pH))
# Wide format: one row per ID, with a mean and an SD column for each Day
d2 <- reshape(d1, v.names=c("phMean", "pHsd"), idvar="ID", timevar="Day", direction="wide")
rownames(d2) <- d2[,1]
d2 <- t(d2[,-1])
# After transposing, odd rows hold the means and even rows hold the SDs
means <- d2[(1:nrow(d2))%%2>0.5,]
sds <- d2[(1:nrow(d2))%%2<0.5,]
library(gplots)
barplot2(means, beside=TRUE, plot.ci=TRUE, ci.l=means-sds, ci.u=means+sds)
Answer 4:
Thanks for all your comments and answers, they were very useful!
I have already managed to make the graph using:
barplot(matrix(c(Rtest.dat$pH.mean), nrow=3), beside=TRUE, col=c("black","grey","white"), main="pH", names.arg=c("Green", "Yellow", "Blue", "Red"), ylab="pH")
legend("topright", c("Day 1","Day 2","Day 3"), cex=0.6, bty="n", fill=c("black","grey","white"))
But I got stuck and have no clue how to add the error bars. I looked online but couldn't figure it out. Hope you can help me!
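One common way to do this in base R, sketched here under some assumptions: barplot() returns the x positions of the bar midpoints, and arrows() with angle = 90 and code = 3 draws a flat cap at both ends, which works as an error bar. The data frame is rebuilt inline from the question's values, the bars show mean plus or minus one SD, and the grey fill colours are my own choice so that the lower half of each error bar stays visible.

```r
# Rebuild the question's data (12 ID/Day groups, 3 readings each)
dat <- data.frame(
  ID  = rep(rep(1:4, each = 3), times = 3),
  Day = rep(1:3, each = 12),
  pH  = c(7.0, 7.2, 7.1, 7.3, 7.4, 7.2, 7.0, 7.1, 7.5, 7.3, 7.2, 7.6,
          7.0, 7.2, 7.1, 7.1, 7.4, 7.2, 7.5, 7.1, 7.5, 7.2, 7.2, 7.3,
          7.4, 7.2, 7.1, 7.2, 7.4, 7.2, 7.4, 7.2, 7.5, 7.4, 7.2, 7.7)
)

# 3 x 4 matrices: rows are days, columns are IDs
means <- tapply(dat$pH, list(dat$Day, dat$ID), mean)
sds   <- tapply(dat$pH, list(dat$Day, dat$ID), sd)

cols <- gray.colors(3)

# barplot() returns the bar midpoints, in the same 3 x 4 layout as means
bp <- barplot(means, beside = TRUE, col = cols,
              ylim = c(0, max(means + sds) * 1.2),
              main = "pH", xlab = "ID", ylab = "pH")

# Vertical segments from mean - SD to mean + SD, with flat caps at both ends
arrows(x0 = bp, y0 = means - sds, x1 = bp, y1 = means + sds,
       angle = 90, code = 3, length = 0.03)

legend("topright", legend = paste("Day", rownames(means)),
       fill = cols, cex = 0.6, bty = "n")
```

The same arrows() call should work with the barplot() call above, as long as the means and SDs passed to it are matrices laid out the same way as the heights given to barplot().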
Source: https://stackoverflow.com/questions/15927193/calculate-mean-and-sd-by-id-and-day-within-a-column