Calculate mean and sd by ID and Day within a column

Posted by 风格不统一 on 2019-12-24 20:25:56

Question


I'm new to R and need to summarize the mean and sd of my data in a new table.

The raw data look like this:

ID    Day    pH
1      1      7
1      1      7.2
1      1      7.1
2      1      7.3
2      1      7.4
2      1      7.2
3      1      7
3      1      7.1
3      1      7.5
4      1      7.3
4      1      7.2
4      1      7.6
1      2      7
1      2      7.2
1      2      7.1
2      2      7.1
2      2      7.4
2      2      7.2
3      2      7.5
3      2      7.1
3      2      7.5
4      2      7.2
4      2      7.2
4      2      7.3
1      3      7.4
1      3      7.2
1      3      7.1
2      3      7.2
2      3      7.4
2      3      7.2
3      3      7.4
3      3      7.2
3      3      7.5
4      3      7.4
4      3      7.2
4      3      7.7

And the table I want should look like:

ID    Day    pHmean   pHsd
1      1      7.1      0.10
2      1      7.3      0.10
3      1      7.2      0.26
4      1      7.4      0.21
1      2      7.1      0.10
2      2      7.2      0.15
3      2      7.4      0.23
4      2      7.2      0.06
1      3      7.2      0.15
2      3      7.3      0.12
3      3      7.4      0.15
4      3      7.4      0.25

I then want to create a barplot with error bars, showing the pH value on the y-axis and the ID on the x-axis, with the days as differently coloured bars.

Hope someone can help me!


Answer 1:


Posting this as an answer because there was some discussion about whether it works (the result may depend on the R version or something similar):

aggregate(pH~ID+Day, dat, function(x) round(c(mean=mean(x), sd=sd(x)), 2))

## > aggregate(pH~ID+Day, dat, function(x) round(c(mean=mean(x), sd=sd(x)), 2))
##    ID Day pH.mean pH.sd
## 1   1   1    7.10  0.10
## 2   2   1    7.30  0.10
## 3   3   1    7.20  0.26
## 4   4   1    7.37  0.21
## 5   1   2    7.10  0.10
## 6   2   2    7.23  0.15
## 7   3   2    7.37  0.23
## 8   4   2    7.23  0.06
## 9   1   3    7.23  0.15
## 10  2   3    7.27  0.12
## 11  3   3    7.37  0.15
## 12  4   3    7.43  0.25
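
One thing to be aware of with this approach: when the function passed to aggregate() returns a vector, the result is stored as a single matrix column named pH, so res$pH.mean is NULL even though the printout shows pH.mean and pH.sd. A minimal sketch of flattening it into ordinary columns, assuming the question's data is in a data frame called dat (only the Day 1 rows are rebuilt here to keep it short; res and flat are illustrative names):

```r
# Day 1 rows from the question, rebuilt by hand for illustration
dat <- data.frame(
  ID  = rep(1:4, each = 3),
  Day = 1,
  pH  = c(7, 7.2, 7.1, 7.3, 7.4, 7.2, 7, 7.1, 7.5, 7.3, 7.2, 7.6)
)

res <- aggregate(pH ~ ID + Day, dat,
                 function(x) round(c(mean = mean(x), sd = sd(x)), 2))
ncol(res)    # 3: ID, Day and one matrix column pH

# Expand the matrix column into separate pH.mean and pH.sd columns
flat <- do.call(data.frame, res)
names(flat)
```

After this step flat$pH.mean and flat$pH.sd can be used like any other column, for example when building the matrix for a barplot.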



Answer 2:


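I suggest using the aggregate function like so:

# Mean of pH for each Day/ID combination (third column of the result)
pHmean <- aggregate( pH ~ Day + ID , data = dat , FUN = mean )[,3]

# Standard deviation per group, with the means bound on as an extra column.
# Note: the column named pH in the result holds the sd, and dat is overwritten.
dat <- cbind( aggregate( pH ~ Day + ID , data = dat , FUN = sd ) , pHmean )
dat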
   Day ID         pH   pHmean
1    1  1 0.10000000 7.100000
2    2  1 0.10000000 7.100000
3    3  1 0.15275252 7.233333
4    1  2 0.10000000 7.300000
5    2  2 0.15275252 7.233333
6    3  2 0.11547005 7.266667
7    1  3 0.26457513 7.200000
8    2  3 0.23094011 7.366667
9    3  3 0.15275252 7.366667
10   1  4 0.20816660 7.366667
11   2  4 0.05773503 7.233333
12   3  4 0.25166115 7.433333



Answer 3:


For the values you can use the package plyr, where x is the data frame shown in the question (columns ID, Day and pH):

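library(plyr)
# Mean and sd of pH for every ID/Day combination
d1 <- ddply(x, .(ID, Day), summarize, pHmean=mean(pH), pHsd=sd(pH))
# Reshape to wide format: one row per ID, one mean/sd column pair per Day
d2 <- reshape(d1, v.names=c("pHmean", "pHsd"), idvar="ID", timevar="Day", direction="wide")
rownames(d2) <- d2[,1]
# Drop the ID column and transpose: rows now alternate mean, sd for each Day
d2 <- t(d2[,-1])

library(gplots)
means <- d2[seq(1, nrow(d2), by=2),]   # odd rows: means
sds   <- d2[seq(2, nrow(d2), by=2),]   # even rows: standard deviations
barplot2(means, beside=TRUE, plot.ci=TRUE, ci.l=means-sds, ci.u=means+sds)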



Answer 4:


Thanks for all your comments and answers, they were very useful!

And I already managed to make the graph using:

barplot(matrix(c(Rtest.dat$pH.mean), nrow=3), beside=TRUE, col=c("black","grey","white"), main="pH", names.arg=c("Green", "Yellow", "Blue", "Red"), ylab="pH")
legend("topright", c("Day 1","Day 2","Day 3"), cex=0.6, bty="n", fill=c("black","grey","white"))

However, I am stuck and have no idea how to add the error bars. I searched online but couldn't figure it out. I hope you can help me!
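
One common base R way to do this is to keep the bar midpoints that barplot() returns and draw the error bars on top with arrows(). The following is only a sketch under some assumptions: the means and standard deviations are typed in by hand from the output in Answer 1 (rows are days, columns are IDs), the bars show mean plus or minus one sd, and means, sds, cols and mids are illustrative names rather than anything from the original code.

```r
# Means and sds copied from the aggregate() output; rows = Day, columns = ID
means <- matrix(c(7.10, 7.10, 7.23,
                  7.30, 7.23, 7.27,
                  7.20, 7.37, 7.37,
                  7.37, 7.23, 7.43), nrow = 3)
sds <- matrix(c(0.10, 0.10, 0.15,
                0.10, 0.15, 0.12,
                0.26, 0.23, 0.15,
                0.21, 0.06, 0.25), nrow = 3)
cols <- c("black", "grey", "white")

# barplot() returns the x positions of the bar midpoints, same shape as means
mids <- barplot(means, beside = TRUE, col = cols, main = "pH", ylab = "pH",
                names.arg = c("Green", "Yellow", "Blue", "Red"),
                ylim = c(0, max(means + sds) * 1.1))

# Vertical segments from mean - sd to mean + sd with flat caps at both ends
arrows(mids, means - sds, mids, means + sds,
       angle = 90, code = 3, length = 0.03)

legend("topright", c("Day 1", "Day 2", "Day 3"),
       cex = 0.6, bty = "n", fill = cols)
```

The ylim argument leaves room for the upper whiskers. The lower half of each whisker is hard to see against the black bars, so a lighter fill for Day 1 may be preferable.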



Source: https://stackoverflow.com/questions/15927193/calculate-mean-and-sd-by-id-and-day-within-a-column
