Why does tapply take the subset as NA and not exclude them totally

Posted by a 夏天 on 2019-11-28 12:49:42

Question


I have a question. I want to make a barplot with means and error bars, grouped by two factors. To get the means and the standard errors I used the function tapply.

However, for one of the factors I want to drop one level.

So this is what I did:

dataFE <- data[-which(plant=="FS"),] # this works fine, I get exactly the data set I want without the FS level of the factor plant 

Then to get the mean and standard error I use this:

means <- with(dataFE, as.matrix(tapply(leaves, list(plant, Orchestia), mean), nrow=2))

e <- with(dataFE, as.matrix(tapply (leaves, list(plant, Orchestia), function(x) sd(x)/sqrt(length(x))), nrow=2))

And there something strange happens: it does not calculate anything for FS, yet it still puts FS in the table as NA:

    row.names   no          yes
1   F           7.009022    5.307185
2   FS          NA          NA
3   S           2.837139    2.111054

I don't want this, because if I use it in barplot2 (package gplots) I get an empty bar for FS, whereas that bar should not be there at all.

Does anyone have a solution, or another method to get a nice barplot? :) Thanks anyway!


Answer 1:


Without a sample of your data, I'll hazard a guess:

Your column plant is a factor. Although you have dropped the rows that have the value FS, the level FS still exists in the factor, and tapply returns a cell for every level, filling the empty ones with NA. Use levels(dataFE$plant) to check. You can then use droplevels to get rid of the unused level.

> dat <- data.frame(x=1:15, y=factor(letters[1:3]))
> levels(dat$y)
[1] "a" "b" "c"

> dat <- dat[dat$y != 'a',]
> levels(dat$y)
[1] "a" "b" "c"

> tapply(dat$x, dat$y, sum)
 a  b  c 
NA 40 45 

> droplevels(dat$y)
 [1] b c b c b c b c b c
Levels: b c
> dat$y <- droplevels(dat$y)

> tapply(dat$x, dat$y, sum)
 b  c 
40 45 
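
Applied to the two-factor case from the question, the same idea might look like the sketch below. The data frame here is a made-up stand-in (the column names plant, Orchestia and leaves follow the question, but the values are random and purely illustrative), and se is a hypothetical helper name. Calling droplevels on the whole data frame removes unused levels from every factor column at once.

```r
# Hypothetical stand-in for the asker's data: column names follow the
# question, the values are random and only for illustration.
set.seed(1)
data <- data.frame(
  plant     = factor(rep(c("F", "FS", "S"), each = 6)),
  Orchestia = factor(rep(c("no", "yes"), times = 9)),
  leaves    = runif(18, min = 1, max = 10)
)

# Drop the FS rows, then drop the now-unused FS level from all factor columns.
dataFE <- droplevels(data[data$plant != "FS", ])

# Standard error of the mean (helper name is an assumption, not from the post).
se <- function(x) sd(x) / sqrt(length(x))

# With the FS level gone, tapply no longer produces an NA row for it.
means <- with(dataFE, tapply(leaves, list(plant, Orchestia), mean))
e     <- with(dataFE, tapply(leaves, list(plant, Orchestia), se))

rownames(means)  # only "F" and "S" remain
```

Since tapply with two grouping factors already returns a matrix, the as.matrix wrapper from the question should not be needed, and the resulting means and e can be passed to barplot2 as before.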


Source: https://stackoverflow.com/questions/11632587/why-does-tapply-take-the-subset-as-na-and-not-exclude-them-totally
