geom_errorbar behaving strangely, ggplot2

Posted by 女生的网名这么多〃 on 2019-12-13 07:44:49

Question


I have an unusual problem when using geom_errorbar in ggplot2.

The error bars extend beyond the plotted range, but that is not the concern here.

My problem is that geom_errorbar is plotting the confidence intervals for the same data differently depending on what other data is plotted with it.

The code below uses data filtered to the rows where Audio1 is equal to "300SW" or "3500MFL" (the uncommented SE and AggBar).

SE <- c(0.0861829641865964, 0.0296894376485468, 0.0323219002250762,
        0.0937013798013447)

AggBar <- structure(list(
  Report = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L),
                     .Label = c("One Flash", "Two Flashes"), class = "factor"),
  Visual = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
                     .Label = c("one", "two"), class = "factor"),
  Audio = c("300SW", "300SW", "300SW", "300SW",
            "3500MFL3500CL", "3500MFL3500CL", "3500MFL3500CL", "3500MFL3500CL"),
  Prob = c(0.938828282828283, 0.0611717171717172,
           0.754141414141414, 0.245858585858586,
           0.534484848484848, 0.465515151515151,
           0.0830909090909091, 0.916909090909091)),
  .Names = c("Report", "Visual", "Audio", "Prob"),
  row.names = c(NA, -8L), class = "data.frame")



# SE <- c(0.0310069159026252, 0.113219880555153, 0.0861829641865964, 0.0296894376485468)

# AggBar <- structure(list(
#   Report = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L),
#                      .Label = c("One Flash", "Two Flashes"), class = "factor"),
#   Visual = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
#                      .Label = c("one", "two"), class = "factor"),
#   Audio = c("300MFL300CL", "300MFL300CL", "300MFL300CL", "300MFL300CL",
#             "300SW", "300SW", "300SW", "300SW"),
#   Prob = c(0.562242424242424, 0.437757575757576,
#            0.0921010101010101, 0.90789898989899,
#            0.938828282828283, 0.0611717171717172,
#            0.754141414141414, 0.245858585858586)),
#   .Names = c("Report", "Visual", "Audio", "Prob"),
#   row.names = c(NA, -8L), class = "data.frame")






library(ggplot2)

prob.bar <- ggplot(AggBar, aes(x = Report, y = Prob, fill = Report)) + theme_bw()
prob.bar +
  geom_bar(position = position_dodge(.9), stat = "identity", colour = "black", width = 0.8) +
  theme(legend.position = "none") +
  labs(x = "Report", y = "Probability of Report", title = expression("Visual Condition")) +
  scale_fill_grey(start = .4) +
  scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, by = .25)) +
  facet_grid(Audio ~ Visual) +
  geom_errorbar(aes(ymin = Prob - SE, ymax = Prob + SE),
                width = .1,  # width of the error bar caps
                position = position_dodge(.09))

This results in the following output:

The Audio1 variables are seen on the rightmost vertical labels.

However, if I filter so that only rows where Audio1 is equal to "300SW" or "300MFL" pass (the commented SE and AggBar), the error bars for "300SW" change:

The Audio1 variables are seen on the rightmost vertical labels with "300SW" on the bottom this time.

This change is the incorrect one because when I plot just the Audio1 "300SW" the error bars match the original plot.

I have tried plotting the Audio1 "300SW" with other variables not presented here and it is only when presenting with "300MFL" that this change occurs.

If you look at the SE variable contents you will see that there is no change in the values therein for "300SW" in both versions of the code. Yet the outputs differ.

I cannot fathom what is happening here. Any ideas or suggestions are welcome.

Thanks very much for your time.

@Antonios K below has highlighted that when "300SW" is on top of the grid the error bars are correctly drawn. I'm guessing that the error bars are being incorrectly matched to the bars although I don't know why this is the case.


Answer 1:


The problem is that SE is not stored inside the data frame: it's just floating around in the global environment. When the data is facetted (which involves rearranging the order), it no longer lines up with the correct records. Fix the problem by storing SE in the data frame:

AggBar$SE <- c(0.0310069159026252, 0.113219880555153, 0.0861829641865964, 0.0296894376485468)

ggplot(AggBar, aes(Report, Prob)) +
  geom_bar(stat = "identity", fill = "grey50") +
  geom_errorbar(aes(ymin = Prob - SE, ymax = Prob + SE), width = 0.4) + 
  facet_grid(Audio ~ Visual)
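
One thing to watch when storing SE in the data frame: SE has four values while AggBar has eight rows, so a plain assignment recycles the vector across the rows. The following is a minimal sketch of a more explicit alignment. It assumes (the question does not state this) that each SE value belongs to one Audio x Visual condition, in the order one/two within each Audio value. The helper name `se_by_condition` is hypothetical, and the data set is the commented-out one from the question, rebuilt compactly.

```r
# Second (commented-out) data set from the question, rebuilt compactly.
AggBar <- data.frame(
  Report = rep(c("One Flash", "Two Flashes"), times = 4),
  Visual = rep(rep(c("one", "two"), each = 2), times = 2),
  Audio  = rep(c("300MFL300CL", "300SW"), each = 4),
  Prob   = c(0.562242424242424, 0.437757575757576,
             0.0921010101010101, 0.90789898989899,
             0.938828282828283, 0.0611717171717172,
             0.754141414141414, 0.245858585858586)
)

# ASSUMPTION: one SE per Audio x Visual condition, in this order.
se_by_condition <- data.frame(
  Audio  = rep(c("300MFL300CL", "300SW"), each = 2),
  Visual = rep(c("one", "two"), times = 2),
  SE     = c(0.0310069159026252, 0.113219880555153,
             0.0861829641865964, 0.0296894376485468)
)

# Join on the condition keys so every row carries the SE of its own condition.
AggBar <- merge(AggBar, se_by_condition, by = c("Audio", "Visual"))

# Precompute the error bar limits as ordinary columns.
AggBar$ymin <- AggBar$Prob - AggBar$SE
AggBar$ymax <- AggBar$Prob + AggBar$SE
```

With the limits stored as columns, the error bar layer can map `ymin = ymin, ymax = ymax` directly, and the values stay attached to their rows however the data is reordered.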



Answer 2:


The bit of code that plots the error bars is:

geom_errorbar(aes(ymin=Prob-SE, ymax=Prob+SE),
              width=.1, # Width of the error bars
              position=position_dodge(.09))

So, I guess it's something there. As you said the SE variable is the same in both cases, but what you plot there is Prob-SE and Prob+SE. And if you do AggBar$Prob-SE and AggBar$Prob+SE you'll get different values for 300SW for each case.
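
To make that check concrete, here is a small base R sketch using only the numbers from the question. It relies on R's vector recycling: subtracting a length-4 vector from a length-8 column reuses elements 1 to 4 for rows 5 to 8. The variable names are hypothetical, and how a given ggplot2 version evaluates the global SE internally is not covered here.

```r
# SE vectors from the two versions of the question's code.
se_first  <- c(0.0861829641865964, 0.0296894376485468,
               0.0323219002250762, 0.0937013798013447)
se_second <- c(0.0310069159026252, 0.113219880555153,
               0.0861829641865964, 0.0296894376485468)

# Prob values of the four 300SW rows (identical in both data sets).
prob_300sw <- c(0.938828282828283, 0.0611717171717172,
                0.754141414141414, 0.245858585858586)

# What recycling a length-4 vector against 8 rows looks like.
recycled_first  <- rep(se_first,  length.out = 8)
recycled_second <- rep(se_second, length.out = 8)

# 300SW occupies rows 1-4 in the first data set and rows 5-8 in the second.
se_300sw_first  <- recycled_first[1:4]
se_300sw_second <- recycled_second[5:8]

# Side-by-side lower limits for the same four 300SW rows.
data.frame(
  Prob        = prob_300sw,
  ymin_first  = prob_300sw - se_300sw_first,
  ymin_second = prob_300sw - se_300sw_second
)
```

The two `ymin` columns differ because the 300SW rows are paired with different SE elements in each data set, even though the 300SW entries inside SE are unchanged.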

It might have to do with the order of your Audio1 values. In the other cases that worked, was 300SW also in the top part of the plots?

Try

sort(unique(DataRearrange$Audio1) )

[1] "300MFL"  "300SW"   "3500MFL"

Combining first two will give you 300SW on the bottom part of the plots. Combining last two will give you 300SW on the top part.

So, to check this assumption, in your second case when you combine 300MFL and 300SW try to replace 300SW with 1_300SW (so that 300SW will be plotted on top) and see what happens. Just do :

    DataRearrange$Audio1[DataRearrange$Audio1=="300SW"] = "1_300SW"

# Below is the alternative coupling.

ErrorBarsDF <- DataRearrange[
  (DataRearrange$Audio1=="1_300SW" | DataRearrange$Audio1=="300MFL"),
  c("correct","Visual1", "Audio1", "Audio2","correct_response", "response", "subject_nr")]
DataRearrange <- DataRearrange[
  (DataRearrange$Audio1=="1_300SW" | DataRearrange$Audio1=="300MFL"),
  c("correct","Visual1", "Audio1", "Audio2","correct_response", "response", "subject_nr")]
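
As an alternative to renaming the value, the panel order can also be controlled by converting the column to a factor with explicit levels, since facets follow factor level order. A minimal sketch on a hypothetical stand-in vector (`audio1` is illustrative, not the asker's actual data):

```r
# Hypothetical stand-in for the asker's Audio1 column.
audio1 <- c("300MFL", "300SW", "300MFL", "300SW")

# Default ordering is alphabetical, so "300MFL" sorts before "300SW".
sort(unique(audio1))

# An explicit level order puts "300SW" first without renaming it.
audio1_f <- factor(audio1, levels = c("300SW", "300MFL"))
levels(audio1_f)
```

This only changes which panel appears on top; it is a way to run the same check, not a fix for the misaligned error bars.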


Source: https://stackoverflow.com/questions/31842762/geom-errorbar-behaving-strangely-ggplot2
