Question
I have an unusual problem when using geom_errorbar in ggplot2.
The error bars are not within range but that is of no concern here.
My problem is that geom_errorbar is plotting the confidence intervals for the same data differently depending on what other data is plotted with it.
The code below uses data filtered to rows where Audio1 is "300SW" or "3500MFL" (the uncommented SE and AggBar).
SE<-c(0.0861829641865964, 0.0296894376485468, 0.0323219002250762,
0.0937013798013447)
AggBar <- structure(list(Report = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L), .Label = c("One Flash", "Two Flashes"), class = "factor"),
Visual = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("one",
"two"), class = "factor"), Audio = c("300SW", "300SW", "300SW",
"300SW", "3500MFL3500CL", "3500MFL3500CL", "3500MFL3500CL",
"3500MFL3500CL"), Prob = c(0.938828282828283, 0.0611717171717172,
0.754141414141414, 0.245858585858586, 0.534484848484848,
0.465515151515151, 0.0830909090909091, 0.916909090909091)), .Names = c("Report",
"Visual", "Audio", "Prob"), row.names = c(NA, -8L), class = "data.frame")
#SE<-c(0.0310069159026252, 0.113219880555153, 0.0861829641865964, 0.0296894376485468)
#AggBar <- structure(list(Report = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
#2L), .Label = c("One Flash", "Two Flashes"), class = "factor"),
#Visual = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("one",
#"two"), class = "factor"), Audio = c("300MFL300CL", "300MFL300CL",
#"300MFL300CL", "300MFL300CL", "300SW", "300SW", "300SW",
#"300SW"), Prob = c(0.562242424242424, 0.437757575757576,
#0.0921010101010101, 0.90789898989899, 0.938828282828283,
#0.0611717171717172, 0.754141414141414, 0.245858585858586)), .Names = c("Report",
#"Visual", "Audio", "Prob"), row.names = c(NA, -8L), class = "data.frame")
prob.bar = ggplot(AggBar, aes(x = Report, y = Prob, fill = Report)) + theme_bw() #+ facet_grid(Audio~Visual)
prob.bar + #This changes all panels' colour
geom_bar(position=position_dodge(.9), stat="identity", colour="black", width=0.8)+
theme(legend.position = "none") + labs(x="Report", y="Probability of Report", title = expression("Visual Condition")) + scale_fill_grey() +
scale_fill_grey(start=.4) +
scale_y_continuous(limits = c(0, 1), breaks = (seq(0,1,by = .25)))+
facet_grid(Audio ~ Visual)+
geom_errorbar(aes(ymin=Prob-SE, ymax=Prob+SE),
width=.1, # Width of the error bars
position=position_dodge(.09))
This results in the following output:
The Audio1 variables are seen on the rightmost vertical labels.
However, if I filter so that only rows where Audio1 is "300SW" or "300MFL" pass (the commented SE and AggBar), the error bars for "300SW" change:
The Audio1 variables are seen on the rightmost vertical labels with "300SW" on the bottom this time.
The changed version is the incorrect one: when I plot Audio1 "300SW" on its own, the error bars match the original plot.
I have tried plotting Audio1 "300SW" with other conditions not shown here, and the change occurs only when it is plotted with "300MFL".
If you look at the SE variable contents you will see that there is no change in the values therein for "300SW" in both versions of the code. Yet the outputs differ.
I cannot fathom what is happening here. Any ideas or suggestions are welcome.
Thanks very much for your time.
@Antonios K, below, has pointed out that when "300SW" is at the top of the grid the error bars are drawn correctly. I'm guessing that the error bars are being matched to the wrong bars, although I don't know why.
Answer 1:
The problem is that SE is not stored inside the data frame: it's just floating around in the global environment. When the data is faceted (which involves rearranging the order of the rows), SE no longer lines up with the correct records. Fix the problem by storing SE in the data frame:
# Store SE as a column so that it stays attached to its rows.
# Note: the 4 values are recycled across the 8 rows of AggBar.
AggBar$SE <- c(0.0310069159026252, 0.113219880555153, 0.0861829641865964, 0.0296894376485468)
ggplot(AggBar, aes(Report, Prob)) +
  geom_bar(stat = "identity", fill = "grey50") +
  geom_errorbar(aes(ymin = Prob - SE, ymax = Prob + SE), width = 0.4) +
  facet_grid(Audio ~ Visual)
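The misalignment can be reproduced without ggplot2. The sketch below uses made-up toy values (not the asker's data) and the hypothetical names df, se_global and reordered. It sorts the rows by Audio as a rough stand-in for the reordering that faceting performs, then compares a column stored in the data frame with a free-floating vector.

```r
# Toy data for illustration only; these numbers are invented.
df <- data.frame(
  Audio = c("300SW", "300SW", "300MFL", "300MFL"),
  Prob  = c(0.9, 0.1, 0.6, 0.4),
  stringsAsFactors = FALSE
)
se_global <- c(0.01, 0.02, 0.03, 0.04)  # free-floating vector, matched by position only
df$SE <- se_global                      # the same values stored as a column

# Reorder the rows by Audio (assumed here to mimic what faceting does).
reordered <- df[order(df$Audio), ]

# The column travels with its rows; the global vector keeps its old order.
in_frame <- reordered$Prob - reordered$SE
free_vec <- reordered$Prob - se_global

print(data.frame(Audio = reordered$Audio, in_frame, free_vec))
```

After the reordering, the "300SW" rows get the standard errors that belonged to the "300MFL" rows in the free_vec column, while in_frame keeps each value with its own row.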
Answer 2:
The bit of code that plots the error bars is :
geom_errorbar(aes(ymin=Prob-SE, ymax=Prob+SE),
width=.1, # Width of the error bars
position=position_dodge(.09))
So I guess the problem is somewhere in there. As you said, the SE values are the same in both cases, but what you actually plot is Prob-SE and Prob+SE. If you compute AggBar$Prob-SE and AggBar$Prob+SE, you get different values for 300SW in each case.
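This check can be run directly on the numbers from the question. The sketch below assumes plain positional recycling, as in base R arithmetic, where the 4 SE values are reused across the 8 rows. How ggplot2 treats a length-4 vector against 8 rows may depend on the version. The names prob_case1, lower_case1 and so on are introduced here for illustration.

```r
# Prob and SE values copied from the two versions of the question's data.
prob_300sw <- c(0.938828282828283, 0.0611717171717172,
                0.754141414141414, 0.245858585858586)

# Case 1: 300SW occupies rows 1-4, 3500MFL3500CL rows 5-8.
audio_case1 <- rep(c("300SW", "3500MFL3500CL"), each = 4)
prob_case1  <- c(prob_300sw, 0.534484848484848, 0.465515151515151,
                 0.0830909090909091, 0.916909090909091)
se_case1    <- c(0.0861829641865964, 0.0296894376485468,
                 0.0323219002250762, 0.0937013798013447)

# Case 2: 300MFL300CL occupies rows 1-4, 300SW rows 5-8.
audio_case2 <- rep(c("300MFL300CL", "300SW"), each = 4)
prob_case2  <- c(0.562242424242424, 0.437757575757576,
                 0.0921010101010101, 0.90789898989899, prob_300sw)
se_case2    <- c(0.0310069159026252, 0.113219880555153,
                 0.0861829641865964, 0.0296894376485468)

# Lower error-bar limits for the 300SW rows (SE recycled by position).
lower_case1 <- (prob_case1 - se_case1)[audio_case1 == "300SW"]
lower_case2 <- (prob_case2 - se_case2)[audio_case2 == "300SW"]

print(rbind(lower_case1, lower_case2))
```

The Prob values for 300SW are identical in both cases, but the two rows of the printed matrix differ, because a different SE value lands on each 300SW row.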
It might have to do with the order of your Audio1 values. Did the other cases that worked also have 300SW in the top part of the plots?
Try
sort(unique(DataRearrange$Audio1) )
[1] "300MFL" "300SW" "3500MFL"
Combining first two will give you 300SW on the bottom part of the plots. Combining last two will give you 300SW on the top part.
So, to check this assumption, in your second case when you combine 300MFL and 300SW try to replace 300SW with 1_300SW (so that 300SW will be plotted on top) and see what happens. Just do :
DataRearrange$Audio1[DataRearrange$Audio1=="300SW"] = "1_300SW"
# Below is the alternative coupling..
ErrorBarsDF <- DataRearrange[(DataRearrange$Audio1=="1_300SW" | DataRearrange$Audio1=="300MFL"), c("correct","Visual1", "Audio1", "Audio2","correct_response", "response", "subject_nr")]
DataRearrange <- DataRearrange[(DataRearrange$Audio1=="1_300SW" | DataRearrange$Audio1=="300MFL"), c("correct","Visual1", "Audio1", "Audio2","correct_response", "response", "subject_nr")]
Source: https://stackoverflow.com/questions/31842762/geom-errorbar-behaving-strangely-ggplot2