Question
UPDATE: I found the answer... included it below.
I have a dataset that contains the following variables and similar values:
COBSDATE, CITY, RESPONSE_TIME
2011-11-23 A 1.1
2011-11-23 A 1.5
2011-11-23 A 1.2
2011-11-23 B 2.3
2011-11-23 B 2.1
2011-11-23 B 1.8
2011-11-23 C 1.4
2011-11-23 C 6.1
2011-11-23 A 3.1
2011-11-23 A 1.1
I have successfully created a graph that displays all of the response_time values and a smooth geometry to further describe some of the variation.
The challenge that I have is that I want a better view of the smoothed value, and one of the cities has frequent 'outliers'. I can control this by adding ylim(0,p99) to the plot, but this then causes the smooth to only be calculated on the subset of data.
Is there a way to use all of the data for the smoothed plot and only the subset for the jitter plot?
My code is below (the two versions are identical except for the + ylim(0,20)):
Truncated -
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
geom_jitter(colour=alpha("#007DB1", 1/8)) +
geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
ylim(0,20) +
facet_wrap(~CITY)
Whole data set -
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
geom_jitter(colour=alpha("#007DB1", 1/8)) +
geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
facet_wrap(~CITY)
Answer 1:
If you just want to "zoom in", you can use coord_cartesian:
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
geom_jitter(colour=alpha("#007DB1", 1/8)) +
geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
coord_cartesian(ylim=c(0,20)) +
facet_wrap(~CITY)
If you want to use a subset of the data for the jitter geom, then override the data inheritance:
ggplot(dataRaw, aes(x=COBSDATE, y=RESPONSE_TIME)) +
geom_jitter(data=subset(dataRaw, RESPONSE_TIME>=0 & RESPONSE_TIME<=20),
colour=alpha("#007DB1", 1/8)) +
geom_smooth(colour="gray30", fill=alpha("gray40",0.5)) +
coord_cartesian(ylim=c(0,20)) +  # ylim(0,20) here would also drop out-of-range rows before the smooth is fitted
facet_wrap(~CITY)
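A minimal, self-contained sketch of the per-layer data override, for anyone who wants to try it without the original data. The `toy` data frame, its values, and the jitter settings are made up for illustration; only the ggplot2 calls themselves come from the answer above. `layer_data()` is used to check what each layer actually received.

```r
library(ggplot2)

set.seed(42)
# Hypothetical data: 100 observations near 2, with every tenth one set to 45
toy <- data.frame(x = 1:100, y = 2 + rnorm(100, sd = 0.3))
toy$y[seq(5, 95, by = 10)] <- 45

p <- ggplot(toy, aes(x = x, y = y)) +
  # The jitter layer gets its own data: only rows inside the visible range
  geom_jitter(data = subset(toy, y >= 0 & y <= 20), height = 0, width = 0.2) +
  # The smooth layer inherits the full data set from ggplot()
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  # Zoom the view without discarding any rows
  coord_cartesian(ylim = c(0, 20))

points_drawn <- nrow(layer_data(p, 1))   # rows handed to the jitter layer
smooth_mean  <- mean(layer_data(p, 2)$y) # average height of the fitted curve
```

Here `points_drawn` counts only the in-range rows, while `smooth_mean` is pulled upward by the outliers, which suggests the smooth was fitted on the full data set.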
Answer 2:
UPDATED ANSWER: So, I was looking for something completely different and stumbled upon the answer I needed.
Instead of ylim(0, yMax), one should use coord_cartesian(ylim = c(0, yMax)).
It appears that coord_cartesian simply "zooms" the graph instead of truncating the data included.
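The difference can be checked directly by comparing the fitted curve under each approach. This is only a sketch on made-up data (the `toy` data frame and its outlier pattern are assumptions, not the asker's data set); it relies on `layer_data()` to extract the values computed by `geom_smooth()`.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: mostly values near 2, with every tenth value set to 45
toy <- data.frame(x = 1:100, y = 2 + rnorm(100, sd = 0.3))
toy$y[seq(5, 95, by = 10)] <- 45

base <- ggplot(toy, aes(x = x, y = y)) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

# ylim() sets scale limits: out-of-range rows become NA and are dropped
# before the smooth is fitted (ggplot2 warns about the removed rows)
clipped <- suppressWarnings(layer_data(base + ylim(0, 20), 1))

# coord_cartesian() only changes the visible window; the fit uses every row
zoomed <- layer_data(base + coord_cartesian(ylim = c(0, 20)), 1)

# Average height of the fitted curve under each approach
fit_means <- c(clipped = mean(clipped$y), zoomed = mean(zoomed$y))
```

With these data the `zoomed` mean sits well above the `clipped` mean, because only the `coord_cartesian()` version lets the outliers influence the fit.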
Source: https://stackoverflow.com/questions/9505270/r-ggplot2-smooth-on-entire-dataset-while-enforcing-a-ylim-cap