I'm trying to make this logistic regression graph in ggplot2.
df <- structure(list(y = c(2L, 7L, 776L, 19L, 12L, 26L, 7L, 12L, 8L,
24L, 20L,
Modify your LD.summary to include a new column, group (or another appropriate label).
LD.summary$group <- c('LD25','LD50','LD75')
Then add col=LD.summary$group inside the aes() of each geom_segment call (and remove colour="red"). This plots each segment in a different colour and adds a legend:
geom_segment( aes(...,col=LD.summary$group) )
Also, to avoid having to write LD.summary$xxx all the time, pass data=LD.summary to your geom_segment:
geom_segment(data=LD.summary, aes(x=0, y=Pi,xend=LD, yend=Pi, colour=group) )
As to why the graphs are not exactly the same: in the base R graph the x axis goes from ~20 onwards, whereas in ggplot it goes from zero onwards. This is because your second geom_segment starts at x=0.
To fix this you could change x=0 to x=min(df$x).
To get your y axis label, use + scale_y_continuous('Estimated probability').
In summary:
LD.summary$group <- c('LD25','LD50','LD75')
p <- ggplot(data = df, aes(x = x, y = y/n)) +
  geom_point() +
  # current ggplot2 takes model arguments via method.args, not family = directly
  stat_smooth(method = "glm", method.args = list(family = "binomial")) +
  scale_y_continuous('Estimated probability') # <-- add y label
p <- p + geom_segment(data = LD.summary, aes( # <-- data=LD.summary
    x = LD
  , y = 0
  , xend = LD
  , yend = Pi
  , col = group # <-- colours
  )
)
p <- p + geom_segment(data = LD.summary, aes( # <-- data=LD.summary
    x = min(df$x) # <-- don't plot all the way to x=0
  , y = Pi
  , xend = LD
  , yend = Pi
  , col = group # <-- colours
  )
)
print(p)
Just a couple of minor additions to @mathematical.coffee's answer. Typically, geom_smooth isn't supposed to replace actual modeling, which is why it can seem inconvenient when you want to use specific output you'd get from glm and the like. But really, all we need to do is add the fitted values to our data frame:
df$pred <- pi.hat
LD.summary$group <- c('LD25','LD50','LD75')
ggplot(df, aes(x = x, y = y/n)) +
  geom_point() +
  geom_line(aes(y = pred), colour = "black") +
  geom_segment(data = LD.summary,
               aes(y = Pi, xend = LD, yend = Pi, col = group),
               x = -Inf, linetype = "dashed") +
  geom_segment(data = LD.summary,
               aes(x = LD, xend = LD, yend = Pi, col = group),
               y = -Inf, linetype = "dashed")
The final little trick is the use of Inf and -Inf to get the dashed lines to extend all the way to the plot boundaries.
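For anyone who wants to try the -Inf trick in isolation, here is a minimal self-contained sketch. The data frame pts and the numbers in LD.summary are made up for illustration and are not the values from the question; only the column names (group, Pi, LD) follow the code above. It also checks that the infinite start points do not stretch the x axis back to zero, which was the problem with starting the segments at x=0.

```r
library(ggplot2)

# Hypothetical values for illustration only (not taken from the question).
pts <- data.frame(x = c(20, 40, 60, 80), prob = c(0.05, 0.30, 0.70, 0.95))
LD.summary <- data.frame(
  group = c("LD25", "LD50", "LD75"),
  Pi    = c(0.25, 0.50, 0.75),
  LD    = c(37, 50, 63)
)

p <- ggplot(pts, aes(x = x, y = prob)) +
  geom_point() +
  # Horizontal guides: start at the left panel edge (-Inf), end at the LD value.
  geom_segment(data = LD.summary,
               aes(y = Pi, xend = LD, yend = Pi, colour = group),
               x = -Inf, linetype = "dashed") +
  # Vertical guides: start at the bottom panel edge (-Inf), end at Pi.
  geom_segment(data = LD.summary,
               aes(x = LD, xend = LD, yend = Pi, colour = group),
               y = -Inf, linetype = "dashed")

# Range the x scale was trained on; the -Inf start points should not appear in it.
x_limits <- layer_scales(p)$x$range$range
print(x_limits)
```

Calling print(p) draws the plot. Because the segments start at -Inf rather than at a data value, there is no need for the x=min(df$x) workaround.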
The lesson here is that if all you want to do is add a smooth to a plot, and nothing else in the plot depends on it, use geom_smooth. If you want to refer to the output from the fitted model, it's generally easier to fit the model outside ggplot and then plot.
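As a rough sketch of what "fit the model outside ggplot" can look like, the following uses base R only. The data in toy are invented for illustration (the data in the question are not reproduced here), and the way LD.summary is built is an assumption about its layout, based on the columns used above (group, Pi, LD); the original question may have computed it differently.

```r
# Hypothetical dose-response data (NOT the data from the question):
# x = dose, n = subjects per dose, y = number responding.
toy <- data.frame(
  x = c(20, 30, 40, 50, 60, 70, 80),
  n = rep(50L, 7),
  y = c(2L, 6L, 14L, 25L, 37L, 45L, 48L)
)

# Fit the logistic regression outside ggplot2.
fit <- glm(cbind(y, n - y) ~ x, family = binomial, data = toy)

# Fitted probabilities, one per row, ready to plot with geom_line().
toy$pred <- fitted(fit)

# Dose at which the fitted probability equals p:
# logit(p) = b0 + b1 * x  =>  x = (logit(p) - b0) / b1
b  <- coef(fit)
Pi <- c(0.25, 0.50, 0.75)
LD.summary <- data.frame(
  group = c("LD25", "LD50", "LD75"),
  Pi    = Pi,
  LD    = unname((qlogis(Pi) - b[1]) / b[2])
)
print(LD.summary)
```

With the model fitted this way, the curve and the guide lines come from the same fit, so they are guaranteed to meet; toy and LD.summary can then be passed to the plotting code above in place of df and the original LD.summary.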