I would like to use geom_smooth to get a fitted line from a certain linear regression model.
It seems to me that the formula can only take x
a
This is a very interesting question. Probably the main reason why geom_smooth is so "resistant" to custom models of multiple variables is that it only produces 2-D curves; consequently, its arguments are designed for two-dimensional data (i.e. formula = response variable ~ independent variable).
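The trick to getting what you requested is to use the mapping argument within geom_smooth, instead of formula. As you've probably seen from the documentation, formula only lets you specify the mathematical structure of the model in terms of x and y (e.g. linear, quadratic, etc.). Conversely, the mapping argument lets you supply new y-values directly, such as the output of a custom linear model obtained with predict(). Because those predictions are exactly linear in the x variable within each level of the categorical variable, the per-group "lm" smooth reproduces the custom model's lines; any band geom_smooth draws reflects that second fit, not the uncertainty of the custom model.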
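To illustrate what formula can and cannot express, here is a minimal sketch. The data frame is simulated purely for illustration (its column names pred, outcome and factor mirror the ones used in this answer, but the values are made up), and p_quad is a hypothetical name. The formula is written with the aesthetic placeholders x and y, so it can change the shape of the curve but cannot reference another column:

```r
library(ggplot2)

# Simulated, hypothetical data; only the column names follow the answer
set.seed(1)
df <- data.frame(
  pred   = runif(90, 0, 10),
  factor = rep(c("a", "b", "c"), each = 30)
)
df$outcome <- 2 + 0.5 * df$pred + as.numeric(factor(df$factor)) + rnorm(90)

# formula refers to the aesthetics x and y, not to column names,
# so it controls the curve's shape (here: quadratic) within each group
p_quad <- ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE)
```

Printing p_quad draws one quadratic fit per colour group, each estimated separately from that group's points.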
Note that, by default, inherit.aes is set to TRUE, so your plotted regressions will be coloured appropriately by your categorical variable. Here's the code:
library(ggplot2)

# original plot: a separate regression of outcome on pred for each level of factor
plot1 <- ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point(aes(color = factor)) +
  geom_smooth(method = "lm") +
  ggtitle("outcome ~ pred") +
  theme_bw()

# declare new model here: common slope for pred, one intercept shift per factor level
plm <- lm(formula = outcome ~ pred + factor, data = df)

# plot with lm for outcome ~ pred + factor
# y is remapped to the model's predictions; x and color are inherited
plot2 <- ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point(aes(color = factor)) +
  geom_smooth(method = "lm", mapping = aes(y = predict(plm, df))) +
  ggtitle("outcome ~ pred + factor") +
  theme_bw()
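As a possible alternative (not the approach used above), the predictions can be stored in a column and drawn with geom_line, which skips the second fit entirely. This is only a sketch under assumptions: the data are simulated for illustration, and the names fitted, plot3 and slopes are hypothetical. The final lines check that refitting the predictions within each group recovers the common slope of the additive model, which is why the geom_smooth trick yields parallel lines:

```r
library(ggplot2)

# Simulated, hypothetical data; only the column names follow the answer
set.seed(42)
df <- data.frame(
  pred   = runif(90, 0, 10),
  factor = rep(c("a", "b", "c"), each = 30)
)
shift <- unname(c(a = 0, b = 2, c = 4)[df$factor])
df$outcome <- 1 + 0.8 * df$pred + shift + rnorm(90)

# Additive model: one slope for pred, one intercept shift per level
plm <- lm(outcome ~ pred + factor, data = df)
df$fitted <- predict(plm, newdata = df)

# Draw the model's own predictions as lines, one per colour group
plot3 <- ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point() +
  geom_line(aes(y = fitted)) +
  ggtitle("outcome ~ pred + factor") +
  theme_bw()

# Within each group the predictions are exactly linear in pred,
# so a per-group fit returns the model's common slope
slopes <- sapply(split(df, df$factor), function(d) {
  coef(lm(fitted ~ pred, data = d))[["pred"]]
})
```

With this variant no confidence band is drawn; if one is needed, it would have to be computed from the model itself, for example via predict() with interval = "confidence".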