I'm looking for a way to extract the predicted (real, not standardized) values for every single variable in my model:
> m13 <- gam(LN_Brutto ~ s(agecont, by = Sex) + factor(Sex) + te(Month, Age) + s(Month, by = Sex),
+            data = bears)
This is the summary of my model:
> summary(m13)
Family: gaussian
Link function: identity
Formula:
LN_Brutto ~ s(agecont, by = Sex) + factor(Sex) + te(Month, Age) +
s(Month, by = Sex)
Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.32057    0.01071  403.34   <2e-16 ***
factor(Sex)m  0.27708    0.01376   20.14   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Approximate significance of smooth terms:
                    edf  Ref.df      F  p-value
s(agecont):Sexf  8.1611  8.7526 20.170  < 2e-16 ***
s(agecont):Sexm  6.6695  7.5523 32.689  < 2e-16 ***
te(Month,Age)   10.3651 12.7201  6.784 2.19e-12 ***
s(Month):Sexf    0.9701  0.9701  0.641    0.430
s(Month):Sexm    1.3750  1.6855  0.193    0.787
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Rank: 60/62
R-sq.(adj) = 0.781 Deviance explained = 78.7%
GCV = 0.048221 Scale est. = 0.046918 n = 1093
The predicted values come from this code:
> pred_terms <- predict(m13, type = "terms")
And the result looks like this:
  factor(Sex) s(agecont):Sexf s(agecont):Sexm te(Month,Age) s(Month):Sexf s(Month):Sexm
1   0.2770806     0.000000000     0.111763696  -0.077845764   0.000000000  0.0007840912
2   0.2770806     0.000000000     0.240016156  -0.049143798   0.000000000  0.0007840912
3   0.2770806     0.000000000     0.034328752   0.046524454   0.000000000 -0.0058871897
4   0.0000000    -0.786533918     0.000000000  -0.067942427   0.021990192  0.0000000000
5   0.0000000     0.074434715     0.000000000   0.046524454   0.021990192  0.0000000000
6   0.0000000     0.161121563     0.000000000   0.089599601   0.021990192  0.0000000000
7   0.0000000     0.074434715     0.000000000   0.046524454   0.021990192  0.0000000000
8   0.2770806     0.000000000    -0.298597370  -0.007877328   0.000000000 -0.0058871897
...
But I suspect these are standardized predicted values rather than the real values (the real ones should never be negative, should they?).
Does anyone know what I have to change in the code to get the real values? Thank you!
I'm not quite sure I follow you correctly, but predict(model, type = "terms") might be the solution you're looking for.
Update
I don't think these are standardised. Each column is that term's additive contribution to the linear predictor, and mgcv centres smooth terms around zero, so negative values are expected.
Consider the example from the help file ?mgcv::predict.gam:
library(mgcv)
n<-200
sig <- 2
dat <- gamSim(1,n=n,scale=sig)
b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3),data=dat)
The results below show that these values are the contributions of each predictor to the fitted values: summing the contributions in each row and then adding the intercept and the offset reproduces the output of predict(b).
> head(predict(b))
1 2 3 4 5 6
9.263322 2.822200 7.137201 4.902631 14.558401 11.889092
> head(rowSums(predict(b, type = "terms")) + attr(predict(b, type = "terms"), "constant") + dat$x3)
1 2 3 4 5 6
9.263322 2.822200 7.137201 4.902631 14.558401 11.889092
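As a rough sketch of why the columns contain negative numbers, the following uses simulated data from gamSim (not the bears data, which is not available here) and a model without an offset. It checks that each smooth column averages to about zero over the fitting data and that the columns plus the intercept reproduce predict(b). The names terms_b, terms_shifted and max_diff are only illustrative, and adding the intercept to every column is one possible way to read each term at the level of the response, not something predict.gam does for you.

```r
library(mgcv)

set.seed(1)  # gamSim draws random data; the seed only makes the run repeatable
dat <- gamSim(1, n = 200, scale = 2, verbose = FALSE)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)

terms_b <- predict(b, type = "terms")                # one column per smooth term
intercept <- as.numeric(attr(terms_b, "constant"))   # the model intercept

# Each smooth averages to (numerically) zero over the fitting data,
# so roughly half of its values are negative by construction.
print(round(colMeans(terms_b), 10))

# Assumption: shifting every column by the intercept gives each term's
# effect around the overall mean level of the response.
terms_shifted <- terms_b + intercept
print(head(terms_shifted))

# Sanity check: contributions plus intercept reproduce the fitted values.
max_diff <- max(abs(as.numeric(rowSums(terms_b)) + intercept - as.numeric(predict(b))))
print(max_diff)
```

Note that the shifted columns can no longer be summed to get the fitted values, because the intercept would then be counted once per term.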
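To return predicted values on the scale of the response, use predict(model, type = "response").
The default for predict.gam is type = "link", which returns the linear predictor (standard errors are added only if you set se.fit = TRUE). With the Gaussian family and identity link in your model, the link and response scales coincide. Your call uses type = "terms", which returns each term's centred contribution to the linear predictor, which is why you see both positive and negative values.
Read more on the ?mgcv::predict.gam help page.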
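A minimal sketch of the difference between these options, again on simulated gamSim data rather than the original bears data. The column log_y is a hypothetical stand-in for a log-transformed response; whether LN_Brutto is a natural log is an assumption on my part, so check that before applying exp() to your own predictions.

```r
library(mgcv)

set.seed(1)  # only for repeatability of the simulated data
dat <- gamSim(1, n = 200, scale = 2, verbose = FALSE)

# Hypothetical log-scale response, shifted so the log is always defined
dat$log_y <- log(dat$y - min(dat$y) + 1)
b <- gam(log_y ~ s(x0) + s(x1) + s(x2), data = dat)

p_link <- predict(b)                      # default: type = "link"
p_resp <- predict(b, type = "response")   # same values here (identity link)
p_se   <- predict(b, se.fit = TRUE)       # list with $fit and $se.fit

# Back-transform the fitted values from the log scale to the original scale
y_hat <- exp(p_resp)
print(head(cbind(link = p_link, response = p_resp, original = y_hat)))
```

Keep in mind that exp() of a prediction made on the log scale is not the same as the predicted mean on the original scale, so treat y_hat as an approximation.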
Source: https://stackoverflow.com/questions/27017133/how-to-extract-fitted-values-of-gam-mgcv-for-each-variable-in-r