Question
I have a question about how to reproduce the results of the envfit() function in the vegan package.
Here is an example of envfit() being used with an ordination and a set of environmental variables.
library(vegan)
data(varespec)
data(varechem)
ord <- metaMDS(varespec)
chem.envfit <- envfit(ord, varechem, choices = c(1,2), permutations = 999)
chem.scores.envfit <- as.data.frame(scores(chem.envfit, display = "vectors"))
chem.scores.envfit
"The values that you see in the table are the standardised coefficients from the linear regression used to project the vectors into the ordination. These are directions for arrows of unit length." - comment from Plotted envfit vectors not matching NMDS scores
Also, from ?envfit:
The printed output of continuous variables (vectors) gives the direction cosines which are the coordinates of the heads of unit length vectors. In plot these are scaled by their correlation (square root of the column r2) so that weak predictors have shorter arrows than strong predictors. You can see the scaled relative lengths using command scores.
Could someone please show me explicitly what linear model is being run, what standardized coefficients are being used, and where cosine is being applied to create these values?
Answer 1:
I probably shouldn't have said "standardised" in that answer.
For each column (variable) in varechem and the first two axes of the ordination (choices = 1:2), the linear model is:
$$\widehat{\mathrm{env}_j} = \beta_1 \, \mathrm{scr1} + \beta_2 \, \mathrm{scr2}$$
where env_j is the $j$th variable in varechem, scr1 and scr2 are the axis scores on the first and second axes being considered (i.e. the plane defined by choices = 1:2, but this extends to higher dimensions), and the $\beta$ are the regression coefficients for the pair of axis scores.
There's no intercept in this model because we (weighted) centre all the variables in varechem and the axis scores. The weights only really matter for CCA, capscale(), and DCA, as those are weighted methods themselves.
The heads of the arrows in the space spanned by the axis scores are the coefficients of that model; we actually normalise them (which I misrepresented as "standardised" in that other reply) so that the arrows have unit length. These values (the NMDS1 and NMDS2 columns in the envfit output) are direction cosines in the sense of https://en.wikipedia.org/wiki/Direction_cosine.
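Written out for the two-axis case, the normalisation and its link to direction cosines can be sketched as follows. The symbols $u_1$, $u_2$ and $\theta_k$ are notation introduced here for illustration; they do not come from the vegan documentation.

$$(u_1, u_2) = \frac{(\beta_1, \beta_2)}{\sqrt{\beta_1^2 + \beta_2^2}}, \qquad u_1^2 + u_2^2 = 1, \qquad u_k = \cos\theta_k$$

Here $\theta_k$ is the angle between the arrow for env_j and ordination axis $k$, so each reported value is the cosine of the angle the arrow makes with that axis.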
Here's a simplified walk-through of what we do when there are no weights involved and all the variables in env are numeric, as in your example. (Note that we don't actually do it this way, for efficiency reasons: see the code behind vectorfit() for the QR decomposition used if you really want the details.)
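## extract the site scores for the axes we want, 1 and 2
## (display = "sites" is given explicitly so a matrix is returned
## whether or not the installed vegan also returns species scores)
scrs <- scores(ord, display = "sites", choices = c(1,2))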
## centre the scores (note not standardising them)
scrs <- as.data.frame(scale(scrs, scale = FALSE, center = TRUE))
## centre the environmental variables - keep as matrix
env <- scale(varechem, scale = FALSE, center = TRUE)
## fit the linear models with no intercept
mod <- lm(env ~ NMDS1 + NMDS2 - 1, data = scrs)
## extract the coefficients from the models
betas <- coef(mod)
## normalize coefs to unit length
## i.e. betas for a particular env var have sum of squares = 1
t(sweep(betas, 2L, sqrt(colSums(betas^2)), "/"))
The last line gives:
> t(sweep(betas, 2L, sqrt(colSums(betas^2)), "/"))
NMDS1 NMDS2
N -0.05731557 -0.9983561
P 0.61972792 0.7848167
K 0.76646744 0.6422832
Ca 0.68520442 0.7283508
Mg 0.63252973 0.7745361
S 0.19139498 0.9815131
Al -0.87159427 0.4902279
Fe -0.93600826 0.3519780
Mn 0.79870870 -0.6017179
Zn 0.61755690 0.7865262
Mo -0.90308490 0.4294621
Baresoil 0.92487118 -0.3802806
Humdepth 0.93282052 -0.3603413
pH -0.64797447 0.7616621
which replicates (except for showing more significant figures) the values returned by envfit() in this case:
> chem.envfit
***VECTORS
NMDS1 NMDS2 r2 Pr(>r)
N -0.05732 -0.99836 0.2536 0.045 *
P 0.61973 0.78482 0.1938 0.099 .
K 0.76647 0.64228 0.1809 0.095 .
Ca 0.68520 0.72835 0.4119 0.006 **
Mg 0.63253 0.77454 0.4270 0.003 **
S 0.19139 0.98151 0.1752 0.109
Al -0.87159 0.49023 0.5269 0.002 **
Fe -0.93601 0.35198 0.4450 0.002 **
Mn 0.79871 -0.60172 0.5231 0.002 **
Zn 0.61756 0.78653 0.1879 0.100 .
Mo -0.90308 0.42946 0.0609 0.545
Baresoil 0.92487 -0.38028 0.2508 0.061 .
Humdepth 0.93282 -0.36034 0.5201 0.001 ***
pH -0.64797 0.76166 0.2308 0.067 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Permutation: free
Number of permutations: 999
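The r2 column and the scaled arrows returned by scores() can be reproduced in the same spirit. The following is a minimal sketch, not vegan's actual code: it assumes the unweighted, all-numeric case from the example, uses a QR decomposition as mentioned for vectorfit(), and takes r2 to be the squared correlation between fitted and observed values, with the arrows scaled by sqrt(r2) as described in the ?envfit passage quoted in the question. The seed, trace = FALSE, permutations = 0, and the object names (X, Y, heads, r2, scaled) are illustrative choices.

```r
library(vegan)

data(varespec)
data(varechem)

set.seed(1)  # metaMDS uses random starts; the seed value is arbitrary
ord <- metaMDS(varespec, trace = FALSE)
fit <- envfit(ord, varechem, choices = c(1, 2), permutations = 0)

## centred site scores and centred environmental variables (no scaling)
X <- scale(scores(ord, display = "sites", choices = c(1, 2)), scale = FALSE)
Y <- scale(as.matrix(varechem), scale = FALSE)

## least-squares fit via QR: one column of coefficients per variable
Q <- qr(X)
betas <- qr.coef(Q, Y)
fitted_vals <- qr.fitted(Q, Y)

## unit-length arrow heads (direction cosines), one row per variable
heads <- t(sweep(betas, 2L, sqrt(colSums(betas^2)), "/"))

## r2 assumed to be the squared correlation of fitted and observed values
r2 <- diag(cor(fitted_vals, Y))^2

## arrows scaled by sqrt(r2), which is what scores() is described as returning
scaled <- heads * sqrt(r2)

## compare with the envfit() results
all.equal(heads, fit$vectors$arrows, check.attributes = FALSE)
all.equal(r2, fit$vectors$r, check.attributes = FALSE)
all.equal(scaled, scores(fit, display = "vectors"), check.attributes = FALSE)
```

If these assumptions hold, the three comparisons should agree up to numerical tolerance, which would also explain why the chem.scores.envfit values in the question are shorter than unit length.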
Source: https://stackoverflow.com/questions/60953996/how-are-envfit-results-created