How to create factors from factanal?

旧巷少年郎 2021-02-03 11:12

When performing a factor analysis using factanal, the usual result is a loadings table plus various other information. Is there a direct way to use these loadings to create a

4 Answers
  • 2021-02-03 11:40

    Do you not want the loadings component?

    loadings(fa)
    

    See ?loadings and ?factanal to check that it is loadings you want. I find the terminology used so confusing at times, what with loadings, scores, ...
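
    To make the loadings/scores distinction concrete, here is a minimal sketch using the six-variable toy data from the ?factanal examples (the object names `m1` and `fa` are only illustrative). Loadings describe the variables, whereas scores describe the observations:

    ```r
    # Toy data from the ?factanal examples: 18 observations, 6 variables
    v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
    v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
    v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
    v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
    v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
    v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
    m1 <- cbind(v1, v2, v3, v4, v5, v6)

    fa <- factanal(m1, factors = 3, scores = "regression")

    dim(loadings(fa))  # one row per variable, one column per factor
    dim(fa$scores)     # one row per observation, one column per factor
    ```

    So if "creating factors" means one value per observation and factor, it is the scores rather than the loadings that you are after.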

  • 2021-02-03 11:46

    A similar question was asked on Psych SE.

    There, I provide a function in case you want to generate factor scores for new data.


    I wrote the following function that takes the fit object returned by factanal and new data that you provide (e.g., a data frame or matrix with identical variable names).

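    score_new_data <- function(fit, data) {
        # Standardize the new data, keeping only the variables used in the fit.
        # Note: scale() uses the means and SDs of `data` itself, not of the original sample.
        z <- as.matrix(scale(data[, row.names(fit$correlation)]))
        # Regression-method weights: solve(R, L) computes R^{-1} %*% L
        z %*% solve(fit$correlation, fit$loadings)
    }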
    

    So for example,

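    # Assumption: the bfi data set comes from the psychTools package
    # (older releases of psych shipped it directly).
    data(bfi, package = "psychTools")
    bfi <- na.omit(bfi)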
    variables <- c("A1", "A2", "A3", "A4", "C1", "C2", "C3", "C4")
    data <- bfi[,variables]
    fit <- factanal(data, factors = 2, scores = "regression", rotation = "varimax")
    

    This is a typical factor analysis.

    And now supply some new data along with the fit of the factor analysis:

    score_new_data(fit, data[1:5, ])
    

    And it generates the following:

    > score_new_data(fit, data[1:5, ])
             Factor1    Factor2
    61623  1.5022427  0.5457393
    61629 -0.6817812 -0.9755466
    61634 -0.2901822  0.1051234
    61640  0.5429929 -0.4955180
    61661 -1.0732722  0.8202019
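
    One caveat with the function above: `scale()` standardizes the new rows using their own means and standard deviations. If new observations should instead be placed on the scale of the original sample, a variant that reuses the training means and SDs may be preferable. The following is only a sketch under that assumption; `score_with_training_scale` and its `train` argument are hypothetical names, not part of factanal, and the toy data from ?factanal is used so that the example runs without extra packages:

    ```r
    score_with_training_scale <- function(fit, train, newdata) {
        vars <- row.names(fit$correlation)
        train <- as.matrix(train[, vars, drop = FALSE])
        # Standardize the new rows with the training means and standard deviations
        z <- scale(as.matrix(newdata[, vars, drop = FALSE]),
                   center = colMeans(train),
                   scale = apply(train, 2, sd))
        # Same regression-method weights as in score_new_data()
        z %*% solve(fit$correlation, unclass(fit$loadings))
    }

    # Toy data from the ?factanal examples
    v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
    v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
    v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
    v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
    v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
    v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
    m1 <- cbind(v1, v2, v3, v4, v5, v6)

    fit <- factanal(m1, factors = 3, scores = "regression")

    # Sanity check: scoring the training data itself should reproduce fit$scores
    scored <- score_with_training_scale(fit, m1, m1)
    all.equal(fit$scores, scored, check.attributes = FALSE)

    # A single new observation can be scored as well
    score_with_training_scale(fit, m1, m1[1, , drop = FALSE])
    ```

    The sketch assumes an orthogonal rotation such as varimax; it does not handle the extra factor-covariance step that an oblique rotation would need.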
    
  • 2021-02-03 11:51

    I haven't checked it manually, but here's a way to do it:

    fa <-  factanal(mydf,3,rotation="varimax",scores="regression")
    fa$scores
    

    HTH someone else. Suggestions, corrections, improvements welcome!
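
    If the goal is to have the factors as new columns next to the original variables, the scores can simply be bound to the data. A minimal sketch, assuming `mydf` is a numeric data frame without missing values; here a stand-in `mydf` is built from the ?factanal toy data, and `mydf_scored` is an illustrative name:

    ```r
    # Stand-in for mydf, built from the ?factanal examples
    mydf <- data.frame(
        v1 = c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6),
        v2 = c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5),
        v3 = c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6),
        v4 = c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4),
        v5 = c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5),
        v6 = c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
    )

    fa <- factanal(mydf, 3, rotation = "varimax", scores = "regression")

    # fa$scores has one row per row of mydf, so the columns line up directly
    mydf_scored <- cbind(mydf, fa$scores)
    head(mydf_scored)
    ```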

  • 2021-02-03 11:57

    You asked how to use the loadings to construct scores. Your solution, although correct, does not do that. It uses a regression method (alternatively, you can use Bartlett's method), which imposes the restriction that the scores are uncorrelated, centered around 0, and have variance 1. These are hence not the same factors as one would obtain from F = ML, with F the factor matrix, M the original matrix and L the loading matrix.

    A demonstration with the example from the help files:

    v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
    v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
    v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
    v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
    v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
    v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
    m1 <- cbind(v1,v2,v3,v4,v5,v6)
    
    fa <- factanal(m1, factors=3,scores="regression")
    
    fa$scores # the correct solution
    
    fac <- m1 %*% loadings(fa) # the answer to your question
    

    These are clearly different values.

    Edit: The difference arises because Thomson's regression scores are based on scaled variables and take the correlation matrix into account. If you calculated the scores by hand, you would do:

    > fac2 <- scale(m1) %*% solve(cor(m1)) %*% loadings(fa)
    > all.equal(fa$scores,as.matrix(fac2))
    [1] TRUE
    

    For more information, see this review

    And to show why this matters: if you calculate the scores the "naive" way, they are actually correlated, which is what you wanted to get rid of in the first place:

    > round(cor(fac),2)
            Factor1 Factor2 Factor3
    Factor1    1.00    0.79    0.81
    Factor2    0.79    1.00    0.82
    Factor3    0.81    0.82    1.00
    
    > round(cor(fac2),2)
            Factor1 Factor2 Factor3
    Factor1       1       0       0
    Factor2       0       1       0
    Factor3       0       0       1
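
    The same kind of by-hand check can be done for Bartlett's method, mentioned at the start of this answer. This is a sketch assuming the weighted least-squares form F = Z P L (L' P L)^-1, with Z the standardized data, L the loadings and P the inverse of the diagonal matrix of uniquenesses; the names `fb`, `Z`, `L`, `psi_inv` and `bart` are illustrative only:

    ```r
    # Toy data from the ?factanal examples
    v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
    v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
    v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
    v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
    v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
    v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
    m1 <- cbind(v1, v2, v3, v4, v5, v6)

    fb <- factanal(m1, factors = 3, scores = "Bartlett")

    Z <- scale(m1)                         # standardized variables
    L <- unclass(loadings(fb))             # plain loading matrix (variables x factors)
    psi_inv <- diag(1 / fb$uniquenesses)   # inverse of the diagonal uniqueness matrix

    # Weighted least-squares (Bartlett) scores computed by hand
    bart <- Z %*% psi_inv %*% L %*% solve(t(L) %*% psi_inv %*% L)

    # Compare with the scores factanal computed itself
    all.equal(fb$scores, bart, check.attributes = FALSE)
    ```

    Unlike the regression scores, Bartlett scores are not constrained to be mutually uncorrelated, so `cor(bart)` need not be an identity matrix.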
    