I tried searching for this but could not find the info. I am conducting a linear regression using 10 variables (1 y variable and 9 x variables). All the variables are correlated.
So it sounds like you are facing a model selection problem: you want to choose the best variables without overfitting, correct?
PCA may not be the way to go for feature selection; here's one discussion of why:
https://stats.stackexchange.com/questions/27300/using-pca-for-feature-selection
The usual purpose of PCA is dimensionality reduction, i.e. describing the relationships in your data using fewer dimensions than are actually present. A component that explains a lot of variance could make a good feature, but not necessarily: PCA is not geared towards that purpose.
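To make that concrete, here is a minimal sketch using R's built-in prcomp on the numeric iris columns (the same data as the example below). It shows that each principal component is a mixture of all the original variables, so keeping a high-variance component is not the same as selecting a subset of features:

pcs <- prcomp(iris[, 1:4], scale. = TRUE)
summary(pcs)    # proportion of variance explained by each component
pcs$rotation    # loadings: every PC mixes all four original variables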
If what you want to do is pare down the number of features in your model, I would suggest using an information criterion like the AIC. You can easily do this in R with the stepAIC function from MASS, like so:
library(MASS)
# Fit the full model: all main effects plus all two-way interactions
fit <- lm(Sepal.Length ~ .^2, data = iris)
# Backward stepwise selection, dropping the term that most improves AIC at each step
step <- stepAIC(fit, direction = "backward")
step$anova
>> Stepwise Model Path
>> Analysis of Deviance Table
>>
>> Initial Model:
>> Sepal.Length ~ (Sepal.Width + Petal.Length + Petal.Width + Species)^2
>>
>> Final Model:
>> Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species +
>> Sepal.Width:Petal.Width + Petal.Length:Species + Petal.Width:Species
At each step it trims out another feature, minimizing the AIC. There is a lot more that goes into model selection, and a lot of things to consider and adjust, so this is not a prescriptive guide; I just wanted to bring it up as something to consider.
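As a quick sanity check on the result, you can compare the AIC of the full and reduced models directly (a follow-up to the snippet above; both objects are ordinary lm fits, so base R's AIC works on them):

AIC(fit)    # full model with all two-way interactions
AIC(step)   # reduced model chosen by stepAIC; should be lower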