I have a set of observations with 23 variables.
When I use prcomp and biplot to plot the results I run into several problems:
the actual plot only occup
I think you can use xlim and ylim. Also, have a look at the expand argument in ?biplot. Unfortunately, you did not provide any data, so let's take some sample data:
a <- princomp(USArrests)
Below is the result of just calling biplot:
biplot(a)
Now one can "zoom in" for a closer look at "Murder" and "Rape" using xlim and ylim, together with the scaling argument expand from ?biplot:
biplot(a, expand=10, xlim=c(-0.30, 0.0), ylim=c(-0.1, 0.1))
Please note the different scaling on the top and right axes due to the expand factor.
Does this help to make your plot more readable?
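Since the question mentions prcomp rather than princomp, here is a minimal sketch of the same zoom applied to a prcomp object. The limits and the cex values are assumptions chosen for illustration and will need adjusting for your own data:

```r
# Sketch only: zoom a base-R biplot of a prcomp fit.
# xlim/ylim and cex below are illustrative values, not recommendations.
fit_zoom <- prcomp(USArrests, scale. = TRUE)

biplot(fit_zoom,
       xlim = c(-0.3, 0.3),   # horizontal window on the score axes
       ylim = c(-0.3, 0.3),   # vertical window on the score axes
       cex  = c(0.6, 0.9))    # smaller observation labels, larger variable labels
```

Shrinking the observation labels with cex often does as much for readability as the zoom itself.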
EDIT
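You also asked whether it is possible to have different colors for labels and arrows. biplot does not support this directly: its col argument takes one color per set of points (observations and variables), so the variable labels and their arrows share a color. What you could do is copy the code of stats:::biplot.default and change it according to your needs (change the col argument in the calls to plot, axis, text and arrows).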
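Before copying the function, it may be worth checking whether the built-in col argument is already enough. A small sketch, with arbitrarily chosen colors:

```r
# Sketch only: the two colors are arbitrary choices.
pca_col <- princomp(USArrests)

# col[1] is used for the observation labels; col[2] is used for the
# variable labels, their arrows, and the top/right axes.
biplot(pca_col, col = c("grey40", "red"))
```

This separates observations from variables, but the arrows and the variable labels still share the second color.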
Alternatively, you could use ggplot2 for the biplot. In the post here, a simple biplot function is implemented. You could change the code as follows:
library(ggplot2)  # also re-exports arrow() and unit() from grid

PCbiplot <- function(PC, x = "PC1", y = "PC2",
                     colors = c("black", "black", "red", "red")) {
  # PC is a prcomp object.
  # colors: observation labels, axis lines, variable labels, arrows
  data <- data.frame(obsnames = row.names(PC$x), PC$x)
  # linewidth needs ggplot2 >= 3.4; use size = .2 on older versions
  plot <- ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_text(aes(label = obsnames), alpha = .4, size = 3, color = colors[1]) +
    geom_hline(yintercept = 0, linewidth = .2, color = colors[2]) +
    geom_vline(xintercept = 0, linewidth = .2, color = colors[2])
  datapc <- data.frame(varnames = rownames(PC$rotation), PC$rotation)
  # Ratio of score range to loading range, so the arrows fit inside the plot
  mult <- min(
    (max(data[, y]) - min(data[, y])) / (max(datapc[, y]) - min(datapc[, y])),
    (max(data[, x]) - min(data[, x])) / (max(datapc[, x]) - min(datapc[, x]))
  )
  datapc$v1 <- .7 * mult * datapc[[x]]
  datapc$v2 <- .7 * mult * datapc[[y]]
  plot +
    coord_equal() +
    geom_text(data = datapc, aes(x = v1, y = v2, label = varnames),
              size = 5, vjust = 1, color = colors[3]) +
    geom_segment(data = datapc, aes(x = 0, y = 0, xend = v1, yend = v2),
                 arrow = arrow(length = unit(0.2, "cm")),
                 alpha = 0.75, color = colors[4])
}
Plot as follows:
fit <- prcomp(USArrests, scale. = TRUE)
PCbiplot(fit, colors=c("black", "black", "red", "yellow"))
If you play around a bit with this function, I am sure you can figure out how to set xlim and ylim values, etc.
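As a starting point, here is a minimal sketch of one way to restrict the visible range in ggplot2: passing xlim and ylim to coord_equal(). It plots only the scores so that it stands on its own, and the limits are illustrative assumptions:

```r
library(ggplot2)

# Sketch only: limits are arbitrary; adjust them to your data.
fit_gg <- prcomp(USArrests, scale. = TRUE)
scores <- data.frame(obsnames = rownames(fit_gg$x), fit_gg$x)

p <- ggplot(scores, aes(x = PC1, y = PC2, label = obsnames)) +
  geom_text(size = 3, alpha = .6) +
  # coord_equal() keeps a 1:1 aspect ratio and zooms without dropping data
  coord_equal(xlim = c(-1, 1), ylim = c(-1, 1))

print(p)
```

Setting the limits on the coordinate system zooms the view while keeping all points in the data, whereas xlim() and ylim() used as scales would discard points outside the range.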