R - how to make PCA biplot more readable

旧时难觅i 2021-01-31 00:06

I have a set of observations with 23 variables.

When I use prcomp and biplot to plot the results I run into several problems:

  1. the actual plot only occup

1 answer
  • 2021-01-31 00:44

    I think you can use xlim and ylim. Also, have a look at the expand argument for ?biplot. Unfortunately, you did not provide any data, so let's take some sample data:

    a <- princomp(USArrests)
    

    Below is the result of just calling biplot:

    biplot(a)
    

    [Image: output of biplot(a)]

    And now one can "zoom in" to have a closer look at "Murder" and "Rape" using xlim and ylim and also use the scaling argument expand from ?biplot:

    biplot(a, expand=10, xlim=c(-0.30, 0.0), ylim=c(-0.1, 0.1))
    

    [Image: zoomed-in biplot]

    Please note the different scaling on the top and right axis due to the expand factor.
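
    As a sketch of a few more options documented in ?biplot: cex and col each accept a vector of length 2 (observations first, variables second), and xlabs replaces the observation labels. The specific sizes, colors and the "o" symbol below are arbitrary choices for illustration, not values taken from the question:

```r
a <- princomp(USArrests)

# Shrink the observation labels, keep the variable labels at full size,
# and give each of the two sets its own colour.
biplot(a, cex = c(0.5, 1), col = c("grey40", "red"))

# With many observations, replace their labels with a small symbol
# so the variable arrows and names stay readable.
biplot(a, xlabs = rep("o", nrow(USArrests)),
       cex = c(0.6, 1), col = c("grey40", "red"))
```

    These arguments can be combined with xlim, ylim and expand in the same call.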

    Does this help to make your plot more readable?

    EDIT

    You also asked whether it is possible to use different colors for labels and arrows. biplot does not support this: its col argument sets one color per set of points, so the variable labels and their arrows share a color. What you could do is copy the code of stats:::biplot.default and change it according to your needs (change the col argument where plot, axis and text are used).
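
    If copying biplot.default feels heavy, a rough base-graphics alternative is to draw the pieces yourself. This is only a sketch: the 0.7 scaling factor, the colors, and the names scores, loadings, mult and arrow_ends are my own assumptions, and the scaling is not the one biplot() uses, so the picture will not match it exactly:

```r
fit <- prcomp(USArrests, scale. = TRUE)
scores   <- fit$x[, 1:2]         # observations on PC1/PC2
loadings <- fit$rotation[, 1:2]  # variable directions on PC1/PC2

# Stretch the loadings so the arrows fill a good part of the score range
# (an ad hoc choice, not the scaling used by biplot()).
mult <- 0.7 * min(diff(range(scores[, 1])) / diff(range(loadings[, 1])),
                  diff(range(scores[, 2])) / diff(range(loadings[, 2])))
arrow_ends <- loadings * mult

# Empty frame first, then each layer with its own colour.
plot(scores, type = "n", asp = 1, xlab = "PC1", ylab = "PC2")
text(scores, labels = rownames(scores), cex = 0.6, col = "grey40")
arrows(0, 0, arrow_ends[, 1], arrow_ends[, 2], length = 0.08, col = "steelblue")
text(arrow_ends * 1.1, labels = rownames(loadings), col = "red")
```

    Because every layer is a separate call, observation labels, arrows and variable labels can each get their own col and cex.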

    Alternatively, you could use ggplot2 for the biplot. In the post here, a simple biplot function is implemented. You could change the code as follows:

    library(ggplot2)
    library(grid)  # arrow() and unit(); must be loaded explicitly with older ggplot2

    PCbiplot <- function(PC, x="PC1", y="PC2", colors=c('black', 'black', 'red', 'red')) {
        # PC is a prcomp object
        # colors: observation labels, axis lines, variable labels, arrows
        data <- data.frame(obsnames=row.names(PC$x), PC$x)
        plot <- ggplot(data, aes(x=.data[[x]], y=.data[[y]])) + labs(x=x, y=y)
        plot <- plot + geom_text(alpha=.4, size=3, aes(label=obsnames), color=colors[1])
        plot <- plot + geom_hline(yintercept=0, size=.2, color=colors[2]) + geom_vline(xintercept=0, size=.2, color=colors[2])
        datapc <- data.frame(varnames=rownames(PC$rotation), PC$rotation)
        # scale the loadings so that the arrows fit inside the range of the scores
        mult <- min(
            (max(data[,y]) - min(data[,y])) / (max(datapc[,y]) - min(datapc[,y])),
            (max(data[,x]) - min(data[,x])) / (max(datapc[,x]) - min(datapc[,x]))
            )
        datapc$v1 <- .7 * mult * datapc[,x]
        datapc$v2 <- .7 * mult * datapc[,y]
        plot <- plot + coord_equal() + geom_text(data=datapc, aes(x=v1, y=v2, label=varnames), size = 5, vjust=1, color=colors[3])
        plot <- plot + geom_segment(data=datapc, aes(x=0, y=0, xend=v1, yend=v2), arrow=arrow(length=unit(0.2,"cm")), alpha=0.75, color=colors[4])
        plot
    }
    

    Plot as follows:

    fit <- prcomp(USArrests, scale. = TRUE)
    PCbiplot(fit, colors=c("black", "black", "red", "yellow"))
    

    [Image: output of PCbiplot]

    If you play around a bit with this function, I am sure you can figure out how to set xlim and ylim values, etc.
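
    For instance, zooming could look roughly like the sketch below. To stay self-contained it builds a simplified stand-in plot p instead of calling PCbiplot, and the limits c(-1.5, 1.5) are arbitrary. Since the plot already uses coord_equal(), adding a second coordinate system replaces the first one, and ggplot2 prints a message saying so:

```r
library(ggplot2)

fit <- prcomp(USArrests, scale. = TRUE)
scores <- data.frame(obsnames = rownames(fit$x), fit$x)

# Simplified stand-in for the plot returned by PCbiplot().
p <- ggplot(scores, aes(x = PC1, y = PC2, label = obsnames)) +
    geom_text(size = 3) +
    coord_equal()

# Zoom by setting limits on the coordinate system: points outside the
# window are clipped from view but not removed from the data.
p_zoom <- p + coord_equal(xlim = c(-1.5, 1.5), ylim = c(-1.5, 1.5))
print(p_zoom)
```

    Using p + xlim(...) + ylim(...) instead sets the scale limits, which drops everything outside the range before drawing, so labels and arrows that cross the boundary disappear entirely.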
