Plotting RDA (vegan) in ggplot

Posted by 寵の児 on 2020-01-13 19:18:11

Question


I'm still new to R and trying to learn the vegan package. I can easily plot its results with the base plot function; the problem arises when I want to plot them with ggplot2. I know I have to extract the right data from the list that rda() returns, but which parts, and how? The dataset I've been practicing on can be downloaded here: https://drive.google.com/file/d/0B1PQGov60aoudVR3dVZBX1VKaHc/view?usp=sharing The code I've been using to transform the data is:

library(vegan)
library(dplyr)
library(ggplot2)
library(grid)
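# Read the data; treat "NA", "-" and "?" as missing values
data <- read.csv(file = "People.csv", header = TRUE, sep = ",", dec = ".", check.names = FALSE, na.strings = c("NA", "-", "?"))
# Use the first column as row names and drop it from the data
data2 <- data[,-1]
rownames(data2) <- data[,1]
# Center each column and scale it to unit standard deviation
data2 <- scale(data2, center = TRUE, scale = apply(data2, 2, sd))
# rda() without constraints performs a PCA
data2.pca <- rda(data2)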

This gives me a list that I can plot with the base plot() and biplot() functions, but I am at a loss as to how to produce both the PCA plot and the biplot in ggplot2. I would also like to color the data points by group, e.g. sex. Any help would be great.


Answer 1:


There is a ggbiplot(...) function in package ggbiplot, but it only works with objects of class prcomp, princomp, PCA, or lda.

plot.rda(...) just locates each case (person) in PC1 - PC2 space. biplot.rda(...) adds vectors to the PC1 and PC2 loadings for each variable in the original dataset. It turns out that plot.rda(...) and biplot.rda(...) use the data produced by summarizing the rda object, not the rda object itself.

smry <- summary(data2.pca)
df1  <- data.frame(smry$sites[,1:2])       # PC1 and PC2
df2  <- data.frame(smry$species[,1:2])     # loadings for PC1 and PC2
rda.plot <- ggplot(df1, aes(x=PC1, y=PC2)) + 
  geom_text(aes(label=rownames(df1)),size=4) +
  geom_hline(yintercept=0, linetype="dotted") +
  geom_vline(xintercept=0, linetype="dotted") +
  coord_fixed()
rda.plot

rda.biplot <- rda.plot +
  geom_segment(data=df2, aes(x=0, xend=PC1, y=0, yend=PC2), 
               color="red", arrow=arrow(length=unit(0.01,"npc"))) +
  geom_text(data=df2, 
            aes(x=PC1,y=PC2,label=rownames(df2),
                hjust=0.5*(1-sign(PC1)),vjust=0.5*(1-sign(PC2))), 
            color="red", size=4)
rda.biplot

If you compare these results to plot(data2.pca) and biplot(data2.pca), I think you'll see they are the same. Believe it or not, the hardest part by far is getting the text to align properly with respect to the arrows.
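The question also asks about coloring points by group, which the code above does not cover. A minimal sketch of one way to do it follows. It uses the dune and dune.env example data shipped with vegan rather than the question's People.csv, and the Management factor is only a stand-in for a column such as sex; with your own data the grouping column would have to come from the original data frame, with rows in the same order as the ordination input.

```r
library(vegan)
library(ggplot2)

data(dune)      # example community data shipped with vegan
data(dune.env)  # matching site data, including a Management factor

pca <- rda(dune, scale = TRUE)   # unconstrained RDA, i.e. a PCA

# Site scores on the first two axes, as a data frame for ggplot2
sites <- as.data.frame(scores(pca, display = "sites", choices = 1:2))
sites$group <- dune.env$Management   # stand-in for a column such as sex

# Same layout as the plot above, with points colored by group
grouped.plot <- ggplot(sites, aes(x = PC1, y = PC2, colour = group)) +
  geom_point(size = 3) +
  geom_hline(yintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 0, linetype = "dotted") +
  coord_fixed()
grouped.plot
```

Mapping colour inside aes() lets ggplot2 build the legend automatically; the arrow and label layers from rda.biplot can be added on top in the same way, as long as they keep their fixed color="red" outside aes().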




Answer 2:


You can use my ggvegan package for this. It is still in development, though it is usable for some classes of objects, including rda and cca ones.

With the example data and analysis from the question, you can simply do:

autoplot(data2.pca, arrows = TRUE)

to get the sort of biplot you want.

You can get site labels via

autoplot(data2.pca, arrows = TRUE, geom = "text", legend = "none")

which also shows how to suppress the legend if required (legend.position takes values suitable for the same theme element in ggplot2).

You don't have a huge amount of control over the look of things with autoplot() methods (yet!), but you can use fortify() to get the data the way ggplot2 requires it and then use ideas from the other answers, or study the code for ggvegan:::autoplot.rda for the specifics.

You need to install ggvegan from github as the package is not yet on CRAN:

install.packages("devtools")
devtools::install_github("gavinsimpson/ggvegan")

which will get you version 0.0-6 (or later) which includes some minor tweaks to produce neater plots than previous versions.




Answer 3:


According to @jlhoward, you can use ggbiplot from the package of the same name. The only thing you then need to do is cast your rda result to a prcomp object, which ggbiplot understands. Here is a function that does that:

#' Cast vegan::rda Result to base::prcomp
#'
#' Function casts a result object of unconstrained
#' \code{\link[vegan]{rda}} to a \code{\link{prcomp}} result object.
#'
#' @param x An unconstrained \code{\link[vegan]{rda}} result object.
#'
#' @importFrom vegan scores
#' @export
`as.prcomp.rda` <-
    function(x)
{
    if (!is.null(x$CCA) || !is.null(x$pCCA))
        stop("works only with unconstrained rda")
    structure(
        list(sdev = sqrt(x$CA$eig),
             rotation = x$CA$v,
             center = attr(x$CA$Xbar, "scaled:center"),
             scale = if(!is.null(scl <- attr(x$CA$Xbar, "scaled:scale")))
                         scl
                     else
                         FALSE,
             x = scores(x, display = "sites", scaling = 1,
             choices = seq_len(x$CA$rank),
             const = sqrt(x$tot.chi * (nrow(x$CA$u)-1)))),
        class = "prcomp")
}
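As a rough sanity check on the sdev line of the function above, the following sketch compares the square roots of the rda() eigenvalues with the standard deviations reported by prcomp() on the same data. It uses the dune example data shipped with vegan, not the question's dataset, and it checks only sdev, not the rotation or the site scores.

```r
library(vegan)

data(dune)  # example data shipped with vegan, not the question's dataset

pca.rda    <- rda(dune)      # unconstrained RDA
pca.prcomp <- prcomp(dune)   # base R PCA on the same data

# rda() keeps only the axes with non-zero eigenvalues, so compare
# against the leading prcomp components only.
eig <- pca.rda$CA$eig
sdev.match <- all.equal(unname(sqrt(eig)), pca.prcomp$sdev[seq_along(eig)])
sdev.match
```

If the two agree, the converted object should carry the same component variances that ggbiplot would see from a native prcomp fit.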


Source: https://stackoverflow.com/questions/32194193/plotting-rda-vegan-in-ggplot
