Question
I would like to generate unrooted neighbour-joining trees from input haplotype data and then colour the branches of the trees based on a variable. I am using the packages ape and ggtree. The haplotypes and covariates (metadata) are in two separate files with matching sample names. I have been able to produce trees and colour the tips of the trees by variables, but not the tree branches.
Using mock data:
# Packages
library('ggplot2')
library('ape')
library('phangorn')
library('dplyr')
library('ggtree')
library('phylobase')
# Generate haplotype dataframe
Sample <- c('Sample_A', 'Sample_B', 'Sample_C', 'Sample_D', 'Sample_E', 'Sample_F')
SNP_A <- c(0, 1, 1, 0, 1, 1)
SNP_B <- c(0, 1, 1, 0, 1, 1)
SNP_C <- c(0, 0, 1, 1, 1, 0)
SNP_D <- c(1, 1, 0, 0, 1, 0)
SNP_E <- c(0, 0, 1, 1, 0, 1)
SNP_F <- c(0, 0, 1, 1, 0, 1)
df = data.frame(Sample, SNP_A, SNP_B, SNP_C, SNP_D, SNP_E, SNP_F, row.names=c(1))
df
# Metadata
Factor_A <- c('a', 'a', 'b', 'c', 'a', 'b')
Factor_B <- c('d', 'e', 'd', 'd', 'e', 'd')
df2 = data.frame(Sample, Factor_A, Factor_B)
df2
# Generate Euclidean pairwise distance matrix
pdist = dist(as.matrix(df), method = "euclidean")
# Turn pairwise distance matrix into phylo via neighbour joining method
phylo_nj <- nj(pdist)
I can plot the tree in ape:
# Example tree plot using ape
plot(unroot(phylo_nj),
type="unrooted",
cex=1,
use.edge.length=TRUE,
show.tip.label = TRUE,
lab4ut="axial",
edge.width=1.5)
And I can plot the tree in ggtree, mapping variables to the colour and shape of the tip points:
# Plotting in ggtree
mytree <- ggtree(phylo_nj, layout="equal_angle", size=0.5, linetype=1)
mytree
# Adding metadata variables to tree plot
mytree2 <- mytree %<+% df2 + geom_tippoint(aes(shape = Factor_A,
colour = Factor_B),
size = 9,
alpha=0.7)
mytree2
But I can't work out how to colour the branches by a variable (rather than the tip points) in either ape or ggtree. I only want the terminal branches coloured, not all of the lines of the tree. My aim is to display two categorical variables: one by branch colour and one by the shape (or colour) of the tip. A crude version of what I'm after would look something like the image below, with Factor_A coded by tip shape (neutral colour as shown) and Factor_B coded by branch colour.
Thanks in advance for the help.
Answer 1:
You can use the function ape::edges after you plot the tree with ape::plot.phylo to colour specific edges, giving the start and end nodes of the edge to colour.
## Colouring the edge between nodes 7 and 8 with a red dashed line
plot(unroot(phylo_nj), type = "unrooted")
edges(7, 8, col = "red", lty = 2)
Or you can provide a vector of colours directly in the ape::plot.phylo function:
## Making rainbow edges
plot(unroot(phylo_nj), type = "unrooted", edge.color = rainbow(9))
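The colours passed to edge.color are taken in the same order as the rows of the tree's edge matrix, so the vector should have one entry per edge (9 for this six-tip unrooted tree). A minimal sketch that rebuilds the mock tree so it runs on its own; the names snps, tree, n_edges and is_terminal are illustrative and not part of the original code:

```r
library(ape)

# Rebuild the mock tree from the question so this block runs on its own
snps <- data.frame(
  SNP_A = c(0, 1, 1, 0, 1, 1),
  SNP_B = c(0, 1, 1, 0, 1, 1),
  SNP_C = c(0, 0, 1, 1, 1, 0),
  SNP_D = c(1, 1, 0, 0, 1, 0),
  SNP_E = c(0, 0, 1, 1, 0, 1),
  SNP_F = c(0, 0, 1, 1, 0, 1),
  row.names = paste0("Sample_", LETTERS[1:6])
)
tree <- nj(dist(as.matrix(snps), method = "euclidean"))

# Number of edges, i.e. number of rows in tree$edge
n_edges <- Nedge(tree)

# Terminal edges are the rows whose second column is a tip number (1..Ntip)
is_terminal <- tree$edge[, 2] <= Ntip(tree)

# One colour per edge, in edge-matrix order
plot(tree, type = "unrooted", edge.color = rainbow(n_edges))
```

Using Nedge(tree) instead of a hard-coded 9 keeps the colour vector the right length if the number of samples changes.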
You can find out which edges to colour from your data frame by using the edge table in the phylo object (phylo_nj$edge). For example:
## Which samples have level "a"?
labels_a <- df2$Factor_A %in% "a"
## Which edges lead to these tips? match() gives each sample's position in tip.label, i.e. its tip number
edge_a <- phylo_nj$edge[,2] %in% match(df2$Sample[labels_a], phylo_nj$tip.label)
## Plotting with the level "a" edges in orange and all others in blue (edge_a + 1 turns FALSE/TRUE into indices 1/2)
plot(unroot(phylo_nj), type = "unrooted", edge.color = c("blue", "orange")[edge_a+1])
You can of course expand that to multiple levels by following this method to detect which edge leads to a tip with any factor level.
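As a rough sketch of that extension, the block below colours only the terminal branches by Factor_B and leaves internal branches in a neutral grey. It rebuilds the mock data so it runs on its own; the palette palette_b, the grey default and the names snps, meta, tree and edge_cols are assumptions made for illustration, not part of the original answer:

```r
library(ape)

# Rebuild the mock data and tree from the question
snps <- data.frame(
  SNP_A = c(0, 1, 1, 0, 1, 1),
  SNP_B = c(0, 1, 1, 0, 1, 1),
  SNP_C = c(0, 0, 1, 1, 1, 0),
  SNP_D = c(1, 1, 0, 0, 1, 0),
  SNP_E = c(0, 0, 1, 1, 0, 1),
  SNP_F = c(0, 0, 1, 1, 0, 1),
  row.names = paste0("Sample_", LETTERS[1:6])
)
meta <- data.frame(
  Sample   = rownames(snps),
  Factor_B = c("d", "e", "d", "d", "e", "d")
)
tree <- nj(dist(as.matrix(snps), method = "euclidean"))

# Hypothetical palette: one named colour per level of Factor_B
palette_b <- c(d = "steelblue", e = "darkorange")

# Start with a neutral colour for every edge (internal edges keep it)
edge_cols <- rep("grey40", Nedge(tree))

# Rows of the edge matrix that end at a tip
terminal <- which(tree$edge[, 2] <= Ntip(tree))

# Look up the tip label at the end of each terminal edge, then its level
tip_names  <- tree$tip.label[tree$edge[terminal, 2]]
tip_levels <- as.character(meta$Factor_B[match(tip_names, meta$Sample)])
edge_cols[terminal] <- unname(palette_b[tip_levels])

plot(tree, type = "unrooted", edge.color = edge_cols, edge.width = 1.5)
legend("topleft", legend = names(palette_b), col = palette_b, lty = 1, bty = "n")
```

Looking levels up by tip label rather than by row position means the colours stay correct even if the metadata rows are in a different order from the tree's tip labels.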
Source: https://stackoverflow.com/questions/60285367/how-to-colour-the-branches-of-an-unrooted-tree-using-a-variable-in-r