Question
I have 22 matrices with the same number of rows (691) but different numbers of columns (22 to 25). I need to add the values found in the same row and the same-named column across all of the matrices, producing a single 691 x 25 matrix.
fullanno1 has 691 rows & 25 columns:
>colnames(fullanno1)
[1] "coding-notMod3" "coding-synonymous" "coding-synonymous-near-splice"
[4] "intergenic" "intron" "missense"
[7] "missense-near-splice" "near-gene-3" "near-gene-5"
[10] "splice-3" "splice-5" "stop-gained"
[13] "stop-gained-near-splice" "stop-lost" "utr-3"
[16] "utr-5" "CTCF" "E"
[19] "None" "PF" "R"
[22] "T" "TSS" "WE"
[25] "coding-notMod3-near-splice"
fullanno2 has 691 rows and 22 columns:
>colnames(fullanno2)
[1] "coding-synonymous" "coding-synonymous-near-splice" "intergenic"
[4] "intron" "missense" "missense-near-splice"
[7] "near-gene-3" "near-gene-5" "splice-3"
[10] "splice-5" "stop-gained" "stop-lost"
[13] "utr-3" "utr-5" "CTCF"
[16] "E" "None" "PF"
[19] "R" "T" "TSS"
[22] "WE"
Each matrix is a numeric (double) matrix. How can I add these two matrices so that I get a third matrix with dimensions 691 x 25? Because fullanno2 is three columns short, the result should hold only the values from the first matrix in those columns.
My approach: Take a setdiff of the colnames to get columns that are not present in the smaller matrix, cbind them to the smaller matrix with 0s as values. Then add the two matrices.
> column.names <- setdiff(colnames(fullanno1), colnames(fullanno2))
> column.names
[1] "coding-notMod3" "stop-gained-near-splice" "coding-notMod3-near-splice"
> column <- 0
> fullanno2 <- cbind(fullanno2, column)
> colnames(fullanno2)[23] <- column.names[1]
> fullanno2 <- cbind(fullanno2, column)
> colnames(fullanno2)[24] <- column.names[2]
> fullanno2 <- cbind(fullanno2, column)
> colnames(fullanno2)[25] <- column.names[3]
But this is getting tedious for all the matrices. Any suggestions?
Answer 1:
So you want to sum all of the matrices to end up with one matrix? A simple, though probably slower, way (unlikely to matter with matrices of your size) is to use the plyr and reshape2 libraries. You could start with a list of your matrices:
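library(plyr)      # ldply
library(reshape2)  # melt, acast
make.matrix <- function() {
  # Pick between 2 and 11 distinct month names to use as column names
  cols <- sample(month.name, floor(runif(1, 2, 12)))
  matrix(rnorm(length(cols)*10), 10, length(cols), dimnames=list(NULL, cols))
}
# Make 10 matrices filled with random numbers, having
# varying numbers of columns named after months.
# simplify=FALSE guarantees a list even if all matrices happen to have the same size
my.matrices <- replicate(10, make.matrix(), simplify=FALSE)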
Then you can melt all the matrices into one big data frame:
matrix.df <- ldply(my.matrices, melt, varnames=c("row", "col"))
head(matrix.df)
# row col value
# 1 1 February -0.4239145
# 2 2 February 1.1773608
# 3 3 February -2.9565403
# 4 4 February 0.3955096
# 5 5 February -0.3784917
# 6 6 February -0.6234579
and then cast it back into a matrix, summing the values that share a row and a column name:
sum.matrix <- acast(matrix.df, row ~ col, sum)
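For readers without plyr and reshape2 installed, the same melt-then-cast idea can be sketched in base R. This is only an illustration, not what the code above does internally. The helper name to.long and the two small example matrices are made up for the sketch, and it assumes every matrix has the same number of rows and carries column names.

```r
# Hypothetical helper: turn one matrix into long format
# (row index, column name, value), one line per cell
to.long <- function(m) {
  data.frame(row   = as.vector(row(m)),
             col   = colnames(m)[as.vector(col(m))],
             value = as.vector(m),
             stringsAsFactors = FALSE)
}

# Two small example matrices with partly overlapping column names
example.matrices <- list(
  matrix(1:4, 2, 2, dimnames = list(NULL, c("x", "y"))),
  matrix(10,  2, 1, dimnames = list(NULL, "y"))
)

# Stack all long tables, then sum the values for each (row, column name) pair
long.df  <- do.call(rbind, lapply(example.matrices, to.long))
sum.long <- tapply(long.df$value, list(long.df$row, long.df$col), sum)
sum.long
```

Here column x appears only in the first matrix, so it keeps the values 1 and 2, while column y becomes 13 and 14.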
Answer 2:
You can use match along with colnames. For example:
> m1 <- matrix(1, 3, 5)
> colnames(m1) <- LETTERS[1:5]
> m2 <- matrix(1:9, 3, 3)
> colnames(m2) <- c("D", "A", "C")
> m1
     A B C D E
[1,] 1 1 1 1 1
[2,] 1 1 1 1 1
[3,] 1 1 1 1 1
> m2
     D A C
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
> m3 <- m1
> mcol <- match(colnames(m2), colnames(m1))
> m3[, mcol] <- m3[, mcol] + m2
> m3
     A B  C D E
[1,] 5 1  8 2 1
[2,] 6 1  9 3 1
[3,] 7 1 10 4 1
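The same match() idea might be extended to a whole list of matrices, such as the 22 in the question, by accumulating into a zero matrix whose columns are the union of all column names. A minimal sketch, assuming all matrices have the same number of rows and that column names are unique within each matrix; sum.by.colname and the example inputs a and b are hypothetical names, not part of the answer above:

```r
# Hypothetical helper: sum a list of matrices, aligning columns by name
sum.by.colname <- function(mats) {
  # All matrices must have the same number of rows
  stopifnot(length(unique(sapply(mats, nrow))) == 1)
  # Union of column names, in order of first appearance
  all.cols <- unique(unlist(lapply(mats, colnames)))
  # Accumulator filled with zeros, one column per distinct name
  total <- matrix(0, nrow(mats[[1]]), length(all.cols),
                  dimnames = list(rownames(mats[[1]]), all.cols))
  for (m in mats) {
    idx <- match(colnames(m), all.cols)   # where m's columns live in total
    total[, idx] <- total[, idx, drop = FALSE] + m
  }
  total
}

# Same inputs as m1 and m2 in the example above
a <- matrix(1, 3, 5, dimnames = list(NULL, LETTERS[1:5]))
b <- matrix(1:9, 3, 3, dimnames = list(NULL, c("D", "A", "C")))
result <- sum.by.colname(list(a, b))
result
```

With these inputs the result has the same values as m3 above. For the data in the question, all 22 matrices would be passed in one list; columns missing from a given matrix simply receive no contribution from it.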
Source: https://stackoverflow.com/questions/19337266/r-adding-two-matrices-with-different-dimensions-based-on-columns