Suppose I have the following matrix:
cm<-structure(c(100, 200, 400, 800, 100, 200, 400, 800, 100, 200,
400, 800, 100, 200, 400, 800, 100, 200, 400, 800,
Use apply and all.equal to compare each row against the target row. The problem with using == is that it compares element by element, recycling the shorter vector as needed, whereas you want to know whether all values in a row match a4[1,], so you should use all.equal. The catch is that when the objects differ, its return value is not a logical but a character string describing the differences, which makes it a little messier to work with than == alone:
which(apply(cm, 1, function(x) all.equal(x[1:3], a4[1,])) == "TRUE")
# [1] 1
You can also make that a bit simpler by using identical instead of all.equal:
which(apply(cm, 1, function(x) identical(x[1:3], a4[1,])))
# [1] 1
Then extract:
cm[apply(cm, 1, function(x) identical(x[1:3], a4[1,])),,drop=FALSE]
# Var1 Var2 Var3 n1
# [1,] 100 0 -0.4 1
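The difference between the two row tests shows up with floating-point values. The following is a minimal sketch on made-up data: m and target are hypothetical stand-ins for cm and a4[1,], not the original objects. It assumes the names on the target match the column names, since both identical and all.equal also compare attributes by default. Wrapping all.equal in isTRUE avoids comparing its character output against "TRUE".

```r
# Hypothetical toy data standing in for cm and a4[1,]
m <- cbind(Var1 = c(100, 200, 100),
           Var2 = c(0, 0, 0.5),
           Var3 = c(-0.4, -0.4, 0.1 + 0.2),  # 0.1 + 0.2 is not exactly 0.3
           n1   = c(1, 2, 3))
target <- c(Var1 = 100, Var2 = 0.5, Var3 = 0.3)

# Exact test: values and attributes must match bit for bit
exact <- apply(m, 1, function(x) identical(x[1:3], target))

# Tolerant test: all.equal allows small numeric differences;
# isTRUE turns its "TRUE or description" result into a plain logical
approx <- apply(m, 1, function(x) isTRUE(all.equal(x[1:3], target)))

which(exact)   # integer(0)
which(approx)  # [1] 3
```

If the values were computed rather than typed in, the tolerant version is probably the safer choice; identical is fine when the rows are known to hold exactly the same numbers.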
To clarify exactly what's happening, consider what == does implicitly when you pass a matrix argument:
which(cm[,1:3]==a4[1,])
# [1] 1 13 23 35 42 45 48 51 53 56 59
That result is the same as converting the matrix to a vector:
as.vector(cm[,1:3])
# [1] 100.0 200.0 400.0 800.0 100.0 200.0 400.0 800.0 100.0 200.0 400.0 800.0 100.0 200.0 400.0 800.0 100.0 200.0 400.0 800.0 0.0 0.0 0.0 0.0 0.5 0.5 0.5
# [28] 0.5 1.0 1.0 1.0 1.0 0.0 0.0 0.0 0.0 0.5 0.5 0.5 0.5 -0.4 -0.4 -0.4 -0.4 -0.4 -0.4 -0.4 -0.4 -0.4 -0.4 -0.4 -0.4 0.0 0.0
# [55] 0.0 0.0 0.0 0.0 0.0 0.0
which(as.vector(cm[,1:3])==a4[1,])
# [1] 1 13 23 35 42 45 48 51 53 56 59
Thus, the positions are positions within the vector representation of cm, not rows in the matrix representation. == comparisons can also be dangerous (again due to the recycling noted above) when comparing vectors of different lengths. If the longer vector's length is not a multiple of the shorter one's, R at least produces a warning:
1:2 == 1:3
# [1] TRUE TRUE FALSE
# Warning message:
# In 1:2 == 1:3 :
# longer object length is not a multiple of shorter object length
Whereas there is no warning when the longer length is an exact multiple of the shorter one, so the recycling happens silently:
1:2 == 1:6
# [1] TRUE TRUE FALSE FALSE FALSE FALSE
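If you still want a vectorized == style comparison, the recycling has to line up with the rows rather than the columns. One way to sketch that, on hypothetical toy data (m and target are made up for illustration), is to transpose the matrix first so that each original row becomes a column:

```r
# Hypothetical toy data; rows 1 and 3 both equal the target
m <- cbind(Var1 = c(100, 200, 100, 0),
           Var2 = c(0, 100, 0, -0.4),
           Var3 = c(-0.4, 0, -0.4, 100))
target <- c(100, 0, -0.4)

# Naive comparison: target is recycled down the columns of m,
# so the result is a set of vector positions, not row numbers
naive <- which(m == target)
naive  # [1] 1 5 9

# t(m) stores each original row as a column, so the recycled
# target lines up with exactly one row at a time
hits <- which(colSums(t(m) != target) == 0)
hits   # [1] 1 3
```

Note that the naive result never flags row 3 even though it matches the target, because the recycled values fall out of step with that row.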
The function row.match in prodlim is easy to use, and ideal for your problem.
library(prodlim)
row.match(a4[1,], cm[,1:3])
# [1] 1
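If adding a package is not an option, a similar lookup can be sketched in base R by turning each row into a string key and using match. The helper row_match_base below is hypothetical (it is not part of prodlim or base R), and the toy matrix m is made up for illustration. Because the keys are built from the printed form of the numbers, values that differ only beyond about 15 significant digits would be treated as equal.

```r
# Hypothetical helper: position of the first row of `table` equal to `x`,
# or NA if there is none
row_match_base <- function(x, table) {
  # Collapse a numeric vector into a single string key
  key <- function(v) paste(v, collapse = "\r")
  match(key(x), apply(table, 1, key))
}

# Hypothetical toy data
m <- cbind(c(100, 200, 400),
           c(0, 0, 0.5),
           c(-0.4, -0.4, 0))

row_match_base(c(100, 0, -0.4), m)  # [1] 1
row_match_base(c(999, 0, 0), m)     # [1] NA
```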