So I want to apply a function over a matrix in R. This works really intuitively for simple functions:
> (function(x)x*x)(matrix(1:10, nrow=2))
[,1] [,2] [,3]
One way is to use apply on both rows and columns:
apply(m,1:2,y)
[,1] [,2] [,3] [,4] [,5]
[1,] 2 NA 6 8 NA
[2,] 3 5 NA 9 11
You can also do it with subscripting because == is already vectorized:
m[m %% 3 == 0] <- NA
m <- m+1
m
[,1] [,2] [,3] [,4] [,5]
[1,] 2 NA 6 8 NA
[2,] 3 5 NA 9 11
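To compare the two approaches above side by side, here is a minimal sketch. The definitions of m and y are not shown in this part of the thread, so the ones below are assumptions reconstructed from the printed output (a 2x5 matrix of 1:10, and a function that returns NA for multiples of 3 and x + 1 otherwise).

```r
# Assumed inputs: reconstructed from the printed output, not taken from the original post
m <- matrix(1:10, nrow = 2)
y <- function(x) if (x %% 3 == 0) NA else x + 1

# Element-wise call: MARGIN = 1:2 hands y one cell at a time
by_apply <- apply(m, 1:2, y)

# Vectorized version: mark multiples of 3 first, then add 1 to everything
by_index <- m
by_index[by_index %% 3 == 0] <- NA
by_index <- by_index + 1

# Check that both routes give the same matrix
isTRUE(all.equal(by_apply, by_index))
```

The order of the two vectorized steps matters: the NA assignment has to come before the addition, because the %% test must see the original values, and NA + 1 simply stays NA.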
There's a slight refinement of Dason and Josh's solution using ifelse.
mat <- matrix(1:16, 4, 4)
ifelse(mat %% 3 == 0, NA, mat + 1)
[,1] [,2] [,3] [,4]
[1,] 2 6 NA 14
[2,] 3 NA 11 15
[3,] NA 8 12 NA
[4,] 5 9 NA 17
@Joshua Ulrich (and Dason) have a great answer, and doing it directly without the function y is the best solution. But if you really need to call a function, you can make it faster using vapply. It produces a vector without dimensions (like sapply, but faster), but you can add them back using structure:
# Your function (optimized)
y = function(x) if (x %% 3) x+1 else NA
m <- matrix(1:1e6,1e3)
system.time( r1 <- apply(m,1:2,y) ) # 4.89 secs
system.time( r2 <- structure(sapply(m, y), dim=dim(m)) ) # 2.89 secs
system.time( r3 <- structure(vapply(m, y, numeric(1)), dim=dim(m)) ) # 1.66 secs
identical(r1, r2) # TRUE
identical(r1, r3) # TRUE
As you can see, the vapply approach is about 3x faster than apply. The reason vapply is faster than sapply is that sapply must analyse the result to figure out that it can be simplified to a numeric vector. With vapply, you specify the result type (numeric(1)) up front, so it doesn't have to guess.
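Declaring the result type also works as a check, not only as a speed-up. The sketch below illustrates that with a deliberately inconsistent function; bad is a hypothetical helper made up for this example and is not part of the original question.

```r
# y as defined above; bad is a hypothetical function that sometimes returns a string
y   <- function(x) if (x %% 3) x + 1 else NA
bad <- function(x) if (x %% 3) x + 1 else "missing"

m <- matrix(1:10, nrow = 2)

# Works: a logical NA is allowed where a double is expected
ok <- vapply(m, y, numeric(1))

# sapply quietly turns the whole result into a character vector
loose <- sapply(m, bad)

# vapply refuses, because "missing" is not a numeric(1)
strict <- tryCatch(vapply(m, bad, numeric(1)), error = function(e) "error")
```

So with sapply a type mix-up may only show up later in the analysis, while vapply stops at the first element that does not match the declared template.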
UPDATE: I figured out another (shorter) way of preserving the matrix structure:
m <- matrix(1:10, nrow=2)
m[] <- vapply(m, y, numeric(1))
You simply assign the new values to the object using m[] <-. All other attributes (like dim, dimnames, class etc.) are then preserved.
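A small sketch of what "preserved" means here, using a matrix with row and column names. The dimnames and the copy named res are made up for illustration; y is the function from above.

```r
y <- function(x) if (x %% 3) x + 1 else NA

# Illustrative matrix with dimnames (the names are assumptions for this example)
m <- matrix(1:10, nrow = 2,
            dimnames = list(c("a", "b"), paste0("c", 1:5)))

# vapply alone returns a plain vector with no dim attribute
flat <- vapply(m, y, numeric(1))
is.null(dim(flat))

# Assigning into res[] fills the existing object, so dim and dimnames stay
res <- m
res[] <- flat
dimnames(res)
```

One thing to be aware of: the values are coerced as needed, so an integer matrix such as this one ends up holding doubles after the assignment.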
For this specific example you can just do something like this:
> # Create some fake data
> mat <- matrix(1:16, 4, 4)
> # Set all elements divisible by 3 to NA
> mat[mat %% 3 == 0] <- NA
> # Add 1 to all non NA elements
> mat <- mat + 1
> mat
[,1] [,2] [,3] [,4]
[1,] 2 6 NA 14
[2,] 3 NA 11 15
[3,] NA 8 12 NA
[4,] 5 9 NA 17