I have a matrix (2601 by 58) of particulate matter concentration estimates from an air quality model. Because real-life air quality monitors cannot measure below 0.1 ug/L,
A data.frame solution:
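# Install plyr if it is missing, then load it
if (!require(plyr)) {
  install.packages("plyr")
  library(plyr)
}
# colwise() turns a vector function into one that is applied to every column
rm.neg <- colwise(function(x) ifelse(x < 0.1, 0, x))
rm.neg(data.frame(mat))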
PS: the code for rm.neg can be extracted and simplified so as not to need a call to plyr, which is used to create the colwise function.
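The PS could look something like this in base R. This is only a sketch: the random `mat` is a stand-in for the real model output, and `lapply` does the column-wise work that `colwise` did.

```r
# Stand-in data; the real `mat` would be the 2601 x 58 model output.
set.seed(1)
mat <- matrix(runif(20, min = 0, max = 0.5), ncol = 4)

# Same thresholding function as before, applied to each column with lapply().
# Assigning into df[] keeps the result a data frame.
rm.neg <- function(df) {
  df[] <- lapply(df, function(x) ifelse(x < 0.1, 0, x))
  df
}

res <- rm.neg(data.frame(mat))
print(res)
```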
X[X < .1] <- 0
(or NA, although 0 sounds more appropriate in this case.)
Matrices are just vectors with dimensions, so you can treat them like a vector when you assign to them. In this case, X < .1 creates a logical matrix the same shape as X that marks the small values, and the assignment writes the right-hand side to each element that is TRUE.
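To see the two steps separately, here is a small sketch with a hand-made matrix standing in for X (the values are made up for illustration):

```r
# Small example matrix standing in for X.
X <- matrix(c(0.05, 0.30, 0.10,
              0.00, 0.09, 2.50), nrow = 2, byrow = TRUE)

small <- X < 0.1   # logical matrix, same shape as X
print(small)

X[small] <- 0      # only the TRUE positions are overwritten
print(X)
```

Note that an entry of exactly 0.1 is left alone, because the comparison is strict.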
Note that ifelse is vectorised, but it does more work than a direct logical-index assignment (it builds a full-length result and fills it from the yes and no values), so it is typically slower than the indexed equivalent. R favors vector operations; apply, mapply and sapply are convenient, but they are loops under the hood rather than true vector operations.
With small datasets this is not a problem, but if you have an array of length 100k or more, any method involving an explicit element-by-element loop gets painfully slow compared with the vectorised assignment.
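One way to check the speed claims on your own machine is to time the approaches side by side. This is a rough sketch on random stand-in data; timings vary by machine, so none are quoted here.

```r
set.seed(1)
X <- matrix(runif(1e6), ncol = 100)

# Version 1: ifelse() builds a whole new matrix.
t_ifelse <- system.time(A <- ifelse(X < 0.1, 0, X))

# Version 2: logical-index assignment on a copy.
t_index <- system.time({
  B <- X
  B[B < 0.1] <- 0
})

# Version 3: explicit double loop over every element.
C <- X
t_loop <- system.time(
  for (i in seq_len(nrow(C))) {
    for (j in seq_len(ncol(C))) {
      if (C[i, j] < 0.1) C[i, j] <- 0
    }
  }
)

# Elapsed seconds for each approach.
print(c(ifelse = t_ifelse[["elapsed"]],
        index  = t_index[["elapsed"]],
        loop   = t_loop[["elapsed"]]))
```

All three produce the same matrix, so the choice is purely about speed and readability.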
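The code below should work.
For a vector:
minvalue <- 0
X[X < minvalue] <- minvalue
For a data frame or matrix:
minvalue <- 0
n <- 10 # change to the number of columns to process
columns <- 1:n
# subset the columns first, then index that subset with the logical matrix
X[, columns][X[, columns] < minvalue] <- minvalue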
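Another fast method uses the pmax and pmin functions. This caps entries between 0 and 1. Put the data first in each call: pmax and pmin copy attributes such as dim from their first argument, so a matrix passed as v comes back as a matrix. For a data frame, it is safest to apply it column by column, e.g. X[] <- lapply(X, ulbound).
ulbound <- function(v, MAX = 1, MIN = 0) pmin(pmax(v, MIN), MAX)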
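Here is a sketch of how a clamping helper of this kind behaves on a matrix and on a data frame. The name `clamp` and the sample values are hypothetical. The data is passed as the first argument to `pmax` and `pmin` because they copy attributes such as `dim` from their first argument.

```r
# Hypothetical helper: cap every entry to the range [lower, upper].
clamp <- function(v, lower = 0, upper = 1) pmin(pmax(v, lower), upper)

# Matrix input: the result keeps its dimensions.
m <- matrix(c(-0.5, 0.2, 1.7, 0.9), nrow = 2)
capped <- clamp(m)
print(capped)

# Data frame input: apply the helper column by column.
df <- data.frame(a = c(-1, 0.5), b = c(2, 0.25))
df[] <- lapply(df, clamp)
print(df)
```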
ifelse should work:
mat <- matrix(runif(100),ncol=5)
mat <- ifelse(mat<0.1,NA,mat)
But I would choose Harlan's answer over mine.
mat[mat < 0.1] <- NA
Just to provide an (in my opinion) interesting alternative:
If you need to clamp the values so they are never smaller than a value, you could use pmax:
set.seed(42)
m <- matrix(rnorm(100),10)
m <- pmax(m, 0) # clamp negative values to 0
...This doesn't quite work in your case though since you want values < 0.1 to become 0.
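To make the difference concrete: `pmax` with a floor of 0.1 raises small values to 0.1 rather than zeroing them. A sketch of a similarly functional alternative uses base R's `replace`, which is an addition here and not part of the answer above:

```r
set.seed(42)
m <- matrix(rnorm(100), 10)

floored <- pmax(m, 0.1)            # values below 0.1 become 0.1, not 0
zeroed  <- replace(m, m < 0.1, 0)  # values below 0.1 become 0

# Both keep the 10 x 10 shape; they differ only where m was below 0.1.
print(dim(floored))
print(dim(zeroed))
```

Like `pmax`, `replace` returns a new object and leaves `m` untouched.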
Further equivalent methods. Let:
M <- matrix(rnorm(10*10), 10, 10)
Brute force (educational):
for (i in seq_len(nrow(M))) {
  for (j in seq_len(ncol(M))) {
    if (!is.na(M[i, j]) && M[i, j] < 0.1) M[i, j] <- NA  # test is.na() first
  }
}
If there are missing values (NA) in M, omitting the !is.na check will give an error, because if() cannot branch on NA.
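For comparison, here is a sketch of the loop next to a vectorised equivalent on a matrix that already contains a missing value. The planted NA and the names `L` and `V` are illustrative only. Wrapping the comparison in `which()` drops the NA results, so existing missing values are simply left as they are.

```r
set.seed(1)
M <- matrix(rnorm(10 * 10), 10, 10)
M[2, 3] <- NA   # plant a missing value

# Loop version, checking is.na() before comparing.
L <- M
for (i in seq_len(nrow(L))) {
  for (j in seq_len(ncol(L))) {
    if (!is.na(L[i, j]) && L[i, j] < 0.1) L[i, j] <- NA
  }
}

# Vectorised version: which() keeps only the positions where the test is TRUE.
V <- M
V[which(V < 0.1)] <- NA

print(identical(L, V))
```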
Another way: using recode in package car:
library(car)
recode(M, "lo:0.099999=NA")
The ranges in recode are inclusive, so a strict inequality can't be specified; that's why there's a bunch of 9s. Put in too many nines and the bound becomes indistinguishable from 0.1. lo is a convenience of recode that stands for the minimum value (ignoring NAs).