Several SQL languages (I mostly use postgreSQL) have a function called coalesce which returns the first non null column element for each row. This can b
I have a ready-to-use implementation called coalesce.na
in my misc package. It seems to be competitive, but not fastest.
It will also work for vectors of different length, and has a special treatment for vectors of length one:
expr min lq median uq max neval
coalesce(aa, bb, cc) 990.060402 1030.708466 1067.000698 1083.301986 1280.734389 10
coalesce1(aa, bb, cc) 11.356584 11.448455 11.804239 12.507659 14.922052 10
coalesce1a(aa, bb, cc) 2.739395 2.786594 2.852942 3.312728 5.529927 10
coalesce2(aa, bb, cc) 2.929364 3.041345 3.593424 3.868032 7.838552 10
coalesce.na(aa, bb, cc) 4.640552 4.691107 4.858385 4.973895 5.676463 10
Here's the code:
coalesce.na <- function(x, ...) {
x.len <- length(x)
ly <- list(...)
for (y in ly) {
y.len <- length(y)
if (y.len == 1) {
x[is.na(x)] <- y
} else {
if (x.len %% y.len != 0)
warning('object length is not a multiple of first object length')
pos <- which(is.na(x))
x[pos] <- y[(pos - 1) %% y.len + 1]
}
}
x
}
Of course, as Kevin pointed out, an Rcpp solution might be faster by orders of magnitude.
Here is my solution:
coalesce <- function(x){
y <- head( x[is.na(x) == F] , 1)
return(y)
}
It returns first vaule which is not NA and it works on data.table
, for example if you want to use coalesce on few columns and these column names are in vector of strings:
column_names <- c("col1", "col2", "col3")
how to use:
ranking[, coalesce_column := coalesce( mget(column_names) ), by = 1:nrow(ranking)]