I have a string of numbers:
n1 = c(1, 1, 0, 6, 0, 0, 10, 10, 11, 12, 0, 0, 19, 23, 0, 0)
I need to replace 0 with the corresponding number righ
Try na.locf() from the package zoo:
library(zoo)
n1 <- c(1, 1, 0, 6, 0, 0, 10, 10, 11, 12, 0, 0, 19, 23, 0, 0)
n1[n1 == 0] <- NA
na.locf(n1)
## [1] 1 1 1 6 6 6 10 10 11 12 12 12 19 23 23 23
This function replaces each NA with the most recent non-NA prior to it. This is why I substituted all 0s with NA before applying it.
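One thing to watch for with the approach above: if the vector starts with zeros, there is no earlier value to carry forward. By default na.locf() has na.rm = TRUE, which drops leading NAs and so returns a shorter vector. A small sketch (the vector x is made up for illustration):

```r
library(zoo)

x <- c(0, 0, 3, 0, 5, 0)   # illustrative input that starts with zeros
x[x == 0] <- NA            # same zero-to-NA step as above

short <- na.locf(x)                 # leading NAs are dropped (na.rm = TRUE)
full  <- na.locf(x, na.rm = FALSE)  # leading NAs are kept, length preserved

length(short)  # 4
full           # NA NA 3 3 5 5
```

If the result has to line up with the original vector (e.g. as a data frame column), na.rm = FALSE is probably what you want.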
Here's a discussion on a similar (yet not identical) issue.
EDIT: If n1 may already contain NAs that should stay NA, try e.g.
n1 <- c(1, 1, 0, 6, 0, 0, 10, NA, 11, 12, 0, 0, 19, NA, 0, 0)
wh_na <- which(is.na(n1))
n1[n1 == 0] <- NA
n2 <- na.locf(n1)
n2[wh_na] <- NA
n2
## [1] 1 1 1 6 6 6 10 NA 11 12 12 12 19 NA 19 19
EDIT2: This approach for c(1,NA,0) returns c(1,NA,1). The other two funs give c(1,NA,NA). In other words, here we're replacing 0 with the last non-missing, non-zero value. Choose your favourite option.
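To make the difference between the two behaviours concrete, here is a plain R loop that can do either. The function name fill_zeros and its na_resets argument are my own, purely for illustration. It is slow compared to the vectorised options, but the logic is easy to follow:

```r
# Hypothetical helper (not from any package): replace zeros with the
# last non-zero value. `na_resets` chooses what an NA in the input does.
fill_zeros <- function(x, na_resets = FALSE) {
  out  <- x
  last <- NA                        # nothing to carry forward yet
  for (i in seq_along(x)) {
    if (is.na(x[i])) {
      if (na_resets) last <- NA     # NA wipes the carried value
    } else if (x[i] == 0) {
      out[i] <- last                # zero takes the carried value
    } else {
      last <- x[i]                  # remember the latest non-zero value
    }
  }
  out
}

fill_zeros(c(1, NA, 0))                    # 1 NA 1
fill_zeros(c(1, NA, 0), na_resets = TRUE)  # 1 NA NA
```

With na_resets = FALSE an NA is skipped over, so a later zero still gets the last non-missing, non-zero value. With na_resets = TRUE the NA itself is carried forward.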
EDIT3: Inspired by @Thell's Rcpp solution, I'd like to add another one - this time using "pure" R/C API.
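library('inline')
sexp0 <- cfunction(signature(x="numeric"), "
   PROTECT(x = Rf_coerceVector(x, INTSXP)); // will not work for factors
   R_len_t n = LENGTH(x);
   SEXP ret;
   PROTECT(ret = Rf_allocVector(INTSXP, n));
   int lval = NA_INTEGER;   // last non-zero value seen so far
   int* xin = INTEGER(x);
   int* rin = INTEGER(ret);
   for (R_len_t i=0; i<n; ++i) {
      if (xin[i] != 0) lval = xin[i];  // an NA also lands here and resets lval
      rin[i] = lval;                   // zeros take the carried value
   }
   UNPROTECT(2);
   return ret;
")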
In this case we will get c(1,NA,NA) for c(1,NA,0). Some benchmarks:
library(microbenchmark)
set.seed(1L)
n1 <- sample(c(0:10), 1e6, TRUE)
microbenchmark(sexp0(n1), rollValue(n1), n1[cummax(seq_along(n1) * (n1 != 0))])
## Unit: milliseconds
##                                   expr       min        lq    median        uq       max neval
##                              sexp0(n1)  2.468588  2.494233  3.198711  4.216908  63.21236   100
##                          rollValue(n1)  8.151000  9.359731 10.603078 12.760594  75.88901   100
##  n1[cummax(seq_along(n1) * (n1 != 0))] 32.899420 36.956711 39.673726 45.419449 106.48180   100
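The base R one-liner in the last row of the benchmark is not explained above, so here is how I read it, split into steps (the names pos and last are mine):

```r
n1 <- c(1, 1, 0, 6, 0, 0, 10, 10, 11, 12, 0, 0, 19, 23, 0, 0)

pos  <- seq_along(n1) * (n1 != 0)  # own position for non-zeros, 0 for zeros
last <- cummax(pos)                # position of the latest non-zero so far
res  <- n1[last]                   # look the values up at those positions

res
## [1] 1 1 1 6 6 6 10 10 11 12 12 12 19 23 23 23

# Caveat: a leading zero yields index 0, which R silently drops
x <- c(0, 5, 0)
x[cummax(seq_along(x) * (x != 0))]  # 5 5 (length 2, not 3)
```

It also does not cope with NAs in the input: cummax() returns NA for every position from the first NA onwards, so everything after that point becomes NA.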