I have a data frame z and I want to create a new column based on the values of two existing columns of z. Following is the process:
Generate a multiplier vector:
tt <- rep(1, max(z$x))
tt[2] <- 2
tt[4] <- 4
tt[7] <- 3
And here is your new column:
> z$t * tt[z$x]
[1] 21 44 23 96 25 26 81 28 29 30
> z$q <- z$t * tt[z$x]
> z
    x  y  t  q
1   1 11 21 21
2   2 12 22 44
3   3 13 23 23
4   4 14 24 96
5   5 15 25 25
6   6 16 26 26
7   7 17 27 81
8   8 18 28 28
9   9 19 29 29
10 10 20 30 30
This will not work if there are negative values in z$x, because R treats negative indices as positions to drop rather than positions to select.
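One way around that limitation is to look the multiplier up with match() instead of indexing by position. This is only a sketch of an alternative, not the approach above: the data frame is reconstructed from the printed output, and keys, mults and multiplier are names made up for the example.

```r
# z reconstructed from the printed output above (an assumption about the original data)
z <- data.frame(x = 1:10, y = 11:20, t = 21:30)

# Keys and their multipliers; any other value gets the default
keys  <- c(2, 4, 7)
mults <- c(2, 4, 3)

# Hypothetical helper: match() gives the position of each value in keys,
# or NA when the value is not a key, so negative values are harmless
multiplier <- function(v, keys, mults, default = 1) {
  pos <- match(v, keys)
  ifelse(is.na(pos), default, mults[pos])
}

z$q <- z$t * multiplier(z$x, keys, mults)
z$q
# 21 44 23 96 25 26 81 28 29 30

# Negative and zero values simply fall back to the default
multiplier(c(-3, 0, 2, 7), keys, mults)
# 1 1 2 3
```

Because the lookup is by value rather than by position, the keys do not have to be positive integers.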
Edited
Here is a generalization of the above, where a function is used to generate the multiplier vector. In fact, we create a function based on parameters.
We want to transform the following values:
2 -> 2
4 -> 4
7 -> 3
Otherwise a default of 1 is taken.
Here is a function which generates the desired function:
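f <- function(default, x, y) {
  x.min <- min(x)
  x.max <- max(x)
  # Lookup table covering x.min..x.max, filled with the default
  y.vals <- rep(default, x.max - x.min + 1)
  # Shift the keys so that x.min lands on position 1
  y.vals[x - x.min + 1] <- y
  function(z) {
    result <- rep(default, length(z))
    # Only values inside the key range are looked up; the rest keep the default
    tmp <- z >= x.min & z <= x.max
    result[tmp] <- y.vals[z[tmp] - x.min + 1]
    result
  }
}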
Here is how we use it:
x <- c(2,4,7)
y <- c(2,4,3)
g <- f(1, x, y)
g is the function that we want. It should be clear that any mapping can be supplied via the x and y parameters to f.
g(z$x)
## [1] 1 2 1 4 1 1 3 1 1 1
g(z$x)*z$t
## [1] 21 44 23 96 25 26 81 28 29 30
It should be clear this only works for integer values.
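To illustrate that the keys do not have to start at 1, here is a small sketch with a mapping that includes a negative key. f is repeated so the snippet runs on its own; h and the particular keys and values are made up for this example.

```r
# Same generator as above, repeated so this snippet is self-contained
f <- function(default, x, y) {
  x.min <- min(x)
  x.max <- max(x)
  y.vals <- rep(default, x.max - x.min + 1)
  y.vals[x - x.min + 1] <- y
  function(z) {
    result <- rep(default, length(z))
    tmp <- z >= x.min & z <= x.max
    result[tmp] <- y.vals[z[tmp] - x.min + 1]
    result
  }
}

# Hypothetical mapping: -2 -> 10, 0 -> 20, 3 -> 30, everything else -> 0
h <- f(0, c(-2, 0, 3), c(10, 20, 30))

# -5 and 8 are outside the key range, -1 is inside it but not a key
h(c(-5, -2, -1, 0, 3, 8))
# 0 10 0 20 30 0
```

The shift by x.min is what keeps the internal index positive, so negative keys are fine as long as they are whole numbers.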
Here's a version of an SQL decode in R for character vectors (untested with factors) that operates just like the SQL version, i.e. it takes an arbitrary number of target/replacement pairs and an optional last argument that acts as a default value (note that the default won't overwrite NAs). I can see it being pretty useful in conjunction with dplyr's mutate operation.
> x <- c("apple","apple","orange","pear","pear",NA)
> decode(x, apple, banana)
[1] "banana" "banana" "orange" "pear" "pear" NA
> decode(x, apple, banana, fruit)
[1] "banana" "banana" "fruit" "fruit" "fruit" NA
> decode(x, apple, banana, pear, passionfruit)
[1] "banana" "banana" "orange" "passionfruit" "passionfruit" NA
> decode(x, apple, banana, pear, passionfruit, fruit)
[1] "banana" "banana" "fruit" "passionfruit" "passionfruit" NA
Here's the code I'm using, with a gist I'll keep up to date here (link).
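decode <- function(x, ...) {
  # Capture the arguments unevaluated, so targets and replacements can be unquoted
  args <- as.character(eval(substitute(alist(...))))
  idx <- seq_along(args)
  # Even positions are replacements, odd positions are targets
  replacements <- args[idx %% 2 == 0]
  targets <- args[idx %% 2 == 1][seq_along(replacements)]
  # An odd number of arguments means the last one is the default
  if (length(args) %% 2 == 1)
    x[!x %in% targets & !is.na(x)] <- tail(args, 1)
  for (i in seq_along(targets))
    x <- ifelse(x == targets[i], replacements[i], x)
  return(x)
}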
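As a rough sketch of the mutate use mentioned above, something like the following should work, assuming dplyr is installed. decode is repeated so the snippet runs on its own; the fruits data frame and the name and kind columns are made up for the example.

```r
library(dplyr)

# Same function as above, repeated so this snippet is self-contained
decode <- function(x, ...) {
  args <- as.character(eval(substitute(alist(...))))
  idx <- seq_along(args)
  replacements <- args[idx %% 2 == 0]
  targets <- args[idx %% 2 == 1][seq_along(replacements)]
  if (length(args) %% 2 == 1)
    x[!x %in% targets & !is.na(x)] <- tail(args, 1)
  for (i in seq_along(targets))
    x <- ifelse(x == targets[i], replacements[i], x)
  return(x)
}

# Hypothetical data; the column is kept as character because decode is untested with factors
fruits <- data.frame(
  name = c("apple", "apple", "orange", "pear", "pear", NA),
  stringsAsFactors = FALSE
)

# apple -> banana, anything else that is not NA -> fruit
recoded <- mutate(fruits, kind = decode(name, apple, banana, fruit))
recoded$kind
```

If the column were a factor, converting it first with as.character(name) would be the cautious choice.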