I want to write a function that can take columns within a data frame or column names and the data frame they come from as arguments.
df <- data.frame(x = c(1:
To sort of piggy-back off of Cettt - something like this may be what you're looking for:
df <- data.frame(x = c(1:5), y = c(6:10), z = LETTERS[1:5])
my_fxn <- function (aaa, bbb, ccc, data) {
  if (!missing(data)) {
    aaa = as.numeric(data[[aaa]])
    bbb = as.numeric(data[[bbb]])
    ccc = as.character(data[[ccc]])
  }
  print(aaa[1])
}
my_fxn("x", "y", "z", df)
#> [1] 1
With enquo() from library(dplyr), we no longer need to pass the column names as character strings:
library(dplyr)
my_fxn <- function (aaa, bbb, ccc, data) {
  aaa <- enquo(aaa)
  bbb <- enquo(bbb)
  ccc <- enquo(ccc)
  if (!missing(data)) {
    aaa = as.numeric(pull(data, !!aaa))
    bbb = as.numeric(pull(data, !!bbb))
    ccc = as.character(pull(data, !!ccc))
  }
  print(aaa[1])
}
my_fxn(x, y, z, df)
#> [1] 1
More info about building functions with enquo() and !! can be found here: https://dplyr.tidyverse.org/articles/programming.html#programming-recipes
Finally, a base R solution using deparse() and substitute():
my_fxn <- function (aaa, bbb, ccc, data) {
  aaa <- deparse(substitute(aaa))
  bbb <- deparse(substitute(bbb))
  ccc <- deparse(substitute(ccc))
  if (!missing(data)) {
    aaa = as.numeric(data[[aaa]])
    bbb = as.numeric(data[[bbb]])
    ccc = as.character(data[[ccc]])
  }
  print(aaa[1])
}
my_fxn(x, y, z, df)
#> [1] 1
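If the goal is for one function to accept either bare column names plus the data frame, or the columns themselves with no data frame, one possible base R sketch is to evaluate each captured argument inside data and fall back to the calling environment. The name my_fxn2, the helper variable caller, and the data = NULL default are my own choices for illustration, not part of the answers above:

```r
df <- data.frame(x = c(1:5), y = c(6:10), z = LETTERS[1:5])

# Hypothetical variant: `data = NULL` makes the data frame optional.
my_fxn2 <- function(aaa, bbb, ccc, data = NULL) {
  # Environment of whoever called my_fxn2; used when a name is not a column.
  caller <- parent.frame()
  # substitute() captures the argument unevaluated; eval() then looks it up
  # in `data` first and in the caller's environment second. When `data` is
  # NULL, the lookup goes straight to the caller's environment.
  aaa <- as.numeric(eval(substitute(aaa), data, caller))
  bbb <- as.numeric(eval(substitute(bbb), data, caller))
  ccc <- as.character(eval(substitute(ccc), data, caller))
  aaa[1]
}

my_fxn2(x, y, z, df)        # bare column names plus the data frame
my_fxn2(df$x, df$y, df$z)   # the columns themselves, no data argument
```

One thing to keep in mind with this approach: columns of data take precedence, so a column named x hides any variable x that exists where the function is called.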
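The problem is that when calling my_fxn(x, y, z, df), the object x is not defined. On top of that, data$aaa does not give you column x: $ takes the name after it literally instead of evaluating it, so it looks for a column that is actually named aaa.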
Consider this small example:
df <- data.frame(x = 1:3, y = 4:6)
x <- "y"
df$x # returns column x
#> [1] 1 2 3
df[, x] # returns column y, since the value stored in x is "y"
#> [1] 4 5 6
To circumvent your problem, you can use data[, aaa] instead of data$aaa.
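To make the difference concrete inside a function, here is a small sketch. The helper names get_col_dollar and get_col_bracket are made up for this illustration:

```r
df <- data.frame(x = 1:3, y = 4:6)

# `$` ignores the value of `col` and looks for a column literally named "col".
get_col_dollar <- function(data, col) data$col

# `[[` uses the value stored in `col`, so a character column name works.
get_col_bracket <- function(data, col) data[[col]]

is.null(get_col_dollar(df, "x"))  # TRUE: df has no column called "col"
get_col_bracket(df, "x")          # the x column: 1 2 3
```

data[[col]] and data[, col] behave the same here; [[ is just the more common choice when you want a single column from a data frame.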
Yet another alternative would be to use the dplyr package, where you can use select(data, aaa).