Question
I'm struggling to use mapply with functions I write that take one or more extra arguments, which I need because I am programming in a larger environment; for example, a function where one of the arguments is the data.
fun_test <- function(data,col,val1,val2){return(data[col][1,] * val1-val2)}
So data and col can for example be constant, but I want to vary the output of my function depending on val1 and val2:
> mapply(FUN=fun_test,mtcars,"cyl",mtcars$cyl,mtcars$cyl*2)
Error in data[col][1, ] : incorrect number of dimensions
I'm trying to understand how mapply works; I surely cannot pass mtcars and "cyl" as vectors, can I?
EDIT: I have an environment in which the data may vary, e.g. sometimes I use mtcars, sometimes another dataset. So I cannot hardcode the data into the function.
EDIT2: 1) I have some dataset, 2) I have different Excel files that I read into R, 3) I make a lookup function that extracts information from these Excel files in R, 4) for one or two variables (from the dataset) at a time, I go into the lookup functions I created and extract information.
So these lookup functions depend on both the data (the variables I need to lookup) and the Excel-files that I use to do the looking up.
Answer 1:
mapply is a multivariate version of sapply. Instead of iterating over just one object (e.g. the columns of a data.frame or the elements of a vector), it iterates over several objects in parallel. Arguments shorter than the longest one are recycled, so a length-one constant such as "cyl" is fine. A data.frame, however, is a list of columns, so mapply iterates over its columns instead of treating it as one constant; this is what causes the error in your call.
Try an easy example (it sums the elements at the same index of the two vectors):
mapply(sum, 1:10, 11:20)
So, in your case, just put the constants directly into the function:
fun_test <- function(val1, val2){return(mtcars['cyl'] * val1 - val2)}
mapply(FUN=fun_test, mtcars$cyl, mtcars$cyl*2)
Update:
Then I think what you need is to include mapply within your function. That way you can add any arguments you like (both constant and varying). It would look like this:
myfunc <- function(data, col, val1, val2) {
  fun_test <- function(val1, val2) {
    data[col] * val1 - val2
  }
  mapply(FUN=fun_test, val1, val2)
}
myfunc(mtcars, 'cyl', mtcars$cyl, mtcars$cyl*2)
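As an alternative to wrapping mapply in another function, base R's mapply has a MoreArgs argument for values that should be passed unchanged to every call. The following is a minimal sketch, assuming the row lookup is written as data[1, col] (equivalent in intent to the question's data[col][1,]); the name res is only illustrative.

```r
# Function from the question, with the lookup written as data[1, col].
fun_test <- function(data, col, val1, val2) {
  data[1, col] * val1 - val2
}

# val1 and val2 are iterated over element by element;
# data and col are handed unchanged to every call through MoreArgs.
res <- mapply(
  FUN = fun_test,
  val1 = mtcars$cyl,
  val2 = mtcars$cyl * 2,
  MoreArgs = list(data = mtcars, col = "cyl")
)

head(res)
```

Because the data frame is supplied at call time, the same function can be reused with another dataset by changing only the MoreArgs list.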
Answer 2:
If you want to pass a data frame as a constant value, wrap it in a list so that it is recycled as a whole; otherwise mapply will pass each column separately.
fun_test <- function(data,col,val1,val2){return(data[1, col] * val1-val2)}
mapply(FUN=fun_test, list(mtcars),"cyl",mtcars$cyl,mtcars$cyl*2)
#[1] 24 24 16 24 32 24 32 16 16 24 24 ......
So the first value, 24, in the output can be reproduced by
mtcars[1, "cyl"] * mtcars$cyl[1] - mtcars$cyl[1]*2
#[1] 24
I know this is an example and the actual implementation is different, but you can get the same output directly by doing
mtcars[1, "cyl"] * mtcars$cyl - mtcars$cyl*2
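A quick way to confirm that the two approaches agree is to compare them with all.equal. This is only a sketch; the names via_mapply and vectorised are introduced here for illustration.

```r
fun_test <- function(data, col, val1, val2) {
  data[1, col] * val1 - val2
}

# Element-by-element evaluation, with the data frame wrapped in a list.
via_mapply <- mapply(FUN = fun_test, list(mtcars), "cyl",
                     mtcars$cyl, mtcars$cyl * 2)

# The same arithmetic written as one vectorised expression.
vectorised <- mtcars[1, "cyl"] * mtcars$cyl - mtcars$cyl * 2

all.equal(via_mapply, vectorised)
```

When the function body consists only of vectorised arithmetic like this, the direct expression avoids calling the function once per element.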
To understand the difference between the two calls, we can debug the function by adding browser() to it:
fun_test <- function(data,col,val1,val2){
browser()
return(data[1, col] * val1-val2)
}
Now call the function and check the argument inside it:
mapply(FUN=fun_test, mtcars,"cyl",mtcars$cyl,mtcars$cyl*2)
Browse[1]> data
# [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2
# 10.4 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4
# 15.8 19.7 15.0 21.4
This is the first column in mtcars, which is mpg (check mtcars$mpg). It is a numeric vector, and you are now trying to take the "cyl" element and then row 1 from it, which gives the same error:
mtcars$mpg["cyl"][1, ]
Error in mtcars$mpg["cyl"][1, ] : incorrect number of dimensions
Now, in the second case, where we pass the data frame as a list, check data:
mapply(FUN=fun_test, list(mtcars),"cyl",mtcars$cyl,mtcars$cyl*2)
Browse[1]> data
#                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
#Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#....
It is the complete data frame, and you can subset from it:
>data[1, "cyl"]
#[1] 6
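The same comparison can be made without stopping in browser(). The sketch below records the class of what each call receives; the names n_cols, n_wrapped, seen_bare and seen_wrapped are illustrative only.

```r
# A data frame is a list of columns, so mapply sees one element per column.
n_cols <- length(mtcars)

# Wrapped in list(), it is a single element that gets recycled for every call.
n_wrapped <- length(list(mtcars))

# Record what `data` is inside each call instead of pausing in browser().
seen_bare <- mapply(function(data) class(data)[1], mtcars)
seen_wrapped <- mapply(function(data) class(data)[1], list(mtcars))

seen_bare[1]   # the first call receives the mpg column
seen_wrapped   # the only call receives the whole data frame
```

This makes the difference visible in a script or a test, where an interactive browser() session is not practical.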
PS - I don't know the context for why this is being done, and I believe there are better ways to handle it.
Source: https://stackoverflow.com/questions/56443892/mapply-with-multiple-arguments-where-one-argument-is-constant-data