Following up on Pass rows of a data frame as arguments to a function in R with column names specifying the arguments:
I want to train the following model with different combinations of parameters.
I had a similar problem and looked in vain until I found the answer in Hadley's Advanced R: there is a solution via purrr::pmap, which passes parameters as they appear in a data frame, taking the column names as argument names. Read here:
https://adv-r.hadley.nz/functionals.html#pmap
It maps each row of parameters onto a function:
This is my own code, which I recently used along with quanteda to explore the Kaggle SMS Spam dataset. These are the possible values for my parameters:
tolower <- data.frame(tolower = c(TRUE, FALSE))
stem <- data.frame(stem = c(TRUE, FALSE))
remove_punct <- data.frame(remove_punct = c(TRUE, FALSE))
This is a bonus and not strictly necessary, but I needed all combinations of my parameters to run a Naive Bayes model. Thanks to Y J via this SO post:
expand.grid.df <- function(...) Reduce(function(...) merge(..., by = NULL), list(...))
parameters <- expand.grid.df(tolower, stem, remove_punct)
So, now my parameters look like this:
> parameters
tolower stem remove_punct
1 TRUE TRUE TRUE
2 FALSE TRUE TRUE
3 TRUE FALSE TRUE
4 FALSE FALSE TRUE
5 TRUE TRUE FALSE
6 FALSE TRUE FALSE
7 TRUE FALSE FALSE
8 FALSE FALSE FALSE
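As an aside, for simple flags like these the same grid can also be built directly with base R's expand.grid(), without the merge-based helper — a minimal sketch:

```r
# All 8 TRUE/FALSE combinations of the three flags, base R only
parameters <- expand.grid(
  tolower      = c(TRUE, FALSE),
  stem         = c(TRUE, FALSE),
  remove_punct = c(TRUE, FALSE)
)
nrow(parameters)  # 8 rows: every combination of the three flags
```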
And now for the magic, passing the parameters on to my function of choice (dfm) via pmap:
mymodels <- pmap(parameters, dfm, x = mycorpus)
(x = mycorpus is an extra constant parameter that I want to pass on to dfm.)
Here's what I got:
> length(mymodels)
[1] 8
> mymodels[[1]]
Document-feature matrix of: 5,572 documents, 7,714 features (99.8% sparse).
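The same pattern works with any function, not just dfm(). Here is a minimal, quanteda-free sketch — the function f and its arguments are made up for illustration:

```r
library(purrr)

# Toy stand-in for dfm(): its argument names match the grid's column names
f <- function(n, scale, offset = 0) n * scale + offset

grid <- data.frame(n = 1:3, scale = c(10, 100, 1000))

# pmap() calls f() once per row, matching columns to arguments by name;
# offset is a constant extra argument, like x = mycorpus above
results <- pmap(grid, f, offset = 1)
length(results)  # one result per row of the grid
```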
Hope this helps you, or anyone else looking into this method!
You can use mapply():
models_list <- mapply(function(x, y, z) xgboost(data = train,
                                                label = df$y,
                                                # parameters
                                                nrounds = x,
                                                subsample = y,
                                                colsample_bytree = z),
                      param$nrounds, param$subsample, param$colsample_bytree,
                      SIMPLIFY = FALSE)
It will give you a list of all your models:
> models_list[[1]]
##### xgb.Booster
raw: 25.2 Kb
call:
xgb.train(params = params, data = dtrain, nrounds = nrounds,
watchlist = watchlist, verbose = verbose, print_every_n = print_every_n,
early_stopping_rounds = early_stopping_rounds, maximize = maximize,
save_period = save_period, save_name = save_name, xgb_model = xgb_model,
callbacks = callbacks, subsample = ..1, colsample_bytree = ..2)
params (as set within xgb.train):
subsample = "0.5", colsample_bytree = "0.8", silent = "1"
xgb.attributes:
niter
callbacks:
cb.print.evaluation(period = print_every_n)
cb.evaluation.log()
cb.save.model(save_period = save_period, save_name = save_name)
niter: 10
evaluation_log:
iter train_rmse
1 0.487354
2 0.473657
---
9 0.419176
10 0.412587
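The mapply() pattern can be demonstrated without xgboost as well — a minimal sketch with a toy fitting function (all names here are illustrative, not from the answer above):

```r
# Toy stand-in for xgboost(): three varying parameters
fit <- function(nrounds, subsample, colsample) {
  list(nrounds = nrounds, subsample = subsample, colsample = colsample)
}

param <- data.frame(
  nrounds   = c(10, 20),
  subsample = c(0.5, 0.8),
  colsample = c(0.8, 1.0)
)

# SIMPLIFY = FALSE keeps the result as a list of model objects,
# one per row of param, instead of simplifying to a matrix
models_list <- mapply(fit, param$nrounds, param$subsample, param$colsample,
                      SIMPLIFY = FALSE)
length(models_list)  # one model per row of param
```

Note that mapply() matches the vectors to the function's arguments positionally, whereas pmap() matches data-frame columns to arguments by name.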