Pass rows of a data frame as parameters to a function while keeping other arguments constant

南笙 2021-01-13 01:29

Following up on "Pass rows of a data frame as arguments to a function in R with column names specifying the arguments":

I want to train the following model with differ

2 answers
  • 2021-01-13 02:31

    I had a similar problem and searched in vain until I found the answer in Hadley Wickham's Advanced R. The approach passes parameters to a function as they appear in a data frame, using the column names as argument names. See:

    https://adv-r.hadley.nz/functionals.html#pmap

    The solution is purrr::pmap, which calls a function once per row of a data frame and matches each column to the argument of the same name.
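
    To make the mechanics concrete, here is a minimal sketch with a toy function. The names fit_toy, grid, rate and depth are made up for illustration and are not part of the original code; only pmap itself comes from purrr.

    ```r
    library(purrr)

    # Hypothetical stand-in for a model-fitting function.
    fit_toy <- function(data, rate, depth) {
      sum(data) * rate + depth
    }

    # One row per parameter combination; column names match argument names.
    grid <- data.frame(rate = c(0.1, 0.5), depth = c(1, 2))

    # `data` is not a column of `grid`, so the same value is passed on every call.
    results <- pmap(grid, fit_toy, data = c(1, 2, 3))
    ```

    The result is a list with one element per row of grid. A column whose name is not an argument of the function would cause an error, so the data frame should contain only the parameters being varied.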

    This is my own code, which I recently used with quanteda to experiment with the Kaggle SMS Spam dataset. These are the possible values for my parameters:

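    library(purrr); library(quanteda)  # for pmap() and dfm(), used below
    # data.frame() replaces the deprecated data_frame()
    tolower <- data.frame(tolower = c(TRUE, FALSE))
    stem <- data.frame(stem = c(TRUE, FALSE))
    remove_punct <- data.frame(remove_punct = c(TRUE, FALSE))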
    

    This step is optional, but I needed every combination of my parameters to run a Naive Bayes model. Thanks to Y J via this SO post:

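    # merge(..., by = NULL) returns the Cartesian product of two data frames;
    # Reduce() applies it pairwise across all inputs.
    expand.grid.df <- function(...) Reduce(function(...) merge(..., by = NULL), list(...))
    parameters <- expand.grid.df(tolower, stem, remove_punct)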
    

    So, now my parameters look like this:

    > parameters
      tolower  stem remove_punct
    1    TRUE  TRUE         TRUE
    2   FALSE  TRUE         TRUE
    3    TRUE FALSE         TRUE
    4   FALSE FALSE         TRUE
    5    TRUE  TRUE        FALSE
    6   FALSE  TRUE        FALSE
    7    TRUE FALSE        FALSE
    8   FALSE FALSE        FALSE
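
    As an aside, when every parameter is a plain vector rather than a data frame, base R's expand.grid builds the same kind of table directly. This is a sketch of an alternative, not the method used above:

    ```r
    # Every combination of the three logical switches, one row per combination.
    parameters <- expand.grid(
      tolower = c(TRUE, FALSE),
      stem = c(TRUE, FALSE),
      remove_punct = c(TRUE, FALSE)
    )
    ```

    The first column varies fastest, which matches the row order printed above.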
    

    And now for the magic, passing the parameters on to my function of choice (dfm) via pmap:

    mymodels <- pmap(parameters, dfm, x = mycorpus)
    

    (x = mycorpus is an extra argument that stays constant across calls; pmap passes it on to dfm unchanged.)
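
    For readers who want to see what such a call amounts to, here is a rough base-R equivalent that loops over the rows explicitly. describe_call is a hypothetical stand-in for dfm that just records its arguments, and the string "corpus" stands in for mycorpus:

    ```r
    # Hypothetical stand-in for dfm(); it only reports the arguments it received.
    describe_call <- function(x, tolower, stem) {
      paste(x, tolower, stem)
    }

    parameters <- data.frame(tolower = c(TRUE, FALSE), stem = c(TRUE, TRUE))

    results <- lapply(seq_len(nrow(parameters)), function(i) {
      # Turn row i into a named list: list(tolower = ..., stem = ...)
      row_args <- as.list(parameters[i, , drop = FALSE])
      # Append the constant argument and call the function with all of them.
      do.call(describe_call, c(row_args, list(x = "corpus")))
    })
    ```

    Each row supplies the varying arguments by name, and the constant argument is appended to every call.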

    Here's what I got:

    > length(mymodels)
    [1] 8
    > mymodels[[1]]
    Document-feature matrix of: 5,572 documents, 7,714 features (99.8% sparse).
    

    Hope this helps you, or anyone else looking into this method!

  • 2021-01-13 02:33

    You can use mapply():

    models_list <- mapply(function(x,y,z) xgboost(data = train,
                                                  label = df$y,
                                                  # parameters
                                                  nrounds = x,
                                                  subsample = y,
                                                  colsample_bytree = z),
                          param$nrounds, param$subsample, param$colsample_bytree, SIMPLIFY = FALSE)
    

    It will give you a list of all your models:

    > models_list[[1]]
    ##### xgb.Booster
    raw: 25.2 Kb 
    call:
      xgb.train(params = params, data = dtrain, nrounds = nrounds, 
        watchlist = watchlist, verbose = verbose, print_every_n = print_every_n, 
        early_stopping_rounds = early_stopping_rounds, maximize = maximize, 
        save_period = save_period, save_name = save_name, xgb_model = xgb_model, 
        callbacks = callbacks, subsample = ..1, colsample_bytree = ..2)
    params (as set within xgb.train):
      subsample = "0.5", colsample_bytree = "0.8", silent = "1"
    xgb.attributes:
      niter
    callbacks:
      cb.print.evaluation(period = print_every_n)
      cb.evaluation.log()
      cb.save.model(save_period = save_period, save_name = save_name)
    niter: 10
    evaluation_log:
        iter train_rmse
           1   0.487354
           2   0.473657
    ---                
           9   0.419176
          10   0.412587
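
    The constant arguments can also be supplied through mapply's MoreArgs parameter instead of being hard-coded in an anonymous function. A minimal sketch with a toy function follows; fit_toy and the values in param are invented for illustration and are not an xgboost call:

    ```r
    # Hypothetical stand-in for a training function.
    fit_toy <- function(data, nrounds, subsample) {
      length(data) * nrounds * subsample
    }

    param <- data.frame(nrounds = c(10, 20), subsample = c(0.5, 1))

    models_list <- mapply(
      fit_toy,
      nrounds = param$nrounds,      # varies per call
      subsample = param$subsample,  # varies per call
      MoreArgs = list(data = 1:4),  # held constant across calls
      SIMPLIFY = FALSE              # always return a list
    )
    ```

    Naming the vectorised arguments keeps the mapping between columns and parameters explicit, so their order no longer matters.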
    