gbm

Understanding tree structure in R gbm package

怎甘沉沦 posted on 2019-12-18 16:59:17
Question: I am having some difficulty understanding how the trees are structured in R's gbm (gradient boosted machine) package, specifically in the output of pretty.gbm.tree. Which features do the indices in SplitVar point to? I trained a GBM on a dataset; here is roughly the top quarter of one of my trees, the result of a call to pretty.gbm.tree:

      SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight  Prediction
    0        9  6.250000e+01        1         2          21      0.6634681   5981 0.005000061
    1       -1 1
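As a rough illustration of how the SplitVar column is usually read: the indices are zero-based positions into the model's predictor list (the var.names element of a fitted gbm object), and -1 marks a terminal node. The sketch below uses base R only; the var_names vector is a hypothetical stand-in for model$var.names, and the two SplitVar values are the ones visible in the excerpt above.

```r
# Hypothetical predictor names; with a real model this would be model$var.names.
var_names <- paste0("x", 1:10)

# SplitVar values from the first two rows of the excerpt above.
split_var <- c(9L, -1L)

# SplitVar is zero-based, R vectors are one-based, so shift by one.
# -1 denotes a terminal node and has no feature, so map it to NA.
idx <- split_var + 1L
idx[split_var < 0L] <- NA
feature <- var_names[idx]

print(data.frame(SplitVar = split_var, feature = feature))
```

Under these assumptions, SplitVar 9 refers to the tenth predictor, not the ninth.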

gbm::interact.gbm vs. dismo::gbm.interactions

孤街浪徒 posted on 2019-12-18 14:33:28
Question: Background: The reference manual for the gbm package states that the interact.gbm function computes Friedman's H-statistic to assess the strength of variable interactions. The H-statistic is on a scale of [0, 1]. The reference manual for the dismo package does not cite any literature for how the gbm.interactions function detects and models interactions. Instead it gives a list
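To make the H-statistic mentioned above more concrete, here is a minimal base-R sketch of the pairwise statistic as defined by Friedman and Popescu: the squared difference between the joint partial dependence and the sum of the two single-variable partial dependences, divided by the squared joint partial dependence, with every partial dependence centred to mean zero. It uses brute-force partial dependence on toy functions, which is an assumption made for illustration and not how interact.gbm computes it internally. The names partial_dep, h_squared, toy, additive and interacting are all hypothetical.

```r
# Brute-force partial dependence of f on the columns in `vars`,
# evaluated at every observed row and centred to mean zero.
partial_dep <- function(f, data, vars) {
  n <- nrow(data)
  pd <- vapply(seq_len(n), function(i) {
    grid <- data
    # Fix the chosen variables at row i's values, average over the rest.
    for (v in vars) grid[[v]] <- rep(data[[v]][i], n)
    mean(f(grid))
  }, numeric(1))
  pd - mean(pd)
}

# Squared H-statistic for the pair (j, k).
h_squared <- function(f, data, j, k) {
  pd_jk <- partial_dep(f, data, c(j, k))
  pd_j  <- partial_dep(f, data, j)
  pd_k  <- partial_dep(f, data, k)
  sum((pd_jk - pd_j - pd_k)^2) / sum(pd_jk^2)
}

set.seed(1)
toy <- data.frame(x1 = runif(50), x2 = runif(50))

additive    <- function(d) d$x1 + d$x2   # no interaction
interacting <- function(d) d$x1 * d$x2   # pure product term

h_add <- h_squared(additive, toy, "x1", "x2")
h_int <- h_squared(interacting, toy, "x1", "x2")
print(c(additive = h_add, interacting = h_int))
```

The additive function should give a value that is zero up to floating-point error, while the product function gives a clearly positive value.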

gbm::interact.gbm vs. dismo::gbm.interactions

有些话、适合烂在心里 posted on 2019-12-18 14:31:34
Question: Background: The reference manual for the gbm package states that the interact.gbm function computes Friedman's H-statistic to assess the strength of variable interactions. The H-statistic is on a scale of [0, 1]. The reference manual for the dismo package does not cite any literature for how the gbm.interactions function detects and models interactions. Instead it gives a list

Extracting Model from GBM in R

你。 posted on 2019-12-12 02:49:17
Question: Is anyone familiar with how to figure out what's going on inside a gbm model in R? Let's say we wanted to predict Petal.Length in iris. To keep it simple I ran:

    tg = gbm(Petal.Length ~ ., data = iris)

This works, and when you run summary(tg) you get:

    Hit <Return> to see next plot:
                          var rel.inf
    Petal.Width   Petal.Width   67.39
    Species           Species   32.61
    Sepal.Length Sepal.Length    0.00
    Sepal.Width   Sepal.Width    0.00

This makes sense intuitively. When you run pretty.gbm.tree(tg) you get:

h2o model not fit in driver node's memory error

余生颓废 posted on 2019-12-10 15:18:08
Question: I ran a GBM model through R code in H2O and got the error below. The same code was running fine a couple of weeks ago. Is this an error on the H2O side or a configuration problem on the user's system?

    water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal argument(s) for GBM model: gbm-2017-04-18-15-29-53. Details: ERRR on field: _ntrees: The tree model will not fit in the driver node's memory (23.2 MB per tree x 1000 > 3.32 GB) - try decreasing ntrees and/or max_depth or increasing min_rows!
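The arithmetic in the error message above can be checked with a short base-R sketch. The three numbers come straight from the message; treating 1 GB as 1024 MB is an assumption, and the exact check H2O performs internally may differ.

```r
# Figures quoted in the error message.
mb_per_tree <- 23.2
ntrees      <- 1000
limit_gb    <- 3.32

# Assumption: 1 GB = 1024 MB.
required_gb <- mb_per_tree * ntrees / 1024

# Largest number of trees of this size that would stay under the limit.
max_trees <- floor(limit_gb * 1024 / mb_per_tree)

cat(sprintf("required: %.2f GB, limit: %.2f GB, trees that fit: %d\n",
            required_gb, limit_gb, as.integer(max_trees)))
```

On these figures the full model needs roughly 22.7 GB, far above the 3.32 GB limit, which is why the message suggests fewer trees, shallower trees (max_depth), or a larger min_rows.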

Inconsistent predictions from predict.gbm() 2.1.4 vs 2.1.3

こ雲淡風輕ζ posted on 2019-12-10 14:45:21
Question: This question is related to my earlier post here. I have tracked down the problem, and it seems to be related to which version of gbm I use. The latest version, 2.1.4, exhibits the problem on my system (R 3.4.4 and also 3.5, both on Ubuntu 18.04), whereas version 2.1.3 works as expected:

    mydata <- structure(list(Count = c(1L, 3L, 1L, 4L, 1L, 0L, 1L, 2L, 0L, 0L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 0L, 2L, 3L, 1L, 4L, 3L, 0L, 4L, 1L, 2L, 1L, 1L, 0L, 2L, 1L, 4L, 1L, 5L, 3L, 0L, 0L, 1L, 1L, 0L, 1L, 0L,

Inconsistent predictions from predict.gbm()

一笑奈何 posted on 2019-12-10 12:46:56
Question: UPDATE: I have tried running the code on https://rdrr.io/snippets/ and it works fine. I therefore suspect a problem with my R installation, but it is extremely worrying that this can happen without errors or warnings. What are the best steps to investigate this? I am running R 3.4.4 on Ubuntu 18.04 with gbm 2.1.4. I am fitting a boosted model to a dataset and have noticed some strange predictions. Here is a minimal working example. Please note that this is just a small sample of the dataset I

H2O - balance classes - cross validation

岁酱吖の posted on 2019-12-10 11:34:18
Question: I would like to build a GBM model with H2O. My data set is imbalanced, so I am using the balance_classes parameter. For grid search (parameter tuning) I would like to use 5-fold cross-validation. I am wondering how H2O deals with class balancing in that case. Will only the training folds be rebalanced? I want to be sure the test fold is not rebalanced. Thank you.

Answer 1: In class imbalance settings, artificially balancing the test/validation set does not make any sense: these sets must remain
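The principle raised in this exchange (rebalance only the training folds, leave the held-out fold at its natural class ratio) can be sketched in base R. This is a generic illustration with made-up data and simple random oversampling; it is not H2O's implementation of balance_classes, and the names y, folds, oversample and cv_sets are hypothetical.

```r
set.seed(42)

# Made-up imbalanced labels: 180 negatives, 20 positives.
y <- factor(rep(c("neg", "pos"), times = c(180, 20)), levels = c("neg", "pos"))

# Stratified 5-fold assignment: fold ids are shuffled within each class,
# so every fold keeps the original class ratio.
folds <- ave(seq_along(y), y,
             FUN = function(i) sample(rep_len(1:5, length(i))))

# Oversample minority classes (with replacement) up to the majority count.
oversample <- function(idx, y) {
  by_class <- split(idx, y[idx])
  target <- max(lengths(by_class))
  unlist(lapply(by_class, function(i) {
    c(i, i[sample.int(length(i), target - length(i), replace = TRUE)])
  }), use.names = FALSE)
}

# For each fold: rebalanced training indices, untouched validation indices.
cv_sets <- lapply(1:5, function(k) {
  list(train = oversample(which(folds != k), y),
       valid = which(folds == k))
})

print(table(y[cv_sets[[1]]$train]))  # balanced training set
print(table(y[cv_sets[[1]]$valid]))  # natural ratio in the held-out fold
```

A model would be fitted on the train indices and scored on the valid indices of each element; the fitting step is omitted here.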

Deploy GBM Model in C++ | Get Predict.gbm to work outside of R

被刻印的时光 ゝ posted on 2019-12-08 04:13:14
Question: Is there a way to export a gbm model to C++? Specifically, how do I invoke the predict.gbm function outside of R in order to score new datasets? I have exported the model as a PMML file, but I am unsure how new datasets will be scored based on the PMML. I am new to R and have spent many hours trying to figure this out to no avail; I would appreciate any leads. Thanks in advance.

Answer 1: Here, PMML only helps you if you have a C++ based PMML evaluation engine available

Deploy GBM Model in C++ | Get Predict.gbm to work outside of R

佐手、 posted on 2019-12-08 00:00:29
Question: Is there a way to export a gbm model to C++? Specifically, how do I invoke the predict.gbm function outside of R in order to score new datasets? I have exported the model as a PMML file, but I am unsure how new datasets will be scored based on the PMML. I am new to R and have spent many hours trying to figure this out to no avail; I would appreciate any leads. Thanks in advance.

Answer 1: Here, PMML only helps you if you have a C++ based PMML evaluation engine available (alternatively, you might use C++ to invoke a Java based PMML evaluation engine such as the JPMML-Evaluator library).
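Whichever engine ends up doing the scoring, the per-tree logic it has to reproduce is small. The base-R sketch below walks a single tree stored in the column layout that pretty.gbm.tree prints, under these assumptions: node ids and SplitVar are zero-based, SplitVar of -1 marks a terminal node, a numeric split sends values below SplitCodePred to LeftNode and all others to RightNode, and missing values go to MissingNode. Categorical splits are not handled, and a full prediction would also add the model's initial value and sum over all trees. The toy_tree table and the score_tree function are hypothetical.

```r
# A hypothetical three-leaf tree in the pretty.gbm.tree column layout.
toy_tree <- data.frame(
  SplitVar      = c(0L, -1L, -1L, -1L),
  SplitCodePred = c(2.5, -0.1, 0.2, 0.0),
  LeftNode      = c(1L, -1L, -1L, -1L),
  RightNode     = c(2L, -1L, -1L, -1L),
  MissingNode   = c(3L, -1L, -1L, -1L),
  Prediction    = c(0.05, -0.1, 0.2, 0.0)
)

# Score one observation `x` (numeric vector ordered like the predictors).
score_tree <- function(tree, x) {
  node <- 0L                              # root; node ids are zero-based
  repeat {
    row <- tree[node + 1L, ]              # R rows are one-based
    if (row$SplitVar == -1L) {
      return(row$Prediction)              # terminal node reached
    }
    value <- x[row$SplitVar + 1L]
    if (is.na(value)) {
      node <- row$MissingNode
    } else if (value < row$SplitCodePred) {
      node <- row$LeftNode
    } else {
      node <- row$RightNode
    }
  }
}

print(score_tree(toy_tree, c(1.0)))
print(score_tree(toy_tree, c(3.0)))
print(score_tree(toy_tree, c(NA_real_)))
```

The same loop translates almost line for line into C++ once the tree tables have been exported, which is one alternative to going through PMML.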