How can I use the XGBoost library (https://github.com/dmlc/xgboost/) from C++? I have found Python and Java APIs, but I can't find an API for C++.
Answer 1:
I ended up using the C API, see below an example:
```cpp
#include <iostream>
#include <xgboost/c_api.h>

int main() {
    // create the training data
    const int cols = 3, rows = 5;
    float train[rows][cols];
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            train[i][j] = (i + 1) * (j + 1);
    float train_labels[rows];
    for (int i = 0; i < rows; i++)
        train_labels[i] = 1 + i * i * i;

    // convert to DMatrix
    DMatrixHandle h_train[1];
    XGDMatrixCreateFromMat((float *) train, rows, cols, -1, &h_train[0]);

    // load the labels
    XGDMatrixSetFloatInfo(h_train[0], "label", train_labels, rows);

    // read back the labels, just a sanity check
    bst_ulong bst_result;
    const float *out_floats;
    XGDMatrixGetFloatInfo(h_train[0], "label", &bst_result, &out_floats);
    for (unsigned int i = 0; i < bst_result; i++)
        std::cout << "label[" << i << "]=" << out_floats[i] << std::endl;

    // create the booster and load some parameters
    BoosterHandle h_booster;
    XGBoosterCreate(h_train, 1, &h_booster);
    XGBoosterSetParam(h_booster, "booster", "gbtree");
    XGBoosterSetParam(h_booster, "objective", "reg:linear");
    XGBoosterSetParam(h_booster, "max_depth", "5");
    XGBoosterSetParam(h_booster, "eta", "0.1");
    XGBoosterSetParam(h_booster, "min_child_weight", "1");
    XGBoosterSetParam(h_booster, "subsample", "0.5");
    XGBoosterSetParam(h_booster, "colsample_bytree", "1");
    XGBoosterSetParam(h_booster, "num_parallel_tree", "1");

    // perform 200 learning iterations
    for (int iter = 0; iter < 200; iter++)
        XGBoosterUpdateOneIter(h_booster, iter, h_train[0]);

    // predict
    const int sample_rows = 5;
    float test[sample_rows][cols];
    for (int i = 0; i < sample_rows; i++)
        for (int j = 0; j < cols; j++)
            test[i][j] = (i + 1) * (j + 1);
    DMatrixHandle h_test;
    XGDMatrixCreateFromMat((float *) test, sample_rows, cols, -1, &h_test);
    bst_ulong out_len;
    const float *f;
    XGBoosterPredict(h_booster, h_test, 0, 0, &out_len, &f);
    for (unsigned int i = 0; i < out_len; i++)
        std::cout << "prediction[" << i << "]=" << f[i] << std::endl;

    // free xgboost internal structures
    XGDMatrixFree(h_train[0]);
    XGDMatrixFree(h_test);
    XGBoosterFree(h_booster);
    return 0;
}
```
Answer 2:
Use XGBoost C API.
```cpp
#include <cassert>
#include <cmath>
#include <iostream>
#include <xgboost/c_api.h>

int main() {
    BoosterHandle booster;
    const char *model_path = "/path/of/model";

    // create the booster handle first
    XGBoosterCreate(NULL, 0, &booster);

    // by default, the seed will be set to 0
    XGBoosterSetParam(booster, "seed", "0");

    // load the model
    XGBoosterLoadModel(booster, model_path);

    const int feat_size = 100;
    const int num_row = 1;
    float feat[num_row][feat_size];

    // create some fake data for predicting
    for (int i = 0; i < num_row; ++i) {
        for (int j = 0; j < feat_size; ++j) {
            feat[i][j] = (i + 1) * (j + 1);
        }
    }

    // convert the 2d array to a DMatrix
    DMatrixHandle dtest;
    XGDMatrixCreateFromMat(reinterpret_cast<float *>(feat),
                           num_row, feat_size, NAN, &dtest);

    // predict
    bst_ulong out_len;
    const float *f;
    XGBoosterPredict(booster, dtest, 0, 0, &out_len, &f);
    assert(out_len == num_row);
    std::cout << f[0] << std::endl;

    // free memory
    XGDMatrixFree(dtest);
    XGBoosterFree(booster);
    return 0;
}
```
Note that when you want to load an existing model (as the code above shows), you have to ensure that the data format used in training is the same as the one used in prediction. So, if you predict with XGBoosterPredict, which accepts a dense matrix as a parameter, you have to use a dense matrix for training as well.
Training with libsvm format and predicting with a dense matrix may produce wrong predictions, as the XGBoost FAQ says:
“Sparse” elements are treated as if they were “missing” by the tree booster, and as zeros by the linear booster. For tree models, it is important to use consistent data formats during training and scoring.
Answer 3:
There is no example that I am aware of. There is a c_api.h file that contains a C/C++ API for the package, and you'll have to find your way using it. I've just done that. It took me a few hours of reading the code and trying a few things out, but eventually I managed to create a working C++ example of XGBoost.
Answer 4:
To solve this problem, we run the xgboost program from our C++ source code.