Creating a Custom Objective Function for XGBoost.XGBRegressor

Question

I am relatively new to ML/AI in Python, and I'm currently working on implementing a custom objective function for XGBoost.

My differential equation knowledge is pretty rusty, so to make sure I am doing all of this correctly I've written a custom objective function whose gradient and hessian model mean squared error, the function that is run as the default objective in XGBRegressor. The problem is that the results of the two models (the error outputs) are close but not identical for the most part, and way off for some points. I don't know what I'm doing wrong, or how that could be possible if I am computing things correctly. If you could look at this and maybe provide insight into where I am wrong, that would be awesome!

The original code without a custom function is:

    import xgboost as xgb

    reg = xgb.XGBRegressor(n_estimators=150,
                           max_depth=2,
                           objective="reg:squarederror",
                           n_jobs=-1)

    reg.fit(X_train, y_train)

    y_pred_test = reg.predict(X_test)

and my custom objective function for MSE is as follows:

    def gradient_se(y_true, y_pred):
        # Compute the gradient of squared error.
        return (-2 * y_true) + (2 * y_pred)

    def hessian_se(y_true, y_pred):
        # Compute the hessian of squared error.
        return 0 * (y_true + y_pred) + 2

    def custom_se(y_true, y_pred):
        # Squared error objective. A simplified version of MSE used as
        # the objective function.
        grad = gradient_se(y_true, y_pred)
        hess = hessian_se(y_true, y_pred)
        return grad, hess
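One way to check the calculus on its own, independently of XGBoost, is to compare the analytic gradient and hessian against finite differences of the loss. The sketch below is only a sanity check under some assumptions: the per-sample loss is taken to be (y_pred - y_true)^2, and `squared_error`, `numeric_derivatives`, and the sample values are hypothetical names and numbers introduced for illustration.

```python
def squared_error(y_true, y_pred):
    """Per-sample squared error, (y_pred - y_true)^2 (assumed loss)."""
    return (y_pred - y_true) ** 2


def gradient_se(y_true, y_pred):
    # Analytic first derivative with respect to y_pred, as in the question.
    return (-2 * y_true) + (2 * y_pred)


def hessian_se(y_true, y_pred):
    # Analytic second derivative with respect to y_pred (a constant).
    return 0 * (y_true + y_pred) + 2


def numeric_derivatives(y_true, y_pred, eps=1e-3):
    """Central finite differences of squared_error with respect to y_pred."""
    up = squared_error(y_true, y_pred + eps)
    mid = squared_error(y_true, y_pred)
    down = squared_error(y_true, y_pred - eps)
    grad = (up - down) / (2 * eps)
    hess = (up - 2 * mid + down) / eps ** 2
    return grad, hess


# Hypothetical (y_true, y_pred) pairs, chosen only for illustration.
samples = [(3.0, 2.5), (0.0, 1.0), (-1.2, 4.0)]
for y_true, y_pred in samples:
    num_grad, num_hess = numeric_derivatives(y_true, y_pred)
    assert abs(num_grad - gradient_se(y_true, y_pred)) < 1e-6
    assert abs(num_hess - hessian_se(y_true, y_pred)) < 1e-4
```

If the assertions pass, the derivatives themselves are consistent with that loss, which narrows the search to how the function is called and how its outputs are used.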

The documentation reference is here.

Thanks!

Answer 1:

According to the documentation, the library passes the predicted values (y_pred in your case) first and the ground truth values (y_true in your case) second.

Your custom_se(y_true, y_pred) function declares them in the reverse order and forwards them that way to both gradient_se and hessian_se. For the hessian it makes no difference, since the hessian of squared error is 2 for every input, and you've implemented that correctly.

For gradient_se, however, the swapped arguments give y_true and y_pred the wrong signs.
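To make the effect of the argument order concrete, here is a small sketch using the functions from the question. The values `label` and `prediction` are hypothetical. It only shows what happens when the positional arguments and the parameter names are crossed; which order a given XGBoost interface actually uses should be confirmed against the documentation for that interface.

```python
def gradient_se(y_true, y_pred):
    # Gradient as written in the question; relies on the parameter names.
    return (-2 * y_true) + (2 * y_pred)


def hessian_se(y_true, y_pred):
    # Constant hessian as written in the question.
    return 0 * (y_true + y_pred) + 2


label, prediction = 3.0, 2.5  # hypothetical single sample

# Arguments arrive as (label, prediction): names match the values.
grad_matching = gradient_se(label, prediction)

# Arguments arrive as (prediction, label): names and values are crossed.
grad_crossed = gradient_se(prediction, label)

hess_matching = hessian_se(label, prediction)
hess_crossed = hessian_se(prediction, label)

print(grad_matching, grad_crossed)  # -1.0 1.0
print(hess_matching, hess_crossed)  # 2.0 2.0
```

The hessian is unaffected, but the gradient flips sign, so each boosting step would push the predictions away from the targets instead of toward them.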

The correct implementation is as follows:

    def gradient_se(y_pred, y_true):
        # Compute the gradient of squared error with respect to y_pred.
        return 2 * (y_pred - y_true)

    def hessian_se(y_pred, y_true):
        # Compute the hessian of squared error: a constant 2,
        # broadcast to the shape of the inputs.
        return 0 * y_true + 2

    def custom_se(y_pred, y_true):
        # Squared error objective. A simplified version of MSE used as
        # the objective function.
        grad = gradient_se(y_pred, y_true)
        hess = hessian_se(y_pred, y_true)
        return grad, hess
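Separately from the argument order, the scale of the gradient and hessian can also matter once regularization is involved. The sketch below uses the regularized leaf weight from the second-order approximation, w = -G / (H + lambda), with lambda = 1. It assumes, as something worth verifying rather than a confirmed fact about your setup, that the built-in objective follows the 0.5 * (y_pred - y_true)^2 convention (gradient y_pred - y_true, hessian 1). `leaf_weight` and `residuals` are hypothetical names and values for illustration.

```python
def leaf_weight(gradients, hessians, reg_lambda=1.0):
    """Leaf weight -G / (H + lambda) from the second-order approximation."""
    return -sum(gradients) / (sum(hessians) + reg_lambda)


# Hypothetical y_pred - y_true values for the samples falling in one leaf.
residuals = [0.5, -1.0, 2.0]

# Convention A: loss = (y_pred - y_true)^2, so grad = 2 * r and hess = 2.
w_full = leaf_weight([2 * r for r in residuals], [2.0] * len(residuals))

# Convention B (assumed built-in): loss = 0.5 * (y_pred - y_true)^2,
# so grad = r and hess = 1.
w_half = leaf_weight(residuals, [1.0] * len(residuals))

# Without regularization the factor of 2 cancels and both conventions agree.
w_full_noreg = leaf_weight([2 * r for r in residuals], [2.0] * 3, reg_lambda=0.0)
w_half_noreg = leaf_weight(residuals, [1.0] * 3, reg_lambda=0.0)

print(w_full, w_half)              # about -0.4286 vs -0.375
print(w_full_noreg, w_half_noreg)  # -0.5 -0.5
```

Under those assumptions, a custom objective that returns 2 * (y_pred - y_true) and 2 would produce trees that are close to, but not identical to, ones built from y_pred - y_true and 1, because lambda is weighed against a hessian sum that is twice as large.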

Source: https://stackoverflow.com/questions/59683944/creating-a-custom-objective-function-in-for-xgboost-xgbregressor

Tags

python

machine-learning

xgboost

gradient-descent

hessian-matrix