I want to update my PySpark code. In PySpark, the base model must be placed in a pipeline, and the official pipeline demo uses LogisticRegression as the base model. However,
There is an XGBoost implementation for Spark 2.4 and later here:
https://xgboost.readthedocs.io
Note that this is an external library, but it should integrate easily with Spark.
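
As a rough sketch of how the base model could be swapped inside a pipeline: the helper names `build_pipeline` and `make_xgboost_classifier` below are hypothetical, and the column names "features" and "label" are assumptions. The XGBoost part assumes a recent xgboost release that ships the `xgboost.spark` module (which, as far as I know, targets Spark 3.x); on Spark 2.4 you would more likely rely on the JVM package XGBoost4J-Spark, which is not shown here.

```python
def build_pipeline(classifier, feature_cols):
    """Return a Pipeline that assembles feature columns and ends with `classifier`."""
    # Imported lazily so the helpers can be defined without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler

    # Combine the raw input columns into the single vector column
    # that Spark ML estimators expect.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")

    # Any Spark ML estimator can sit in the last stage, e.g. LogisticRegression
    # from the official demo or an XGBoost estimator.
    return Pipeline(stages=[assembler, classifier])


def make_xgboost_classifier(label_col="label"):
    """Create an XGBoost estimator to use as the base model (assumed setup)."""
    # Assumption: xgboost with its PySpark integration is installed.
    from xgboost.spark import SparkXGBClassifier

    return SparkXGBClassifier(features_col="features", label_col=label_col)
```

With a running Spark session and a DataFrame `train_df` (hypothetical) holding numeric columns and a "label" column, usage would look something like `model = build_pipeline(make_xgboost_classifier(), ["f1", "f2"]).fit(train_df)`. Passing `LogisticRegression(featuresCol="features", labelCol="label")` instead should reproduce the setup from the official demo.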