Pyspark - Get all parameters of models created with ParamGridBuilder
I'm using PySpark 2.0 for a Kaggle competition. I'd like to know the behavior of a model ( RandomForest ) depending on different parameters. ParamGridBuilder() allows to specify different values for a single parameters, and then perform (I guess) a Cartesian product of the entire set of parameters. Assuming my DataFrame is already defined: rdc = RandomForestClassifier() pipeline = Pipeline(stages=STAGES + [rdc]) paramGrid = ParamGridBuilder().addGrid(rdc.maxDepth, [3, 10, 20]) .addGrid(rdc.minInfoGain, [0.01, 0.001]) .addGrid(rdc.numTrees, [5, 10, 20, 30]) .build() evaluator =