I've fit a Pipeline object with RandomizedSearchCV:
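import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe_sgd = Pipeline([('scl', StandardScaler()),
                     ('clf', SGDClassifier(n_jobs=-1))])
param_dist_sgd = {'clf__loss': ['log'],
                  'clf__penalty': [None, 'l1', 'l2', 'elasticnet'],
                  'clf__alpha': np.linspace(0.15, 0.35),
                  'clf__n_iter': [3, 5, 7]}
sgd_randomized_pipe = RandomizedSearchCV(estimator=pipe_sgd,
                                         param_distributions=param_dist_sgd,
                                         cv=3, n_iter=30, n_jobs=-1)
sgd_randomized_pipe.fit(X_train, y_train)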
I want to access the coef_ attribute of the best_estimator_, but I'm unable to do that. I've tried accessing coef_ with the code below.
sgd_randomized_pipe.best_estimator_.coef_
However, I get the following AttributeError:
AttributeError: 'Pipeline' object has no attribute 'coef_'
The scikit-learn docs say that coef_ is an attribute of SGDClassifier, which is the class of my base estimator.
What am I doing wrong?
You can always reach the individual steps through the names you assigned when building the pipeline, using the named_steps dict.
scaler = sgd_randomized_pipe.best_estimator_.named_steps['scl']
classifier = sgd_randomized_pipe.best_estimator_.named_steps['clf']
You can then access any attribute that the corresponding fitted estimator provides, such as coef_ and intercept_.
This is the formal attribute exposed by the Pipeline as specified in the documentation:
named_steps : dict
Read-only attribute to access any step parameter by user given name. Keys are step names and values are steps parameters.
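To make this concrete, here is a minimal sketch of the named_steps approach end to end. The synthetic data from make_classification, the small search space, and the variable names (search, best_pipe) are my own assumptions for illustration, not the asker's setup; the default hinge loss is used so the snippet does not depend on how a particular scikit-learn version spells the log loss.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (assumption: stands in for X_train, y_train).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

pipe = Pipeline([('scl', StandardScaler()),
                 ('clf', SGDClassifier(random_state=0))])
param_dist = {'clf__penalty': ['l1', 'l2', 'elasticnet'],
              'clf__alpha': np.linspace(0.15, 0.35, 5)}
search = RandomizedSearchCV(pipe, param_distributions=param_dist,
                            cv=3, n_iter=4, random_state=0)
search.fit(X, y)

# best_estimator_ is the refitted Pipeline, not the classifier itself,
# so the fitted steps have to be looked up by the names given above.
best_pipe = search.best_estimator_
scaler = best_pipe.named_steps['scl']
classifier = best_pipe.named_steps['clf']

print(classifier.coef_.shape)   # one row of weights per feature for a binary problem
print(classifier.intercept_)
print(scaler.mean_.shape)
```

The same lookup works for any step, so the scaler's fitted statistics (mean_, scale_) are available in the same way as the classifier's coefficients.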
I've found that one way to do this is by chained indexing with the steps attribute:
sgd_randomized_pipe.best_estimator_.steps[1][1].coef_
Is this best practice, or is there another way?
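For comparison, the sketch below checks that positional indexing through steps and lookup by name return the very same fitted object, so the choice is mostly about readability. The data and the standalone pipeline here are assumptions for illustration; indexing the pipeline directly (pipe[-1]) is also shown, which should be available in reasonably recent scikit-learn releases.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only to get a fitted pipeline to inspect.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
pipe = Pipeline([('scl', StandardScaler()),
                 ('clf', SGDClassifier(random_state=0))]).fit(X, y)

by_position = pipe.steps[1][1]      # steps is a list of (name, estimator) tuples
by_name = pipe.named_steps['clf']   # lookup by the name given at construction
by_index = pipe[-1]                 # direct indexing on the Pipeline (newer versions)

# All three expressions refer to the same fitted SGDClassifier instance.
print(by_position is by_name and by_name is by_index)
print(by_name.coef_.shape)
```

Lookup by name keeps working if a step is later inserted or reordered, whereas steps[1][1] silently starts pointing at a different step.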
I think this should work:
sgd_randomized_pipe.best_estimator_.named_steps['clf'].coef_
Source: https://stackoverflow.com/questions/43856280/return-coefficients-from-pipeline-object-in-sklearn