How can I use a custom feature selection function in scikit-learn's `Pipeline`?

闹比i 2021-01-30 18:47

Let's say that I want to compare different dimensionality reduction approaches for a particular (supervised) dataset that consists of n>2 features via cross-validation and by u

5 answers
  •  醉梦人生
    2021-01-30 19:26

    You can use the following custom transformer to select the columns specified:

    # Custom transformer that extracts the columns passed to its constructor
    from sklearn.base import BaseEstimator, TransformerMixin

    class FeatureSelector(BaseEstimator, TransformerMixin):

        # Store the argument under its own name so get_params() and clone() work
        def __init__(self, feature_names):
            self.feature_names = feature_names

        # Nothing to learn here, so just return self
        def fit(self, X, y=None):
            return self

        # Return only the selected columns (X is expected to be a pandas DataFrame)
        def transform(self, X, y=None):
            return X[self.feature_names]
    

    Here, feature_names is the list of column names you want to select; X is expected to be a pandas DataFrame, so that indexing it with that list returns those columns. For more details, see [1]: https://towardsdatascience.com/custom-transformers-and-ml-data-pipelines-with-python-20ea2a7adb65
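
As a rough sketch of how such a transformer could be plugged into a `Pipeline` and cross-validated: the toy DataFrame, the column names "a", "b", "c", and the choice of `LogisticRegression` below are assumptions made up for illustration, not part of the original answer. The transformer stores its argument as `feature_names` (without a leading underscore) because `cross_val_score` clones the pipeline, and cloning reads constructor parameters back by name.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline


class FeatureSelector(BaseEstimator, TransformerMixin):
    """Keep only the DataFrame columns named in `feature_names`."""

    def __init__(self, feature_names):
        # Same attribute name as the argument, so get_params()/clone() work.
        self.feature_names = feature_names

    def fit(self, X, y=None):
        # Nothing to learn.
        return self

    def transform(self, X):
        # Assumes X is a pandas DataFrame.
        return X[self.feature_names]


# Hypothetical toy data: columns "a" and "b" are kept, "c" is dropped.
X = pd.DataFrame({
    "a": [0.1, 0.2, 0.3, 0.4, 1.1, 1.2, 1.3, 1.4],
    "b": [1.0, 0.9, 1.1, 1.0, 0.1, 0.0, 0.2, 0.1],
    "c": [5, 3, 8, 1, 9, 2, 7, 4],
})
y = [0, 0, 0, 0, 1, 1, 1, 1]

pipe = Pipeline([
    ("select", FeatureSelector(feature_names=["a", "b"])),
    ("clf", LogisticRegression()),
])

# Inspect what the selection step alone produces.
selected = pipe.named_steps["select"].fit_transform(X, y)

# Cross-validate the whole pipeline; selection is re-applied inside each fold.
scores = cross_val_score(pipe, X, y, cv=2)
```

Swapping the "select" step for another transformer (or changing `feature_names`) lets the same cross-validation call compare different feature subsets.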
