sklearn Pipeline: a detailed guide to using sklearn.pipeline and its parameters
Contents
Using sklearn.pipeline, with its parameters explained
class Pipeline(_BaseComposition):
    """
    Pipeline of transforms with a final estimator.

    Sequentially apply a list of transforms and a final estimator.
    Intermediate steps of the pipeline must be 'transforms', that is, they
    must implement fit and transform methods.
    The final estimator only needs to implement fit.
    The transformers in the pipeline can be cached using ``memory`` argument.

    The purpose of the pipeline is to assemble several steps that can be
    cross-validated together while setting different parameters.
    For this, it enables setting parameters of the various steps using their
    names and the parameter name separated by a '__', as in the example below.
    A step's estimator may be replaced entirely by setting the parameter
    with its name to another estimator, or a transformer removed by setting
    it to 'passthrough' or ``None``.

    Read more in the :ref:`User Guide <pipeline>`.

    .. versionadded:: 0.5
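Explanation: a Pipeline is a chain of transforms that ends in a final estimator. The transforms and the final estimator are applied in order. Every intermediate step must be a transformer, meaning it implements both fit and transform; the final estimator only needs to implement fit. The fitted transformers can be cached through the memory argument.
The point of a pipeline is to bundle several steps so that they can be cross-validated together while their parameters are varied. To make this possible, a parameter of any step is addressed as the step name and the parameter name joined by a double underscore "__" (for a step named svc, the parameter C becomes svc__C). Setting the parameter that carries a step's name to another estimator replaces that step entirely, and setting it to 'passthrough' or None removes a transformer from the chain.
See the Pipeline chapter of the scikit-learn User Guide for details. Added in version 0.5.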
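The naming rules described above can be tried out with a minimal sketch. The step names (scaler, pca, svc), the extra PCA step and the chosen parameter values are assumptions made for illustration; they do not come from the docstring.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

X, y = make_classification(random_state=0)

# Illustrative three-step pipeline; the step names are arbitrary labels.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("pca", PCA(n_components=5)),
    ("svc", SVC()),
])

# Set a parameter of one step: <step name>__<parameter name>.
pipe.set_params(svc__C=10.0)
c_value = pipe.get_params()["svc__C"]

# Replace a step's estimator entirely by assigning to the step name.
pipe.set_params(scaler=MinMaxScaler())
scaler_type = type(pipe.named_steps["scaler"]).__name__

# Skip a transformer by setting its step to 'passthrough'.
pipe.set_params(pca="passthrough")
pipe.fit(X, y)
train_score = pipe.score(X, y)
```

After these calls the pipeline scales with MinMaxScaler, skips PCA, and fits an SVC with C=10.0, all without rebuilding the Pipeline object.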
    Parameters
    ----------
    steps : list
        List of (name, transform) tuples (implementing fit/transform) that are
        chained, in the order in which they are chained, with the last object
        an estimator.

    memory : str or object with the joblib.Memory interface, default=None
        Used to cache the fitted transformers of the pipeline. By default,
        no caching is performed. If a string is given, it is the path to
        the caching directory. Enabling caching triggers a clone of
        the transformers before fitting. Therefore, the transformer
        instance given to the pipeline cannot be inspected
        directly. Use the attribute ``named_steps`` or ``steps`` to
        inspect estimators within the pipeline. Caching the
        transformers is advantageous when fitting is time consuming.

    verbose : bool, default=False
        If True, the time elapsed while fitting each step will be printed as it
        is completed.

    Attributes
    ----------
    named_steps : :class:`~sklearn.utils.Bunch`
        Dictionary-like object, with the following attributes.
        Read-only attribute to access any step parameter by user given name.
        Keys are step names and values are steps parameters.

    See Also
    --------
    sklearn.pipeline.make_pipeline : Convenience function for simplified
        pipeline construction.
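Explanation of the parameters:
steps: a list of (name, transform) tuples whose objects implement fit/transform, given in the order in which they are chained. The last object is the estimator.
memory: a str or an object with the joblib.Memory interface, default None. It caches the fitted transformers of the pipeline. No caching happens by default. A string is interpreted as the path of the cache directory. When caching is enabled, each transformer is cloned before it is fitted, so the instance originally passed to the pipeline stays unfitted and cannot be inspected directly; use the named_steps or steps attribute to inspect the estimators inside the pipeline. Caching pays off when fitting the transformers is time consuming.
verbose: bool, default False. If True, the time spent fitting each step is printed as soon as that step completes.
Attributes:
named_steps: a sklearn.utils.Bunch, a dictionary-like object. It is a read-only attribute for accessing any step by the name the user gave it. Keys are the step names and values are the step objects.
See also:
sklearn.pipeline.make_pipeline: a convenience function that simplifies pipeline construction.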
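A short sketch of memory, verbose, named_steps and make_pipeline working together is given below. The temporary cache directory and the variable names are assumptions for illustration only.

```python
from shutil import rmtree
from tempfile import mkdtemp

from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(random_state=0)

# Hypothetical cache location: a throwaway temporary directory.
cache_dir = mkdtemp()
scaler = StandardScaler()
pipe = Pipeline(
    [("scaler", scaler), ("svc", SVC())],
    memory=cache_dir,  # a string is treated as the cache directory path
    verbose=True,      # prints the elapsed time of each step while fitting
)
pipe.fit(X, y)

# With caching enabled the pipeline fits a clone of each transformer,
# so the instance passed in stays unfitted.
original_is_fitted = hasattr(scaler, "mean_")

# The fitted transformer is reachable through named_steps (or steps).
fitted_scaler = pipe.named_steps["scaler"]
fitted_has_mean = hasattr(fitted_scaler, "mean_")

# make_pipeline derives step names from the lowercased class names.
auto = make_pipeline(StandardScaler(), SVC())
auto_names = [name for name, _ in auto.steps]

rmtree(cache_dir)  # remove the temporary cache
```

Inspecting `scaler` directly after fitting is a common mistake once caching is switched on; going through `pipe.named_steps` avoids it.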
    Examples
    --------
    >>> from sklearn.svm import SVC
    >>> from sklearn.preprocessing import StandardScaler
    >>> from sklearn.datasets import make_classification
    >>> from sklearn.model_selection import train_test_split
    >>> from sklearn.pipeline import Pipeline
    >>> X, y = make_classification(random_state=0)
    >>> X_train, X_test, y_train, y_test = train_test_split(X, y,
    ...                                                     random_state=0)
    >>> pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
    >>> # The pipeline can be used as any other estimator
    >>> # and avoids leaking the test set into the train set
    >>> pipe.fit(X_train, y_train)
    Pipeline(steps=[('scaler', StandardScaler()), ('svc', SVC())])
    >>> pipe.score(X_test, y_test)
    0.88
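Because a pipeline behaves like any other estimator, it can also be passed to a cross-validated parameter search, where the scaler is refitted on the training part of every fold. The following is a sketch only; the grid of C values and the use of GridSearchCV are assumptions chosen for illustration and are not part of the docstring.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(random_state=0)
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Keys follow the <step name>__<parameter name> convention.
param_grid = {"svc__C": [0.1, 1.0, 10.0]}  # illustrative values

# Each candidate is evaluated with 5-fold cross-validation; scaling is
# learned from the training folds only, never from the held-out fold.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

best_C = search.best_params_["svc__C"]
best_score = search.best_score_
```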
Source: oschina
Link: https://my.oschina.net/u/4280983/blog/4534583