How to fit different inputs into an sklearn Pipeline?

Submitted by 喜你入骨 on 2019-12-04 05:19:24

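I think you need a FeatureUnion over your transformers (a TfidfTransformer branch and a POSTransformer branch). A FeatureUnion passes the same input to each branch and concatenates their outputs column-wise. You will have to define POSTransformer yourself, since sklearn does not ship one.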
Maybe this article will help you.

Your pipeline might look like this:

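from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

# POSTransformer and MeasureFeatures are custom transformers that you define yourself.
pipeline = Pipeline([
  ('features', FeatureUnion([
    # Branch 1: tf-idf over word n-grams of the raw text
    ('ngram_tf_idf', Pipeline([
      ('counts_ngram', CountVectorizer()),
      ('tf_idf_ngram', TfidfTransformer())
    ])),
    # Branch 2: tf-idf over part-of-speech tags
    ('pos_tf_idf', Pipeline([
      ('pos', POSTransformer()),
      ('counts_pos', CountVectorizer()),
      ('tf_idf_pos', TfidfTransformer())
    ])),
    # Branch 3: hand-crafted numeric measurements
    ('measure_features', MeasureFeatures())
  ])),
  ('classifier', LinearSVC())
])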

This assumes that MeasureFeatures and POSTransformer are transformers that conform to the sklearn API, i.e. they implement fit and transform.
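For illustration, here is a minimal sketch of what the two custom transformers could look like. Everything in it is an assumption rather than part of the original question: the `tagger` parameter, the `toy_tagger` placeholder (a stand-in for a real POS tagger such as `nltk.pos_tag`), and the two measurements chosen for MeasureFeatures (character count and token count) are hypothetical. The only real requirement is that each class implements fit and transform.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


def toy_tagger(tokens):
    """Hypothetical placeholder tagger: returns (token, tag) pairs.

    Swap in a real POS tagger with the same call shape for actual use.
    """
    tagged = []
    for tok in tokens:
        if tok.isdigit():
            tag = "NUM"
        elif tok[0].isupper():
            tag = "CAP"
        else:
            tag = "WORD"
        tagged.append((tok, tag))
    return tagged


class POSTransformer(BaseEstimator, TransformerMixin):
    """Turn each document into a space-separated string of tags.

    The output is still a list of strings, so CountVectorizer can
    consume it in the next pipeline step.
    """

    def __init__(self, tagger=None):
        # tagger: callable mapping a list of tokens to (token, tag) pairs.
        self.tagger = tagger

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn

    def transform(self, X):
        tagger = self.tagger or toy_tagger
        return [" ".join(tag for _, tag in tagger(doc.split())) for doc in X]


class MeasureFeatures(BaseEstimator, TransformerMixin):
    """Return a dense (n_samples, 2) array: character count and token count."""

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn

    def transform(self, X):
        return np.array([[len(doc), len(doc.split())] for doc in X], dtype=float)
```

With these two classes defined, the pipeline above can be trained with pipeline.fit(texts, labels), where texts is a list of raw strings. FeatureUnion stacks the sparse tf-idf matrices and the dense measurement array into a single feature matrix for the classifier.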
