As an example of cross-validation without any preprocessing, I can do something like this:
tuned_params = [{"penalty": ["l2", "l1"]}]
from sklearn
Per the documentation, if you employ Pipeline, this can be done for you. From the docs, just above section 3.1.1.1, emphasis mine:
Just as it is important to test a predictor on data held-out from training, preprocessing (such as standardization, feature selection, etc.) and similar data transformations similarly should be learnt from a training set and applied to held-out data for prediction [...] A Pipeline makes it easier to compose estimators, providing this behavior under cross-validation[.]
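Here is a minimal sketch of what that looks like in practice. The dataset, the StandardScaler step, the LogisticRegression estimator, and the step names "scale" and "clf" are my own assumptions for illustration; they are not taken from the question. The point is only that the search is run over the whole pipeline, so the scaler is refit on the training portion of each fold rather than on the full data set.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Preprocessing and the estimator are bundled into one object, so every
# cross-validation split fits the scaler on that split's training data only.
pipe = Pipeline([
    ("scale", StandardScaler()),
    # liblinear is chosen here because it supports both l1 and l2 penalties.
    ("clf", LogisticRegression(solver="liblinear")),
])

# Parameters of a pipeline step are addressed as <step name>__<parameter>.
tuned_params = [{"clf__penalty": ["l2", "l1"]}]

search = GridSearchCV(pipe, tuned_params, cv=5)
search.fit(X, y)

print(search.best_params_)
```

Note that the grid keys change from "penalty" to "clf__penalty" once the estimator lives inside a pipeline; the prefix has to match whatever name you gave that step.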
More relevant information on pipelines is available in the Pipeline section of the scikit-learn documentation.