How can I fit the test data using min max scaler when I am loading the model?

Posted by China☆狼群 on 2020-02-22 22:43:35

Question


I am building an autoencoder model. Before training and saving the model, I scaled the data with MinMaxScaler:

X_train = df.values
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

After this I fitted the model and saved it as an 'h5' file. Now, when I load the saved model and give it test data, the test data should naturally be scaled as well.

So after loading the model I scale the test data using

X_test_scaled  = scaler.transform(X_test)

It gives the error

NotFittedError: This MinMaxScaler instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

So I used X_test_scaled = scaler.fit_transform(X_test) instead (which I had a hunch was foolish). It did give a result after loading the saved model and testing, but that result was different from the one I got when I trained and tested in the same run. I have saved around 4000 models for my purpose, so I can't train and save them all again because that would take a lot of time. I want a way out.

Is there a way to scale the test data with the same transformation I used for training (maybe by saving the scaling values, I do not know)? Or maybe to descale the model so that I can test it on non-scaled data?

If I under-emphasized or over-emphasized any point, please let me know in the comments!


Answer 1:


X_test_scaled  = scaler.fit_transform(X_test)

will scale X_test given the minimum and maximum values of features in X_test and not X_train.
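
A small illustration of the difference, using made-up one-feature arrays (the values are hypothetical and not from the question). It is only a sketch of the behaviour, not the asker's actual pipeline:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: training values span 0..10, test values span 2..4.
X_train = np.array([[0.0], [5.0], [10.0]])
X_test = np.array([[2.0], [4.0]])

# Fit on the training data only.
scaler = MinMaxScaler()
scaler.fit(X_train)

# Reusing the training min/max keeps test values on the training scale.
correct = scaler.transform(X_test)

# A fresh fit on the test set rescales with the test min/max instead.
wrong = MinMaxScaler().fit_transform(X_test)

print(correct.ravel())  # values relative to the training range 0..10
print(wrong.ravel())    # values stretched to fill 0..1 on their own
```

The same raw test values end up at different positions in the two cases, so a model trained on the first scale sees inputs it was never trained to interpret in the second.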

The reason your original code did not work is that you probably did not save scaler after fitting it to X_train, or you overwrote it somehow (e.g., by re-initializing it). The error was thrown because scaler had not been fitted to any data.

When you then call X_test_scaled = scaler.fit_transform(X_test), you are fitting scaler to X_test and simultaneously transforming X_test, which is why the code was able to run, but this step is incorrect, as you already surmised.

What you want is

import pickle as pkl
from sklearn.preprocessing import MinMaxScaler

X_train = df.values
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Save the fitted scaler so it can be reused at test time
with open("scaler.pkl", "wb") as outfile:
    pkl.dump(scaler, outfile)

# Some other code for training your autoencoder
# ...

Then in your test script

# During test time
import pickle as pkl

# Load the scaler that was fitted on the training data
with open("scaler.pkl", "rb") as infile:
    scaler = pkl.load(infile)

X_test_scaled = scaler.transform(X_test)  # Note: not fit_transform.

Note you don't have to re-fit the scaler object after loading it back from disk. It contains all the information (the per-feature minimum, maximum, and scaling factors) obtained from the training data. You just call its transform method on X_test.
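
If the scaler was never saved, one possible way out that avoids retraining the models is to fit a fresh MinMaxScaler on the same training data, since it only stores the per-feature minimum and maximum of whatever it is fitted on. The sketch below assumes the original training data is still available unchanged. The arrays are hypothetical stand-ins, and the pickle round trip is done in memory only to keep the example self-contained:

```python
import pickle

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the original training data (df.values).
X_train = np.array([[1.0, 100.0], [3.0, 300.0], [5.0, 500.0]])

# The scaler as it existed during training (lost in the question's case).
original = MinMaxScaler().fit(X_train)

# Recovery: fit a new scaler on the same training data.
refit = MinMaxScaler().fit(X_train)

# Both scalers hold identical per-feature minima and maxima.
same_params = (
    np.array_equal(refit.data_min_, original.data_min_)
    and np.array_equal(refit.data_max_, original.data_max_)
)

# Persist and reload; writing to a file with pickle.dump/load behaves the same.
restored = pickle.loads(pickle.dumps(refit))

# At test time: transform only, never fit.
X_test = np.array([[2.0, 200.0]])
X_test_scaled = restored.transform(X_test)
print(same_params, X_test_scaled)
```

This only reproduces the original scaling if the training data is exactly the data the models were trained on; any change in rows or columns would give a different minimum and maximum.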



Source: https://stackoverflow.com/questions/54920707/how-can-i-fit-the-test-data-using-min-max-scaler-when-i-am-loading-the-model
