I used the below code to create k-means clusters using Scikit learn.
kmean = KMeans(n_clusters=nclusters,n_jobs=-1,random_state=2376,max_iter=1000,n_init=100
Yes. Whether the sklearn.cluster.KMeans
object is pickled or not (if you un-pickle it correctly, you'll be dealing with the "same" original object) does not affect that you can use the predict
method to cluster a new observation.
An example:
from sklearn.cluster import KMeans
from sklearn.externals import joblib
model = KMeans(n_clusters = 2, random_state = 100)
X = [[0,0,1,0], [1,0,0,1], [0,0,0,1],[1,1,1,0],[0,0,0,0]]
model.fit(X)
Out:
KMeans(copy_x=True, init='k-means++', max_iter=300, n_clusters=2, n_init=10,
n_jobs=1, precompute_distances='auto', random_state=100, tol=0.0001,
verbose=0)
Continue:
joblib.dump(model, 'model.pkl')
model_loaded = joblib.load('model.pkl')
model_loaded
Out:
KMeans(copy_x=True, init='k-means++', max_iter=300, n_clusters=2, n_init=10,
n_jobs=1, precompute_distances='auto', random_state=100, tol=0.0001,
verbose=0)
See how the n_clusters
and random_state
parameters are the same between the model
and model_new
objects? You're good to go.
Predict with the "new" model:
model_loaded.predict([0,0,0,0])
Out[64]: array([0])