What is the difference between partial fit and warm start?

闹比i 2021-02-01 21:45

Context:

I am using the Passive Aggressive estimator from the scikit-learn library and am confused about whether to use warm_start or partial_fit.

Efforts hitherto

4 answers
  • 2021-02-01 21:57

    First, let us look at the difference between .fit() and .partial_fit().

    .fit() lets you train from scratch. You can therefore think of it as an option that is used only once per model: if you call .fit() again with a new set of data, the model is rebuilt on the new data and the previous dataset has no influence on it (unless warm_start is set, as described below).

    .partial_fit() lets you update the model with incremental data, so it can be called more than once for a model. This is useful when the whole dataset cannot be loaded into memory.

    If .fit() or .partial_fit() is only going to be called once, the choice matters little, since both start from a fresh model, although for some estimators the number of passes over the data can differ.

    warm_start only affects .fit(): it lets training start from the coefficients left by the previous fit(). That may sound similar to the purpose of partial_fit(), but partial_fit() is the recommended way to update a model incrementally. It may help to call partial_fit() on the same incremental data a few times to improve the learning.
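
As a minimal sketch of the incremental pattern described above, assuming the data arrives in mini-batches: the snippet below feeds synthetic, well-separated blobs to PassiveAggressiveClassifier through repeated partial_fit() calls. The data, the batch size, and the variable names are illustrative assumptions and do not come from the original question. A classifier needs the full list of classes on the first partial_fit() call.

```python
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier

rng = np.random.RandomState(0)

# Synthetic data for illustration only: two well-separated Gaussian blobs.
X = np.vstack([
    rng.normal(-5.0, 1.0, size=(100, 2)),
    rng.normal(5.0, 1.0, size=(100, 2)),
])
y = np.array([0] * 100 + [1] * 100)

# Shuffle so that every mini-batch contains a mix of both classes.
order = rng.permutation(len(y))
X, y = X[order], y[order]

clf = PassiveAggressiveClassifier(random_state=0)
classes = np.unique(y)  # required on the first partial_fit() call

batch_size = 20  # hypothetical batch size
for start in range(0, len(y), batch_size):
    X_batch = X[start:start + batch_size]
    y_batch = y[start:start + batch_size]
    # Each call updates the existing coefficients instead of resetting them.
    clf.partial_fit(X_batch, y_batch, classes=classes)

accuracy = clf.score(X, y)
print(accuracy)
```

Because every call continues from the current coefficients, the loop can be run again over the same batches for additional passes, as suggested above.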

  • 2021-02-01 21:59

    I don't know about the Passive Aggressive estimators, but at least with SGDRegressor, partial_fit fits for only 1 epoch, whereas fit runs for multiple epochs (until the loss converges or max_iter is reached). Therefore, when fitting new data to your model, partial_fit only moves the model one step towards the new data, while fit with warm_start=True starts from the previous coefficients and keeps training on the new data until convergence, so the model ends up fully fitted to the new data.

    Example:

    from sklearn.linear_model import SGDRegressor
    import numpy as np
    
    np.random.seed(0)
    X = np.linspace(-1, 1, num=50).reshape(-1, 1)
    Y = (X * 1.5 + 2).reshape(50,)
    
    modelFit = SGDRegressor(learning_rate="adaptive", eta0=0.01, random_state=0, verbose=1,
                         shuffle=True, max_iter=2000, tol=1e-3, warm_start=True)
    modelPartialFit = SGDRegressor(learning_rate="adaptive", eta0=0.01, random_state=0, verbose=1,
                         shuffle=True, max_iter=2000, tol=1e-3, warm_start=False)
    # first fit some data
    modelFit.fit(X, Y)
    modelPartialFit.fit(X, Y)
    # for both: Convergence after 50 epochs, Norm: 1.46, NNZs: 1, Bias: 2.000027, T: 2500, Avg. loss: 0.000237
    print(modelFit.coef_, modelPartialFit.coef_) # for both: [1.46303288]
    
    # now fit new data (zeros)
    newX = X
    newY = 0 * Y
    
    # fits only for 1 epoch, Norm: 1.23, NNZs: 1, Bias: 1.208630, T: 50, Avg. loss: 1.595492:
    modelPartialFit.partial_fit(newX, newY)
    
    # Convergence after 49 epochs, Norm: 0.04, NNZs: 1, Bias: 0.000077, T: 2450, Avg. loss: 0.000313:
    modelFit.fit(newX, newY)
    
    print(modelFit.coef_, modelPartialFit.coef_) # [0.04245779] vs. [1.22919864]
    newX = np.reshape([2], (-1, 1))
    print(modelFit.predict(newX), modelPartialFit.predict(newX)) # [0.08499296] vs. [3.66702685]
    
  • 2021-02-01 22:00

    On the difference: warm_start is just a constructor parameter (an attribute of the class), whereas partial_fit is a method of that class. They are fundamentally different things.

    On the overlap in functionality: yes, partial_fit uses self.coef_, because training still needs existing values to update. If coef_init is empty, self.coef_ is simply filled with zeros before training proceeds.

    In more detail:

    First call: with or without warm_start, training starts from zero coefficients, and the averaged coefficients are saved as the result.

    Call N+1:

    With warm_start: the method _allocate_parameter_mem checks for the previous coefficients and training starts from them. The averaged coefficients are saved as the result.

    Without warm_start: the coefficients are set to zero (as in the first call) before training. The averaged coefficients are still written to memory as the result.
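
The reset logic described above can be sketched with a toy model in plain Python. ToyLinearModel and its helper _init_params are hypothetical names invented for this illustration. This is not scikit-learn's code, and it leaves out coefficient averaging. It only shows when the parameters are reset to zero and when they are carried over.

```python
class ToyLinearModel:
    """Hypothetical 1-D linear model trained by SGD (illustration only)."""

    def __init__(self, warm_start=False, learning_rate=0.1, epochs=5):
        self.warm_start = warm_start
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.coef_ = None
        self.intercept_ = None

    def _init_params(self, reset):
        # Start from zeros on the very first call or when a reset is requested.
        if reset or self.coef_ is None:
            self.coef_ = 0.0
            self.intercept_ = 0.0

    def _run_epochs(self, xs, ys, epochs):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                error = (self.coef_ * x + self.intercept_) - y
                self.coef_ -= self.learning_rate * error * x
                self.intercept_ -= self.learning_rate * error
        return self

    def fit(self, xs, ys):
        # fit() resets the parameters unless warm_start is set.
        self._init_params(reset=not self.warm_start)
        return self._run_epochs(xs, ys, self.epochs)

    def partial_fit(self, xs, ys):
        # partial_fit() never resets and makes a single pass over the data.
        self._init_params(reset=False)
        return self._run_epochs(xs, ys, 1)


def squared_distance_to_truth(params):
    # The data below is generated with coefficient 1.5 and intercept 2.0.
    coef, intercept = params
    return (coef - 1.5) ** 2 + (intercept - 2.0) ** 2


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [1.5 * x + 2.0 for x in xs]

cold = ToyLinearModel(warm_start=False).fit(xs, ys)
first_cold = (cold.coef_, cold.intercept_)
cold.fit(xs, ys)  # restarts from zeros, so it reproduces the same result
second_cold = (cold.coef_, cold.intercept_)

warm = ToyLinearModel(warm_start=True).fit(xs, ys)
first_warm = (warm.coef_, warm.intercept_)
warm.fit(xs, ys)  # continues from the previous parameters
second_warm = (warm.coef_, warm.intercept_)

cold.partial_fit(xs, ys)  # continues as well, regardless of warm_start
after_partial = (cold.coef_, cold.intercept_)

print(first_cold == second_cold)
print(squared_distance_to_truth(second_warm) < squared_distance_to_truth(first_warm))
```

In this sketch the cold model lands on exactly the same parameters after every fit(), while the warm model and any partial_fit() call keep refining what is already there.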

  • 2021-02-01 22:01

    If warm_start=False, each subsequent call to .fit() (after an initial call to .fit() or partial_fit()) re-initialises the model's trainable parameters. If warm_start=True, each subsequent call to .fit() (after an initial call to .fit() or partial_fit()) retains the values of the model's trainable parameters from the previous run and uses them as the starting point. Regardless of the value of warm_start, each call to partial_fit() retains the previous run's model parameters and uses them as the starting point.

    Example using MLPRegressor:

    import sklearn.neural_network
    import numpy as np
    np.random.seed(0)
    x = np.linspace(-1, 1, num=50).reshape(-1, 1)
    y = (x * 1.5 + 2).reshape(50,)
    cold_model = sklearn.neural_network.MLPRegressor(hidden_layer_sizes=(), warm_start=False, max_iter=1)
    warm_model = sklearn.neural_network.MLPRegressor(hidden_layer_sizes=(), warm_start=True, max_iter=1)
    
    cold_model.fit(x, y)
    print(cold_model.coefs_, cold_model.intercepts_)
    # [array([[0.17009494]])] [array([0.74643783])]
    cold_model.fit(x, y)
    print(cold_model.coefs_, cold_model.intercepts_)
    # [array([[-0.60819342]])] [array([-1.21256186])]
    # after the second run of .fit(), the values are completely different
    # because they were re-initialised before the second run for the cold model
    
    warm_model.fit(x, y)
    print(warm_model.coefs_, warm_model.intercepts_)
    # [array([[-1.39815616]])] [array([1.651504])]
    warm_model.fit(x, y)
    print(warm_model.coefs_, warm_model.intercepts_)
    # [array([[-1.39715616]])] [array([1.652504])]
    # this time, with the warm model, the params change relatively little, as they
    # were not re-initialised during the second call to .fit()
    
    cold_model.partial_fit(x, y)
    print(cold_model.coefs_, cold_model.intercepts_)
    # [array([[-0.60719343]])] [array([-1.21156187])]
    cold_model.partial_fit(x, y)
    print(cold_model.coefs_, cold_model.intercepts_)
    # [array([[-0.60619347]])] [array([-1.21056189])]
    # with partial_fit(), the params barely change even for the cold model,
    # as no re-initialisation occurs
    
    warm_model.partial_fit(x, y)
    print(warm_model.coefs_, warm_model.intercepts_)
    # [array([[-1.39615617]])] [array([1.65350392])]
    warm_model.partial_fit(x, y)
    print(warm_model.coefs_, warm_model.intercepts_)
    # [array([[-1.39515619]])] [array([1.65450372])]
    # and of course the same goes for the warm model
    