Python pandas linear regression groupby

青春惊慌失措 2021-01-31 11:48

I am trying to run a linear regression on each group of a grouped pandas DataFrame in Python:

This is the dataframe df:

  group      date      value
    A     01-02-201         


        
3 answers
  •  借酒劲吻你
    2021-01-31 12:36

    This might be a late response, but I am posting the answer anyway in case someone encounters the same problem. Everything that was shown was correct except for the regression block. There are two problems with the implementation:

    • model.fit(X, y) expects X to be {array-like, sparse matrix} of shape (n_samples, n_features), so X must be 2D (y may be 1D or 2D). A Series has no reshape method, so convert the 1D series to a NumPy array first and then call reshape(-1, 1) to make it 2D.

    • The second problem is the fitting call itself: y and X are not arguments to the constructor, as in model = LinearRegression(y, X), but to the fit method, as in `model.fit(X, y)`.
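
As a small illustration of the shape requirement in the first point, the sketch below uses a made-up array (the values are hypothetical, not taken from the question's data) to show what reshape(-1, 1) does to a 1D input:

```python
import numpy as np

# Hypothetical 1D feature values, standing in for a column such as date_delta.
x_1d = np.array([0, 1, 2, 3])

# reshape(-1, 1) keeps the number of rows and produces a single column,
# i.e. the (n_samples, n_features) layout that model.fit expects for X.
x_2d = x_1d.reshape(-1, 1)

print(x_1d.shape)  # (4,)
print(x_2d.shape)  # (4, 1)
```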

    Here is the modification to the regression block:

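    import numpy as np
    from sklearn.linear_model import LinearRegression

    for group in df_group.groups.keys():
          group_df = df_group.get_group(group)
          # A Series has no reshape method, so convert to a NumPy array first.
          X = np.array(group_df[['date_delta']]).reshape(-1, 1)
          y = np.array(group_df.value).reshape(-1, 1)
          model = LinearRegression()  # <--- the constructor does not accept (X, y)
          results = model.fit(X, y)
          # LinearRegression has no summary() method; print the fitted parameters instead.
          print(group, results.coef_, results.intercept_)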
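
For anyone who wants to try the per-group fit end to end, here is a minimal self-contained sketch. The question's original data is truncated, so the DataFrame below is hypothetical toy data, and date_delta is assumed to be the number of days since the first date. It iterates over df.groupby directly rather than using get_group, which is an equivalent way to walk the groups:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: two groups, with date_delta assumed to be
# days elapsed since the first date in each group.
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "date_delta": [0, 1, 2, 0, 1, 2],
    "value": [1.0, 3.0, 5.0, 10.0, 9.0, 8.0],
})

slopes = {}
for name, group_df in df.groupby("group"):
    X = group_df[["date_delta"]].to_numpy()  # 2D, shape (n_samples, 1)
    y = group_df["value"].to_numpy()         # 1D target is accepted by fit
    model = LinearRegression().fit(X, y)
    slopes[name] = model.coef_[0]            # one slope per group
    print(name, model.coef_[0], model.intercept_)
```

With this toy data the fitted slope should come out close to 2 for group A and close to -1 for group B, since each group's values lie exactly on a line.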
    
