Rolling mean in pandas on a specific column

花落未央 2020-11-27 03:19

I have a data frame like this which is imported from a CSV.

              stock  pop
Date
2016-01-04  325.316   82
2016-01-11  320.036   83
2016-01-18  299.1         


        
3 Answers
  • 2020-11-27 03:55

    To assign a column, you can create a rolling object based on your Series:

    df['new_col'] = df['column'].rolling(5).mean()
    

    The answer posted by ac2001 is not the most performant way of doing this. It calculates a rolling mean on every column in the DataFrame and only then selects the "pop" column to assign the result. The first of the following two methods is much more efficient because it rolls over a single column:

    %timeit df['ma'] = df['pop'].rolling(5).mean()
    %timeit df['ma_2'] = df.rolling(5).mean()['pop']
    
    1000 loops, best of 3: 497 µs per loop
    100 loops, best of 3: 2.6 ms per loop
    

    I would not recommend using the second method unless you need to store computed rolling means on all other columns.
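The comparison above can be reproduced outside IPython with the standard `timeit` module. This is only a sketch: the frame below is made up (1,000 random rows, with hypothetical extra columns `a` to `d`), so the absolute timings will differ from the ones quoted and depend on the machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 'pop' plus four extra numeric columns (not from the question).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 5)), columns=["pop", "a", "b", "c", "d"])


def series_first():
    # Select the column first, then roll over that single Series.
    return df["pop"].rolling(5).mean()


def frame_first():
    # Roll over every column, then keep only 'pop'.
    return df.rolling(5).mean()["pop"]


# Both approaches produce the same values (NaN for the first four rows).
same = np.allclose(series_first(), frame_first(), equal_nan=True)

# Time each approach over 50 calls; the numbers are machine dependent.
t_series = timeit.timeit(series_first, number=50)
t_frame = timeit.timeit(frame_first, number=50)
print(f"same result: {same}")
print(f"series first: {t_series:.4f} s, frame first: {t_frame:.4f} s")
```

The more columns the frame has, the more work the second approach does for values that are thrown away.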

  • 2020-11-27 03:56

    This solution worked for me.

    data['MA'] = data.rolling(5).mean()['pop']
    

    I think the issue may be that on='pop' does not select the column to average. It only changes which column the rolling window is computed over, in place of the index.

    From the docstring: "For a DataFrame, column on which to calculate the rolling window, rather than the index."
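To illustrate what the docstring describes, here is a small sketch that uses `on` the way it is usually intended, with a datetime column and a time-based window. The frame reuses the first five rows of the data shown in this thread; the `"14D"` window and the `MA_14D` column name are choices made for this example only.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2016-01-04", "2016-01-11", "2016-01-18", "2016-01-25", "2016-02-01"]
    ),
    "stock": [325.316, 320.036, 299.169, 296.579, 295.334],
    "pop": [82, 83, 79, 84, 82],
})

# on="Date" measures the 14-day window along the Date column instead of the
# index. The mean is still taken over the other numeric columns.
by_date = df.rolling("14D", on="Date").mean()

# To average only 'pop', select that column from the rolling object.
df["MA_14D"] = df.rolling("14D", on="Date")["pop"].mean()
print(df)
```

With weekly dates, each 14-day window covers the current row and the one before it, so `on` decides how rows are grouped into windows, not which column gets averaged.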

  • 2020-11-27 04:12

    Edit: pd.rolling_mean was deprecated in pandas and has since been removed. Instead, call the rolling method on the column:

    df['MA'] = df['pop'].rolling(window=5,center=False).mean()
    

    for a dataframe df:

              Date    stock  pop
    0   2016-01-04  325.316   82
    1   2016-01-11  320.036   83
    2   2016-01-18  299.169   79
    3   2016-01-25  296.579   84
    4   2016-02-01  295.334   82
    5   2016-02-08  309.777   81
    6   2016-02-15  317.397   75
    7   2016-02-22  328.005   80
    8   2016-02-29  315.504   81
    9   2016-03-07  328.802   81
    

    To get:

              Date    stock  pop    MA
    0   2016-01-04  325.316   82   NaN
    1   2016-01-11  320.036   83   NaN
    2   2016-01-18  299.169   79   NaN
    3   2016-01-25  296.579   84   NaN
    4   2016-02-01  295.334   82  82.0
    5   2016-02-08  309.777   81  81.8
    6   2016-02-15  317.397   75  80.2
    7   2016-02-22  328.005   80  80.4
    8   2016-02-29  315.504   81  79.8
    9   2016-03-07  328.802   81  79.6
    

    Documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rolling.html
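For anyone who wants to run the example end to end, the following sketch rebuilds the ten-row frame shown above and computes the same `MA` column. The `MA_partial` column is an extra that is not part of the answer: it assumes you would rather average whatever rows are available so far than get `NaN` for the first four rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2016-01-04", "2016-01-11", "2016-01-18", "2016-01-25", "2016-02-01",
        "2016-02-08", "2016-02-15", "2016-02-22", "2016-02-29", "2016-03-07",
    ]),
    "stock": [325.316, 320.036, 299.169, 296.579, 295.334,
              309.777, 317.397, 328.005, 315.504, 328.802],
    "pop": [82, 83, 79, 84, 82, 81, 75, 80, 81, 81],
})

# 5-row rolling mean of 'pop'; the first four rows lack a full window and are NaN.
df["MA"] = df["pop"].rolling(window=5).mean()

# Optional variant (not in the answer): min_periods=1 averages the rows seen
# so far, so the leading values are partial means rather than NaN.
df["MA_partial"] = df["pop"].rolling(window=5, min_periods=1).mean()

print(df)
```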

    Old: On older pandas versions that still include it, the deprecated function also works:

    df['MA']=pd.rolling_mean(df['pop'], window=5)
    

    to get:

              Date    stock  pop    MA
    0   2016-01-04  325.316   82   NaN
    1   2016-01-11  320.036   83   NaN
    2   2016-01-18  299.169   79   NaN
    3   2016-01-25  296.579   84   NaN
    4   2016-02-01  295.334   82  82.0
    5   2016-02-08  309.777   81  81.8
    6   2016-02-15  317.397   75  80.2
    7   2016-02-22  328.005   80  80.4
    8   2016-02-29  315.504   81  79.8
    9   2016-03-07  328.802   81  79.6
    

    Documentation: http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.rolling_mean.html
