NumPy version of “Exponential weighted moving average”, equivalent to pandas.ewm().mean()

后端 未结 12 709
一生所求
一生所求 2020-11-27 12:30

How do I get the exponential weighted moving average in NumPy just like the following in pandas?

import pandas as pd
import pandas_datareader as pdr
from dat         


        
相关标签:
12条回答
  • 2020-11-27 12:40

    Fastest EWMA 23x pandas

    The question is strictly asking for a numpy solution, however, it seems that the OP was actually just after a pure numpy solution to speed up runtime.

    I solved a similar problem but instead looked towards numba.jit which massively speeds the compute time

    In [24]: a = np.random.random(10**7)
        ...: df = pd.Series(a)
    In [25]: %timeit numpy_ewma(a, 10)               # /a/42915307/4013571
        ...: %timeit df.ewm(span=10).mean()          # pandas
        ...: %timeit numpy_ewma_vectorized_v2(a, 10) # best w/o numba: /a/42926270/4013571
        ...: %timeit _ewma(a, 10)                    # fastest accurate (below)
        ...: %timeit _ewma_infinite_hist(a, 10)      # fastest overall (below)
    4.14 s ± 116 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    991 ms ± 52.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) 
    396 ms ± 8.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    181 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)   
    39.6 ms ± 979 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
    

    Scaling down to smaller arrays of a = np.random.random(100) (results in the same order)

    41.6 µs ± 491 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    945 ms ± 12 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    16 µs ± 93.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    1.66 µs ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
    1.14 µs ± 5.57 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
    

    It is also worth pointing out that my functions below are identically aligned to the pandas (see the examples in docstr), whereas a few of the answers here take various different approximations. For example,

    In [57]: print(pd.DataFrame([1,2,3]).ewm(span=2).mean().values.ravel())
        ...: print(numpy_ewma_vectorized_v2(np.array([1,2,3]), 2))
        ...: print(numpy_ewma(np.array([1,2,3]), 2))
    [1.         1.75       2.61538462]
    [1.         1.66666667 2.55555556]
    [1.         1.18181818 1.51239669]
    

    The source code which I have documented for my own library

    import numpy as np
    from numba import jit
    from numba import float64
    from numba import int64
    
    
    @jit((float64[:], int64), nopython=True, nogil=True)
    def _ewma(arr_in, window):
        r"""Exponentialy weighted moving average specified by a decay ``window``
        to provide better adjustments for small windows via:
    
            y[t] = (x[t] + (1-a)*x[t-1] + (1-a)^2*x[t-2] + ... + (1-a)^n*x[t-n]) /
                   (1 + (1-a) + (1-a)^2 + ... + (1-a)^n).
    
        Parameters
        ----------
        arr_in : np.ndarray, float64
            A single dimenisional numpy array
        window : int64
            The decay window, or 'span'
    
        Returns
        -------
        np.ndarray
            The EWMA vector, same length / shape as ``arr_in``
    
        Examples
        --------
        >>> import pandas as pd
        >>> a = np.arange(5, dtype=float)
        >>> exp = pd.DataFrame(a).ewm(span=10, adjust=True).mean()
        >>> np.array_equal(_ewma_infinite_hist(a, 10), exp.values.ravel())
        True
        """
        n = arr_in.shape[0]
        ewma = np.empty(n, dtype=float64)
        alpha = 2 / float(window + 1)
        w = 1
        ewma_old = arr_in[0]
        ewma[0] = ewma_old
        for i in range(1, n):
            w += (1-alpha)**i
            ewma_old = ewma_old*(1-alpha) + arr_in[i]
            ewma[i] = ewma_old / w
        return ewma
    
    
    @jit((float64[:], int64), nopython=True, nogil=True)
    def _ewma_infinite_hist(arr_in, window):
        r"""Exponentialy weighted moving average specified by a decay ``window``
        assuming infinite history via the recursive form:
    
            (2) (i)  y[0] = x[0]; and
                (ii) y[t] = a*x[t] + (1-a)*y[t-1] for t>0.
    
        This method is less accurate that ``_ewma`` but
        much faster:
    
            In [1]: import numpy as np, bars
               ...: arr = np.random.random(100000)
               ...: %timeit bars._ewma(arr, 10)
               ...: %timeit bars._ewma_infinite_hist(arr, 10)
            3.74 ms ± 60.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
            262 µs ± 1.54 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
        Parameters
        ----------
        arr_in : np.ndarray, float64
            A single dimenisional numpy array
        window : int64
            The decay window, or 'span'
    
        Returns
        -------
        np.ndarray
            The EWMA vector, same length / shape as ``arr_in``
    
        Examples
        --------
        >>> import pandas as pd
        >>> a = np.arange(5, dtype=float)
        >>> exp = pd.DataFrame(a).ewm(span=10, adjust=False).mean()
        >>> np.array_equal(_ewma_infinite_hist(a, 10), exp.values.ravel())
        True
        """
        n = arr_in.shape[0]
        ewma = np.empty(n, dtype=float64)
        alpha = 2 / float(window + 1)
        ewma[0] = arr_in[0]
        for i in range(1, n):
            ewma[i] = arr_in[i] * alpha + ewma[i-1] * (1 - alpha)
        return ewma
    
    0 讨论(0)
  • 2020-11-27 12:41

    Building on top of Divakar's great answer, here is an implementation which corresponds to the adjust=True flag of the pandas function, i.e. using weights rather than recursion.

    def numpy_ewma(data, window):
        alpha = 2 /(window + 1.0)
        scale = 1/(1-alpha)
        n = data.shape[0]
        scale_arr = (1-alpha)**(-1*np.arange(n))
        weights = (1-alpha)**np.arange(n)
        pw0 = (1-alpha)**(n-1)
        mult = data*pw0*scale_arr
        cumsums = mult.cumsum()
        out = cumsums*scale_arr[::-1] / weights.cumsum()
    
        return out
    
    0 讨论(0)
  • 2020-11-27 12:44

    Updated 08/06/2019

    PURE NUMPY, FAST & VECTORIZED SOLUTION FOR LARGE INPUTS

    out parameter for in-place computation, dtype parameter, index order parameter

    This function is equivalent to pandas' ewm(adjust=False).mean(), but much faster. ewm(adjust=True).mean() (the default for pandas) can produce different values at the start of the result. I am working to add the adjust functionality to this solution.

    @Divakar's answer leads to floating point precision problems when the input is too large. This is because (1-alpha)**(n+1) -> 0 when n -> inf and alpha -> 1, leading to divide-by-zero's and NaN values popping up in the calculation.

    Here is my fastest solution with no precision problems, nearly fully vectorized. It's gotten a little complicated but the performance is great, especially for really huge inputs. Without using in-place calculations (which is possible using the out parameter, saving memory allocation time): 3.62 seconds for 100M element input vector, 3.2ms for a 100K element input vector, and 293µs for a 5000 element input vector on a pretty old PC (results will vary with different alpha/row_size values).

    # tested with python3 & numpy 1.15.2
    import numpy as np
    
    def ewma_vectorized_safe(data, alpha, row_size=None, dtype=None, order='C', out=None):
        """
        Reshapes data before calculating EWMA, then iterates once over the rows
        to calculate the offset without precision issues
        :param data: Input data, will be flattened.
        :param alpha: scalar float in range (0,1)
            The alpha parameter for the moving average.
        :param row_size: int, optional
            The row size to use in the computation. High row sizes need higher precision,
            low values will impact performance. The optimal value depends on the
            platform and the alpha being used. Higher alpha values require lower
            row size. Default depends on dtype.
        :param dtype: optional
            Data type used for calculations. Defaults to float64 unless
            data.dtype is float32, then it will use float32.
        :param order: {'C', 'F', 'A'}, optional
            Order to use when flattening the data. Defaults to 'C'.
        :param out: ndarray, or None, optional
            A location into which the result is stored. If provided, it must have
            the same shape as the desired output. If not provided or `None`,
            a freshly-allocated array is returned.
        :return: The flattened result.
        """
        data = np.array(data, copy=False)
    
        if dtype is None:
            if data.dtype == np.float32:
                dtype = np.float32
            else:
                dtype = np.float
        else:
            dtype = np.dtype(dtype)
    
        row_size = int(row_size) if row_size is not None 
                   else get_max_row_size(alpha, dtype)
    
        if data.size <= row_size:
            # The normal function can handle this input, use that
            return ewma_vectorized(data, alpha, dtype=dtype, order=order, out=out)
    
        if data.ndim > 1:
            # flatten input
            data = np.reshape(data, -1, order=order)
    
        if out is None:
            out = np.empty_like(data, dtype=dtype)
        else:
            assert out.shape == data.shape
            assert out.dtype == dtype
    
        row_n = int(data.size // row_size)  # the number of rows to use
        trailing_n = int(data.size % row_size)  # the amount of data leftover
        first_offset = data[0]
    
        if trailing_n > 0:
            # set temporary results to slice view of out parameter
            out_main_view = np.reshape(out[:-trailing_n], (row_n, row_size))
            data_main_view = np.reshape(data[:-trailing_n], (row_n, row_size))
        else:
            out_main_view = out
            data_main_view = data
    
        # get all the scaled cumulative sums with 0 offset
        ewma_vectorized_2d(data_main_view, alpha, axis=1, offset=0, dtype=dtype,
                           order='C', out=out_main_view)
    
        scaling_factors = (1 - alpha) ** np.arange(1, row_size + 1)
        last_scaling_factor = scaling_factors[-1]
    
        # create offset array
        offsets = np.empty(out_main_view.shape[0], dtype=dtype)
        offsets[0] = first_offset
        # iteratively calculate offset for each row
        for i in range(1, out_main_view.shape[0]):
            offsets[i] = offsets[i - 1] * last_scaling_factor + out_main_view[i - 1, -1]
    
        # add the offsets to the result
        out_main_view += offsets[:, np.newaxis] * scaling_factors[np.newaxis, :]
    
        if trailing_n > 0:
            # process trailing data in the 2nd slice of the out parameter
            ewma_vectorized(data[-trailing_n:], alpha, offset=out_main_view[-1, -1],
                            dtype=dtype, order='C', out=out[-trailing_n:])
        return out
    
    def get_max_row_size(alpha, dtype=float):
        assert 0. <= alpha < 1.
        # This will return the maximum row size possible on 
        # your platform for the given dtype. I can find no impact on accuracy
        # at this value on my machine.
        # Might not be the optimal value for speed, which is hard to predict
        # due to numpy's optimizations
        # Use np.finfo(dtype).eps if you  are worried about accuracy
        # and want to be extra safe.
        epsilon = np.finfo(dtype).tiny
        # If this produces an OverflowError, make epsilon larger
        return int(np.log(epsilon)/np.log(1-alpha)) + 1
    

    The 1D ewma function:

    def ewma_vectorized(data, alpha, offset=None, dtype=None, order='C', out=None):
        """
        Calculates the exponential moving average over a vector.
        Will fail for large inputs.
        :param data: Input data
        :param alpha: scalar float in range (0,1)
            The alpha parameter for the moving average.
        :param offset: optional
            The offset for the moving average, scalar. Defaults to data[0].
        :param dtype: optional
            Data type used for calculations. Defaults to float64 unless
            data.dtype is float32, then it will use float32.
        :param order: {'C', 'F', 'A'}, optional
            Order to use when flattening the data. Defaults to 'C'.
        :param out: ndarray, or None, optional
            A location into which the result is stored. If provided, it must have
            the same shape as the input. If not provided or `None`,
            a freshly-allocated array is returned.
        """
        data = np.array(data, copy=False)
    
        if dtype is None:
            if data.dtype == np.float32:
                dtype = np.float32
            else:
                dtype = np.float64
        else:
            dtype = np.dtype(dtype)
    
        if data.ndim > 1:
            # flatten input
            data = data.reshape(-1, order)
    
        if out is None:
            out = np.empty_like(data, dtype=dtype)
        else:
            assert out.shape == data.shape
            assert out.dtype == dtype
    
        if data.size < 1:
            # empty input, return empty array
            return out
    
        if offset is None:
            offset = data[0]
    
        alpha = np.array(alpha, copy=False).astype(dtype, copy=False)
    
        # scaling_factors -> 0 as len(data) gets large
        # this leads to divide-by-zeros below
        scaling_factors = np.power(1. - alpha, np.arange(data.size + 1, dtype=dtype),
                                   dtype=dtype)
        # create cumulative sum array
        np.multiply(data, (alpha * scaling_factors[-2]) / scaling_factors[:-1],
                    dtype=dtype, out=out)
        np.cumsum(out, dtype=dtype, out=out)
    
        # cumsums / scaling
        out /= scaling_factors[-2::-1]
    
        if offset != 0:
            offset = np.array(offset, copy=False).astype(dtype, copy=False)
            # add offsets
            out += offset * scaling_factors[1:]
    
        return out
    

    The 2D ewma function:

    def ewma_vectorized_2d(data, alpha, axis=None, offset=None, dtype=None, order='C', out=None):
        """
        Calculates the exponential moving average over a given axis.
        :param data: Input data, must be 1D or 2D array.
        :param alpha: scalar float in range (0,1)
            The alpha parameter for the moving average.
        :param axis: The axis to apply the moving average on.
            If axis==None, the data is flattened.
        :param offset: optional
            The offset for the moving average. Must be scalar or a
            vector with one element for each row of data. If set to None,
            defaults to the first value of each row.
        :param dtype: optional
            Data type used for calculations. Defaults to float64 unless
            data.dtype is float32, then it will use float32.
        :param order: {'C', 'F', 'A'}, optional
            Order to use when flattening the data. Ignored if axis is not None.
        :param out: ndarray, or None, optional
            A location into which the result is stored. If provided, it must have
            the same shape as the desired output. If not provided or `None`,
            a freshly-allocated array is returned.
        """
        data = np.array(data, copy=False)
    
        assert data.ndim <= 2
    
        if dtype is None:
            if data.dtype == np.float32:
                dtype = np.float32
            else:
                dtype = np.float64
        else:
            dtype = np.dtype(dtype)
    
        if out is None:
            out = np.empty_like(data, dtype=dtype)
        else:
            assert out.shape == data.shape
            assert out.dtype == dtype
    
        if data.size < 1:
            # empty input, return empty array
            return out
    
        if axis is None or data.ndim < 2:
            # use 1D version
            if isinstance(offset, np.ndarray):
                offset = offset[0]
            return ewma_vectorized(data, alpha, offset, dtype=dtype, order=order,
                                   out=out)
    
        assert -data.ndim <= axis < data.ndim
    
        # create reshaped data views
        out_view = out
        if axis < 0:
            axis = data.ndim - int(axis)
    
        if axis == 0:
            # transpose data views so columns are treated as rows
            data = data.T
            out_view = out_view.T
    
        if offset is None:
            # use the first element of each row as the offset
            offset = np.copy(data[:, 0])
        elif np.size(offset) == 1:
            offset = np.reshape(offset, (1,))
    
        alpha = np.array(alpha, copy=False).astype(dtype, copy=False)
    
        # calculate the moving average
        row_size = data.shape[1]
        row_n = data.shape[0]
        scaling_factors = np.power(1. - alpha, np.arange(row_size + 1, dtype=dtype),
                                   dtype=dtype)
        # create a scaled cumulative sum array
        np.multiply(
            data,
            np.multiply(alpha * scaling_factors[-2], np.ones((row_n, 1), dtype=dtype),
                        dtype=dtype)
            / scaling_factors[np.newaxis, :-1],
            dtype=dtype, out=out_view
        )
        np.cumsum(out_view, axis=1, dtype=dtype, out=out_view)
        out_view /= scaling_factors[np.newaxis, -2::-1]
    
        if not (np.size(offset) == 1 and offset == 0):
            offset = offset.astype(dtype, copy=False)
            # add the offsets to the scaled cumulative sums
            out_view += offset[:, np.newaxis] * scaling_factors[np.newaxis, 1:]
    
        return out
    

    usage:

    data_n = 100000000
    data = ((0.5*np.random.randn(data_n)+0.5) % 1) * 100
    
    span = 5000  # span >= 1
    alpha = 2/(span+1)  # for pandas` span parameter
    
    # com = 1000  # com >= 0
    # alpha = 1/(1+com)  # for pandas` center-of-mass parameter
    
    # halflife = 100  # halflife > 0
    # alpha = 1 - np.exp(np.log(0.5)/halflife)  # for pandas` half-life parameter
    
    result = ewma_vectorized_safe(data, alpha)
    

    Just a tip

    It is easy to calculate a 'window size' (technically exponential averages have infinite 'windows') for a given alpha, dependent on the contribution of the data in that window to the average. This is useful for example to chose how much of the start of the result to treat as unreliable due to border effects.

    def window_size(alpha, sum_proportion):
        # Increases with increased sum_proportion and decreased alpha
        # solve (1-alpha)**window_size = (1-sum_proportion) for window_size        
        return int(np.log(1-sum_proportion) / np.log(1-alpha))
    
    alpha = 0.02
    sum_proportion = .99  # window covers 99% of contribution to the moving average
    window = window_size(alpha, sum_proportion)  # = 227
    sum_proportion = .75  # window covers 75% of contribution to the moving average
    window = window_size(alpha, sum_proportion)  # = 68
    

    The alpha = 2 / (window_size + 1.0) relation used in this thread (the 'span' option from pandas) is a very rough approximation of the inverse of the above function (with sum_proportion~=0.87). alpha = 1 - np.exp(np.log(1-sum_proportion)/window_size) is more accurate (the 'half-life' option from pandas equals this formula with sum_proportion=0.5).

    In the following example, data represents a continuous noisy signal. cutoff_idx is the first position in result where at least 99% of the value is dependent on separate values in data (i.e. less than 1% depends on data[0]). The data up to cutoff_idx is excluded from the final results because it is too dependent on the first value in data, therefore possibly skewing the average.

    result = ewma_vectorized_safe(data, alpha, chunk_size)
    sum_proportion = .99
    cutoff_idx = window_size(alpha, sum_proportion)
    result = result[cutoff_idx:]
    

    To illustrate the problem the above solve you can run this a few times, notice the often-appearing false start of the red line, which is skipped after cutoff_idx:

    data_n = 100000
    data = np.random.rand(data_n) * 100
    window = 1000
    sum_proportion = .99
    alpha = 1 - np.exp(np.log(1-sum_proportion)/window)
    
    result = ewma_vectorized_safe(data, alpha)
    
    cutoff_idx = window_size(alpha, sum_proportion)
    x = np.arange(start=0, stop=result.size)
    
    import matplotlib.pyplot as plt
    plt.plot(x[:cutoff_idx+1], result[:cutoff_idx+1], '-r',
             x[cutoff_idx:], result[cutoff_idx:], '-b')
    plt.show()
    

    note that cutoff_idx==window because alpha was set with the inverse of the window_size() function, with the same sum_proportion. This is similar to how pandas applies ewm(span=window, min_periods=window).

    0 讨论(0)
  • 2020-11-27 12:44

    Here's my implementation for 1D input arrays with infinite window size. As it uses large numbers, it works only with input arrays with elements of absolute value < 1e16, when using float32, but that should normally be the case.

    The idea is to reshape the input array into slices of a limited length, so that no overflow occurs, and then doing the ewm calculation with each slice separately.

    def ewm(x, alpha):
        """
        Returns the exponentially weighted mean y of a numpy array x with scaling factor alpha
        y[0] = x[0]
        y[j] = (1. - alpha) * y[j-1] + alpha * x[j],  for j > 0
    
        x -- 1D numpy array
        alpha -- float
        """
    
        n = int(-100. / np.log(1.-alpha)) # Makes sure that the first and last elements in f are very big and very small (about 1e22 and 1e-22)
        f = np.exp(np.arange(1-n, n, 2) * (0.5 * np.log(1. - alpha))) # Scaling factor for each slice
        tmp = (np.resize(x, ((len(x) + n - 1) // n, n)) / f * alpha).cumsum(axis=1) * f # Get ewm for each slice of length n
    
        # Add the last value of each previous slice to the next slice with corresponding scaling factor f and return result
        return np.resize(tmp + np.tensordot(np.append(x[0], np.roll(tmp.T[n-1], 1)[1:]), f * ((1. - alpha) / f[0]), axes=0), len(x))
    
    0 讨论(0)
  • 2020-11-27 12:45

    I think I have finally cracked it!

    Here's a vectorized version of numpy_ewma function that's claimed to be producing the correct results from @RaduS's post -

    def numpy_ewma_vectorized(data, window):
    
        alpha = 2 /(window + 1.0)
        alpha_rev = 1-alpha
    
        scale = 1/alpha_rev
        n = data.shape[0]
    
        r = np.arange(n)
        scale_arr = scale**r
        offset = data[0]*alpha_rev**(r+1)
        pw0 = alpha*alpha_rev**(n-1)
    
        mult = data*pw0*scale_arr
        cumsums = mult.cumsum()
        out = offset + cumsums*scale_arr[::-1]
        return out
    

    Further boost

    We can boost it further with some code re-use, like so -

    def numpy_ewma_vectorized_v2(data, window):
    
        alpha = 2 /(window + 1.0)
        alpha_rev = 1-alpha
        n = data.shape[0]
    
        pows = alpha_rev**(np.arange(n+1))
    
        scale_arr = 1/pows[:-1]
        offset = data[0]*pows[1:]
        pw0 = alpha*alpha_rev**(n-1)
    
        mult = data*pw0*scale_arr
        cumsums = mult.cumsum()
        out = offset + cumsums*scale_arr[::-1]
        return out
    

    Runtime test

    Let's time these two against the same loopy function for a big dataset.

    In [97]: data = np.random.randint(2,9,(5000))
        ...: window = 20
        ...:
    
    In [98]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized(data, window))
    Out[98]: True
    
    In [99]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized_v2(data, window))
    Out[99]: True
    
    In [100]: %timeit numpy_ewma(data, window)
    100 loops, best of 3: 6.03 ms per loop
    
    In [101]: %timeit numpy_ewma_vectorized(data, window)
    1000 loops, best of 3: 665 µs per loop
    
    In [102]: %timeit numpy_ewma_vectorized_v2(data, window)
    1000 loops, best of 3: 357 µs per loop
    
    In [103]: 6030/357.0
    Out[103]: 16.89075630252101
    

    There is around a 17 times speedup!

    0 讨论(0)
  • 2020-11-27 12:46

    Thanks to @Divakar's solution and that is really fast. However, it does cause overflow problem which was pointed out by @Danny. The function doesn't return correct answers when the length is greater than 13835 or so at my end.

    The following is my solution based on Divakar's solution and pandas.ewm().mean()

    def numpy_ema(data, com=None, span=None, halflife=None, alpha=None):
    """Summary
    Calculate ema with automatically-generated alpha. Weight of past effect
    decreases as the length of window increasing.
    
    # these functions reproduce the pandas result when the flag adjust=False is set.
    References:
    https://stackoverflow.com/questions/42869495/numpy-version-of-exponential-weighted-moving-average-equivalent-to-pandas-ewm
    
    Args:
        data (TYPE): Description
        com (float, optional): Specify decay in terms of center of mass, alpha=1/(1+com), for com>=0
        span (float, optional): Specify decay in terms of span, alpha=2/(span+1), for span>=1
        halflife (float, optional): Specify decay in terms of half-life, alpha=1-exp(log(0.5)/halflife), for halflife>0
        alpha (float, optional): Specify smoothing factor alpha directly, 0<alpha<=1
    
    Returns:
        TYPE: Description
    
    Raises:
        ValueError: Description
    """
    n_input = sum(map(bool, [com, span, halflife, alpha]))
    if n_input != 1:
        raise ValueError(
            'com, span, halflife, and alpha are mutually exclusive')
    
    nrow = data.shape[0]
    if np.isnan(data).any() or (nrow > 13835) or (data.ndim == 2):
        df = pd.DataFrame(data)
        df_ewm = df.ewm(com=com, span=span, halflife=halflife,
                        alpha=alpha, adjust=False)
        out = df_ewm.mean().values.squeeze()
    else:
        if com:
            alpha = 1 / (1 + com)
        elif span:
            alpha = 2 / (span + 1.0)
        elif halflife:
            alpha = 1 - np.exp(np.log(0.5) / halflife)
    
        alpha_rev = 1 - alpha
        pows = alpha_rev**(np.arange(nrow + 1))
    
        scale_arr = 1 / pows[:-1]
        offset = data[0] * pows[1:]
        pw0 = alpha * alpha_rev**(nrow - 1)
    
        mult = data * pw0 * scale_arr
    
        cumsums = np.cumsum(mult)
        out = offset + cumsums * scale_arr[::-1]
    return out
    
    0 讨论(0)
提交回复
热议问题