pd.rolling_mean becoming deprecated - alternatives for ndarrays

Solution 1

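EDIT: Unfortunately, it looks like the new way is not nearly as fast. On pandas 0.18.0 both the deprecated function and the new method take roughly 230 µs per call, versus about 12 µs for pd.rolling_mean on 0.17.1: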

New version of Pandas:

In [1]: x = np.random.uniform(size=100)

In [2]: %timeit pd.rolling_mean(x, window=2)
1000 loops, best of 3: 240 µs per loop

In [3]: %timeit pd.Series(x).rolling(window=2).mean()
1000 loops, best of 3: 226 µs per loop

In [4]: pd.__version__
Out[4]: '0.18.0'

Old version:

In [1]: x = np.random.uniform(size=100)

In [2]: %timeit pd.rolling_mean(x,window=2)
100000 loops, best of 3: 12.4 µs per loop

In [3]: pd.__version__
Out[3]: u'0.17.1'
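
If per-call overhead like that measured above matters for small arrays, one option that avoids constructing a Series is plain NumPy. The following is a minimal sketch and is not taken from the answers here: `rolling_mean_numpy` is a hypothetical helper name, it handles 1-D input only, and it pads the first `window - 1` positions with NaN to mimic the pandas output. It has not been benchmarked against either pandas version.

```python
import numpy as np


def rolling_mean_numpy(x, window):
    """Trailing rolling mean of a 1-D array (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)  # leading positions stay NaN
    if 1 <= window <= x.size:
        # A uniform kernel in "valid" mode yields one mean per full window.
        kernel = np.ones(window) / window
        out[window - 1:] = np.convolve(x, kernel, mode="valid")
    return out


x = np.array([1.0, 2.0, 3.0, 4.0])
result = rolling_mean_numpy(x, window=2)
```

Here `result` holds NaN in the first position followed by the means of each adjacent pair.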

Solution 2

It looks like the new way is to call methods on the object returned by .rolling(), which is available on Series and DataFrame (I guess you're meant to think of it sort of like a groupby): http://pandas.pydata.org/pandas-docs/version/0.18.0/whatsnew.html

e.g.

x.rolling(window=2).mean()
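
Since the question is about ndarrays, the array has to be wrapped in a pandas object before `.rolling()` can be called. A minimal sketch, assuming a 1-D float array and a pandas version recent enough to provide `Series.to_numpy()`; the variable names are illustrative only:

```python
import numpy as np
import pandas as pd

x = np.array([1.0, 2.0, 3.0, 4.0])

# Wrap the ndarray, compute the rolling mean, then unwrap again.
rolled = pd.Series(x).rolling(window=2).mean()
rolled_array = rolled.to_numpy()
```

The first element of `rolled_array` is NaN because the first window is incomplete.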

Solution 3

Try this (center=False is the default, so it is spelled out only to mirror the old call):

x.rolling(window=2, center=False).mean()
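
To illustrate what the `center` argument changes, here is a small sketch on made-up data with a window of 3, so that the difference between trailing and centered windows is visible:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Trailing window: each value is the mean of the current and two previous points.
trailing = s.rolling(window=3, center=False).mean()

# Omitting center gives the same result, since False is the default.
default = s.rolling(window=3).mean()

# Centered window: each value is the mean of the previous, current and next point.
centered = s.rolling(window=3, center=True).mean()
```

With `center=False` the NaN padding appears at the start of the result; with `center=True` it is split between both ends.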
Author by

saladi

Updated on June 07, 2022

Comments

  • saladi
    saladi almost 2 years

    EDIT: This question was asked in 2016 and similar questions have been posted on SO years later after the functionality was finally removed, e.g. module 'pandas' has no attribute 'rolling_mean'

    However, the question concerns the performance of the new pd.Series.rolling().mean() and should stay open until the associated pandas issue is fixed.


    It looks like pd.rolling_mean is becoming deprecated for ndarrays,

     pd.rolling_mean(x, window=2, center=False)
    

    FutureWarning: pd.rolling_mean is deprecated for ndarrays and will be removed in a future version

    but it seems to be the fastest way of doing this, according to this SO answer.

    Are there now new ways of doing this directly with SciPy or NumPy that are as fast as pd.rolling_mean?

  • saladi
    saladi about 8 years
    Yeah, I realized that. Should've included this in the question. In any case, it turns out it's just as fast even though it requires explicitly turning x into a pd.Series first (See my answer with details).
  • saladi
    saladi about 8 years
    good point and it looks like you're right. See my edit. I'm going to open the question up again to see if anyone else has a solution here that retains the older speed.
  • maxymoo
    maxymoo about 8 years
    dang yeah that sucks !
  • Jeff
    Jeff about 8 years
    See here: this should only add a tiny bit of function call overhead, but it makes an unnecessary copy of the internal blocks. Easy fix: github.com/pydata/pandas/issues/12732
  • Merlin
    Merlin almost 8 years
    This is horrible syntax... we went from simple and terse, to something verbose and unpythonic.
  • Contango
    Contango almost 7 years
    I almost agree - but the new syntax means that we can apply any function to that window, not just the precanned ones.
  • Prvt_Yadav
    Prvt_Yadav about 5 years
    How can I use pd.Series(x) for a 3D array? Here x is a 3D NumPy array.
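
A Series is one-dimensional, so a 3D array cannot be wrapped directly. One possible approach, sketched here under the assumption that the rolling mean is wanted along a single axis and that NumPy 1.20 or later is available, is `numpy.lib.stride_tricks.sliding_window_view`. The helper name `rolling_mean_along_axis` is hypothetical, and unlike pandas it returns only the full windows, without NaN padding.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_mean_along_axis(x, window, axis=0):
    """Rolling mean over full windows along one axis (hypothetical helper)."""
    # The window dimension is appended as the last axis of the view.
    windows = sliding_window_view(x, window_shape=window, axis=axis)
    return windows.mean(axis=-1)


x = np.arange(5 * 2 * 2, dtype=float).reshape(5, 2, 2)
result = rolling_mean_along_axis(x, window=2, axis=0)
```

The result is shorter than the input by `window - 1` entries along the chosen axis.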