Pandas series sort by month index

情歌与酒 2020-12-03 16:17



        
3 Answers
  • 2020-12-03 16:57

    You can use an ordered CategoricalIndex with sort_index:

    cats = ['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec']
    df.index = pd.CategoricalIndex(df.index, categories=cats, ordered=True)
    df = df.sort_index()
    
    print(df)
         date
    Jan     2
    Feb     1
    Apr     1
    May     1
    Jun     1
    Jul     1
    Aug     2
    Sep     2
    Oct    14
    Nov    36
    Dec    47
    

    Or use DataFrame.reindex, but note that it adds a NaN row for every month that is missing from the index:

    df = df.reindex(cats)
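
As a rough illustration of how the two approaches differ, here is a minimal sketch on a small Series. The sample values and the names `counts`, `by_category`, and `by_reindex` are made up for this example and are not from the question's data.

```python
import pandas as pd

cats = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Hypothetical sample data: months in arbitrary order, most months absent.
counts = pd.Series([47, 2, 14, 1], index=['Dec', 'Jan', 'Oct', 'Apr'], name='date')

# Approach 1: an ordered CategoricalIndex sorts by calendar position,
# not alphabetically, and keeps only the months that are present.
by_category = counts.copy()
by_category.index = pd.CategoricalIndex(by_category.index, categories=cats, ordered=True)
by_category = by_category.sort_index()

# Approach 2: reindex imposes the calendar order directly,
# but every month absent from the data becomes a NaN row.
by_reindex = counts.reindex(cats)

print(by_category)
print(by_reindex)
```

With this sample, the first result has four rows and the second has twelve, eight of them NaN. Chaining `.dropna()` after the reindex is one way to drop the empty months if they are not wanted.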
    
  • 2020-12-03 17:08

    Adding to the very helpful answer by @jezrael:

    In pandas 0.25.1, pd.CategoricalIndex does not accept a sorted keyword; use ordered instead, as listed in the pandas.CategoricalIndex documentation.

    Old way:

    df.index = pd.CategoricalIndex(df.index, 
                                   categories=['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec'], 
                                   sorted=True)
    df = df.sort_index()
    

    Error

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-468-3f0ab66734d4> in <module>
          2 net.index = pd.CategoricalIndex(net.index, 
          3                                categories=['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec'],
    ----> 4                                sorted=True)
          5 net = net.sort_index()
          6 net
    
    TypeError: __new__() got an unexpected keyword argument 'sorted'
    

    New way:

    df.index = pd.CategoricalIndex(df.index, 
                                   categories=['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec'], 
                                   ordered=True)
    df = df.sort_index()
    
  • 2020-12-03 17:11

    Okay, it turned out not to be very complex. I'm sure Categorical would have worked; I just could not get it to solve the problem. What I did was:

    1. Sort by month while the months are still represented as integers.
    2. Apply a mapper to the index of the resulting series to convert each integer month into its abbreviated name.

    I'm sure there are more efficient ways of solving this, so if you have a better one, please post it.

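        import calendar

        # Month numbers of USA releases with 'Christmas' in the title
        mask = release_dates.title.str.contains('Christmas') & (release_dates.country == 'USA')
        months = release_dates[mask].date.dt.month
        # Count per month and sort while the index is still numeric
        counts = months.value_counts().sort_index()
        # Map month numbers to abbreviated names (Index.map returns a new Index)
        counts.index = counts.index.map(lambda m: calendar.month_abbr[m])
        counts.plot.bar()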
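
The two steps above (sort on integer months, then map to names) can be tried on a small made-up frame. The `release_dates` rows below are hypothetical stand-ins for the real data, and `Index.map` is used here in place of the built-in `map` so that the result is already an Index. The plotting call is left out to keep the sketch free of a matplotlib dependency.

```python
import calendar
import pandas as pd

# Hypothetical stand-in for the release_dates frame used in the answer.
release_dates = pd.DataFrame({
    'title': ['A Christmas Tale', 'Christmas Eve', 'Summer Fun', 'White Christmas'],
    'country': ['USA', 'USA', 'USA', 'USA'],
    'date': pd.to_datetime(['2008-11-14', '2015-12-04', '2010-07-01', '1954-12-25']),
})

# Keep USA releases whose title contains 'Christmas' and take the month number.
mask = release_dates.title.str.contains('Christmas') & (release_dates.country == 'USA')
months = release_dates[mask].date.dt.month

# Step 1: count per month and sort while the index is still numeric.
counts = months.value_counts().sort_index()

# Step 2: convert month numbers to abbreviated names.
counts.index = counts.index.map(lambda m: calendar.month_abbr[m])

print(counts)
```

Note that `calendar.month_abbr` follows the active locale, so the labels are English abbreviations only under an English or default C locale.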
