Convert a column in a pandas DataFrame from a string month to an int

你的背包 2021-01-05 15:56

In Python 2.7.11 & Pandas 0.18.1:

Suppose we have the following CSV file:

YEAR,MONTH,ID
2011,JAN,1
2011,FEB,1
2011,MAR,1
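
To reproduce the examples in the answers, the sample can be loaded into a DataFrame. This is a minimal sketch: the name `csv_text` and the use of `io.StringIO` in place of a file on disk are assumptions made for illustration.

```python
import io

import pandas as pd

# The CSV from the question, inlined so the example is self-contained.
csv_text = """YEAR,MONTH,ID
2011,JAN,1
2011,FEB,1
2011,MAR,1
"""

# read_csv accepts any file-like object, so StringIO stands in for a real file.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```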

Is there an

2 answers
  • 2021-01-05 16:16

    Following up on Max's last point: build the same mapping, but derive it from the month labels your own DataFrame uses:

    # create the mapping; this assumes the months first appear in the column in calendar order
    d = dict((v, k) for k, v in zip(range(1, 13), df.MONTH.unique()))
    # create the new column
    df['month_index'] = df['MONTH'].map(d)
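
The order-of-appearance mapping is only correct when the months first show up in calendar order. The sketch below illustrates both cases. The `shuffled` frame and the names `d_ok` and `d_bad` are hypothetical, added only to show the pitfall.

```python
import pandas as pd

df = pd.DataFrame({"YEAR": [2011, 2011, 2011],
                   "MONTH": ["JAN", "FEB", "MAR"],
                   "ID": [1, 1, 1]})

# unique() preserves order of first appearance, so this numbers the labels 1, 2, 3, ...
d_ok = {name: i for i, name in enumerate(df["MONTH"].unique(), start=1)}
df["month_index"] = df["MONTH"].map(d_ok)

# Hypothetical data that starts in March: the same recipe now numbers MAR as 1.
shuffled = pd.DataFrame({"MONTH": ["MAR", "JAN", "FEB"]})
d_bad = {name: i for i, name in enumerate(shuffled["MONTH"].unique(), start=1)}

print(d_ok)
print(d_bad)
```

If the data may start mid-year or skip months, an explicit mapping is the safer choice.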
    
  • 2021-01-05 16:22

    I think the easiest, and one of the fastest, methods is to create a mapping dict and apply it with map, as follows:

    In [2]: df
    Out[2]:
       YEAR MONTH  ID
    0  2011   JAN   1
    1  2011   FEB   1
    2  2011   MAR   1
    
    In [3]: d = {'JAN':1, 'FEB':2, 'MAR':3, 'APR':4, }
    
    In [4]: df.MONTH = df.MONTH.map(d)
    
    In [5]: df
    Out[5]:
       YEAR  MONTH  ID
    0  2011      1   1
    1  2011      2   1
    2  2011      3   1
    

    If not all MONTH values are in upper case, you may want to use df.MONTH = df.MONTH.str.upper().map(d) instead.
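
A slightly fuller sketch of the dict approach, with case normalization. The month labels after 'APR' are an assumption (standard three-letter English abbreviations), and `month_to_int`, `mapped` and `unmapped` are illustrative names. Labels missing from the dict become NaN after `map`, so it is worth checking for them.

```python
import pandas as pd

# Assumed labels: three-letter English abbreviations, upper case.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
month_to_int = {name: i for i, name in enumerate(MONTHS, start=1)}

df = pd.DataFrame({"MONTH": ["jan", "Feb", "MAR", "???"]})

# Normalize case first, then map; unknown labels turn into NaN.
mapped = df["MONTH"].str.upper().map(month_to_int)

# Collect the original labels that could not be mapped.
unmapped = df.loc[mapped.isna(), "MONTH"].tolist()
print(mapped)
print(unmapped)
```

Because of the NaN, the result here has a float dtype. Once every label maps cleanly, the column can be cast back with `.astype(int)`.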

    Another method, slower but more robust:

    In [11]: pd.to_datetime(df.MONTH, format='%b').dt.month
    Out[11]:
    0    1
    1    2
    2    3
    Name: MONTH, dtype: int64
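
A sketch of the same idea with bad values handled. The sample series and the use of `errors="coerce"` are additions for illustration. Note that `%b` matches the abbreviated month names of the current locale, so this assumes an English (or C) locale.

```python
import pandas as pd

s = pd.Series(["JAN", "feb", "Mar", "not a month"])

# Parsing with %b ignores case; errors="coerce" turns unparseable values
# into NaT instead of raising.
parsed = pd.to_datetime(s, format="%b", errors="coerce")

# .dt.month yields the month number, and NaN where parsing failed.
months = parsed.dt.month
print(months)
```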
    

    UPDATE: we can create the mapping automatically (thanks to @Quetzalcoatl):

    import calendar
    
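    # note: the keys are 'Jan', 'Feb', ... (plus '' mapped to 0), so normalize
    # the column with .str.title() before mapping
    d = dict((v, k) for k, v in enumerate(calendar.month_abbr))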
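
`calendar.month_abbr` has an empty string at index 0, and its entries are title-case in an English locale. A variant that skips the empty entry and upper-cases the keys to match the question's data could look like the sketch below. It assumes an English (or C) locale.

```python
import calendar

import pandas as pd

# Skip the empty placeholder at index 0 and upper-case the locale's abbreviations.
d = {abbr.upper(): i for i, abbr in enumerate(calendar.month_abbr) if abbr}

df = pd.DataFrame({"YEAR": [2011, 2011, 2011],
                   "MONTH": ["JAN", "FEB", "MAR"],
                   "ID": [1, 1, 1]})
df["MONTH"] = df["MONTH"].map(d)
print(df)
```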
    

    Or, alternatively (using only pandas):

    # month names are the keys and 1..12 the values; 'MS' (month start) yields one date per month
    d = dict(zip(pd.date_range('2000-01-01', freq='MS', periods=12).strftime('%b'), range(1, 13)))
    