Pandas: dataframe to long format

匿名 (未验证) 提交于 2019-12-03 01:34:02

问题:

I have the following df:

tz.head()      state   2004    2005    2006    2007    2008    2009    2010    2011    2012    2013    2014    2015 0   AL      5.7     4.5     4.0     4.0     5.7     11.0    10.5    9.6     8.0     7.2     6.8     6.1 1   AK      7.5     6.9     6.6     6.3     6.7     7.7     7.9     7.6     7.1     6.9     6.9     6.5 2   AZ      5.0     4.7     4.2     3.9     6.2     9.9     10.4    9.5     8.3     7.7     6.8     6.1 3   AR      5.7     5.2     5.2     5.3     5.5     7.8     8.2     8.3     7.6     7.3     6.1     5.2 4   CA      6.2     5.4     4.9     5.4     7.3     11.2    12.2    11.7    10.4    8.9     7.5     6.2 

I would like to change it so that looks like this:

year    state   unemployment 2004    AL  5.7 2005    AL  4.5 2006    AL  4.0 2007    AL  4.0 2008    AL  5.7 2009    AL  11.0 2010    AL  10.5 2011    AL  9.6 2012    AL  8.0 2013    AL  7.2 2014    AL  6.8 2015    AL  6.1 2004    AK  7.5 2005    AK  6.9 2006    AK  6.6 2007    AK  6.3 2008    AK  6.7 2009    AK  7.7 2010    AK  7.9 2011    AK  7.6 2012    AK  7.1 2013    AK  6.9 2014    AK  6.9 2015    AK  6.5 

The reason is that I have a df that is similarly shaped and I need to merge the two dfs. I have recently had similar df shaping issues that I have been unable to find simple quick solutions to with python. Does anyone know how to solve this kind of problem?

回答1:

You can use melt:

print pd.melt(df,id_vars=['state'],var_name='year', value_name='unemployment')     state  year  unemployment 0     AL  2004           5.7 1     AK  2004           7.5 2     AZ  2004           5.0 3     AR  2004           5.7 4     CA  2004           6.2 5     AL  2005           4.5 6     AK  2005           6.9 7     AZ  2005           4.7 8     AR  2005           5.2 9     CA  2005           5.4 10    AL  2006           4.0 11    AK  2006           6.6 12    AZ  2006           4.2 13    AR  2006           5.2 14    CA  2006           4.9 15    AL  2007           4.0 16    AK  2007           6.3 17    AZ  2007           3.9 18    AR  2007           5.3 19    CA  2007           5.4 20    AL  2008           5.7 21    AK  2008           6.7 22    AZ  2008           6.2 23    AR  2008           5.5 24    CA  2008           7.3 25    AL  2009          11.0 26    AK  2009           7.7 27    AZ  2009           9.9 28    AR  2009           7.8 29    CA  2009          11.2 30    AL  2010          10.5 31    AK  2010           7.9 32    AZ  2010          10.4 33    AR  2010           8.2 34    CA  2010          12.2 35    AL  2011           9.6 36    AK  2011           7.6 37    AZ  2011           9.5 38    AR  2011           8.3 39    CA  2011          11.7 40    AL  2012           8.0 41    AK  2012           7.1 42    AZ  2012           8.3 43    AR  2012           7.6 44    CA  2012          10.4 45    AL  2013           7.2 46    AK  2013           6.9 47    AZ  2013           7.7 48    AR  2013           7.3 49    CA  2013           8.9 50    AL  2014           6.8 51    AK  2014           6.9 52    AZ  2014           6.8 53    AR  2014           6.1 54    CA  2014           7.5 55    AL  2015           6.1 56    AK  2015           6.5 57    AZ  2015           6.1 58    AR  2015           5.2 59    CA  2015           6.2 


回答2:

Besides melt, pandas also provides wide_to_long, which is "Less flexible but more user-friendly than melt." according to the docs.

# Adding prefixes here to get nice column names afterwards df = df.add_prefix('unemployment') df.rename(columns={'unemploymentstate': 'state'}, inplace=True)  # Reshaping to long/tidy format pd.wide_to_long(df, stubnames='unemployment', i='state', j='year')          unemployment state   year     AL  2004    5.7 AK  2004    7.5 AZ  2004    5.0 AR  2004    5.7 CA  2004    6.2 AL  2005    4.5 ... 


标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!