multi-index

How to “explicitly specify the categories order by passing in a categories argument” when using tuples as index keys in pandas?

那年仲夏 submitted on 2019-12-23 12:16:40
Question: I've been trying to figure out how to use these tuples as index keys in pandas, but I'm getting an error. How can I use the suggestion from the error message with pd.Categorical below to fix this error? I am aware that I can convert the keys to strings, but I am curious what the suggestion in the error message means. This works perfectly fine when I run it with 0.22.0. I've opened a GitHub issue for this if anyone wants to see the proper output from 0.22.0. I want to update my pandas and handle

Pandas Add Header Row for MultiIndex

丶灬走出姿态 submitted on 2019-12-23 01:18:49
Question: Given the following data frame:

    d2 = pd.DataFrame({'Item': ['y', 'y', 'z', 'x'], 'other': ['aa', 'bb', 'cc', 'dd']})
    d2

      Item other
    0    y    aa
    1    y    bb
    2    z    cc
    3    x    dd

I'd like to add a row to the top and then use that as level 1 of a MultiIndex header. I can't always predict how many columns the data frame will have, so the new row should allow for that (i.e. random characters or numbers are okay). I'm looking for something like this:

      Item other
         A     B
    0    y    aa
    1    y    bb
    2    z    cc
    3    x    dd

But again, the number of
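
A minimal sketch of one way to produce the header shown in the question above, assuming the placeholder labels can simply be consecutive uppercase letters (which limits it to 26 columns). The names `labels` and `out` are illustrative and do not come from the question.

```python
import string

import pandas as pd

d2 = pd.DataFrame({'Item': ['y', 'y', 'z', 'x'],
                   'other': ['aa', 'bb', 'cc', 'dd']})

# One placeholder label per existing column: 'A', 'B', ...
# Assumption: the frame has at most 26 columns.
labels = list(string.ascii_uppercase[:d2.shape[1]])

# Keep the original names as level 0 and add the placeholders as level 1.
out = d2.copy()
out.columns = pd.MultiIndex.from_arrays([d2.columns, labels])
print(out)
```

Because `labels` is derived from `d2.shape[1]`, the same lines work without knowing the number of columns in advance.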

Pandas Multiindex dataframe remove rows

点点圈 submitted on 2019-12-22 18:09:58
Question: I have a MultiIndex DataFrame as follows:

    tuples = list(zip(*[['a', 'a', 'b', 'b'], ['c', 'd', 'c', 'd']]))
    index = pd.MultiIndex.from_tuples(tuples, names=['i1', 'i2'])
    df = pd.DataFrame([5, 6, 7, 8], index=index[:4], columns=['col'])

           col
    i1 i2
    a  c     5
       d     6
    b  c     7
       d     8

I would like to keep the rows whose index (level 0) is in idx_to_keep = ['a']. This should be a straightforward task, but I can't think of any other way than

    idx_to_drop = np.setdiff1d(pd.unique(df.index.levels[0]), idx_to_keep)
    df.drop(idx_to_drop,
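
One possible alternative to computing the labels to drop is to build a boolean mask from the level values. This is a sketch using the frame from the question; the names `mask` and `kept` are illustrative.

```python
import pandas as pd

tuples = list(zip(['a', 'a', 'b', 'b'], ['c', 'd', 'c', 'd']))
index = pd.MultiIndex.from_tuples(tuples, names=['i1', 'i2'])
df = pd.DataFrame([5, 6, 7, 8], index=index, columns=['col'])

idx_to_keep = ['a']

# True for every row whose level-0 ('i1') label is in idx_to_keep.
mask = df.index.get_level_values('i1').isin(idx_to_keep)
kept = df[mask]
print(kept)
```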

How to build a MultiIndex Pandas DataFrame from a nested dictionary with lists

痞子三分冷 submitted on 2019-12-22 10:02:01
Question: I have the following dictionary.

    d = {'key1': {'sub-key1': ['a', 'b', 'c', 'd', 'e']}, 'key2': {'sub-key2': ['1', '2', '3', '5', '8', '9', '10']}}

With the help of this post, I managed to convert this dictionary to a DataFrame.

    df = pd.DataFrame.from_dict({(i, j): d[i][j] for i in d.keys() for j in d[i].keys()}, orient='index')

However, my DataFrame takes the following form:

                      0  1  2  3  4     5     6
    (key1, sub-key1)  a  b  c  d  e  None  None
    (key2, sub-key2)  1  2  3  5  8     9    10

I can work with tuples, as index
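
If the goal is a two-level index instead of tuple labels (an assumption, since the excerpt is cut off), one option is to rebuild the index with `pd.MultiIndex.from_tuples`. The level names `key` and `sub_key` below are hypothetical.

```python
import pandas as pd

d = {'key1': {'sub-key1': ['a', 'b', 'c', 'd', 'e']},
     'key2': {'sub-key2': ['1', '2', '3', '5', '8', '9', '10']}}

df = pd.DataFrame.from_dict(
    {(i, j): d[i][j] for i in d for j in d[i]},
    orient='index',
)

# Turn the (key, sub-key) labels into a proper two-level MultiIndex.
# list(df.index) yields tuples whether the index holds plain tuples
# or is already a MultiIndex, so this line is safe in both cases.
df.index = pd.MultiIndex.from_tuples(list(df.index), names=['key', 'sub_key'])
print(df)
```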

Slice MultiIndex pandas DataFrame by position

懵懂的女人 submitted on 2019-12-22 07:57:09
Question: I am currently trying to slice a MultiIndex DataFrame that has three levels by position. I am using pandas 19.1

    Level0    Level1  Level2   Value
    03-00368  A       Item111    6.9
    03-00368  A       Item333   19.2
    03-00368  B       Item111    9.7
    03-00368  B       Item222   17.4
    04-00176  C       Item110   17.4
    04-00176  C       Item111    9.7
    04-00246  D       Item46    12.5
    04-00246  D       Item66     5.6
    04-00246  D       Item99    11.2
    04-00247  E       Item23    12.5
    04-00247  E       Item24     5.6
    04-00247  E       Item111   11.2
    04-00247  F       Item23     7.9
    04-00247  F       Item24     9.7
    04-00247  F       Item111   12.5
    04-00247  G

Slice MultiIndex pandas DataFrame by position

谁都会走 submitted on 2019-12-22 07:57:08
Question: I am currently trying to slice a MultiIndex DataFrame that has three levels by position. I am using pandas 19.1

    Level0    Level1  Level2   Value
    03-00368  A       Item111    6.9
    03-00368  A       Item333   19.2
    03-00368  B       Item111    9.7
    03-00368  B       Item222   17.4
    04-00176  C       Item110   17.4
    04-00176  C       Item111    9.7
    04-00246  D       Item46    12.5
    04-00246  D       Item66     5.6
    04-00246  D       Item99    11.2
    04-00247  E       Item23    12.5
    04-00247  E       Item24     5.6
    04-00247  E       Item111   11.2
    04-00247  F       Item23     7.9
    04-00247  F       Item24     9.7
    04-00247  F       Item111   12.5
    04-00247  G

How to properly pivot or reshape a timeseries dataframe in Pandas?

烂漫一生 submitted on 2019-12-21 21:25:17
Question: I need to reshape a dataframe that looks like df1 and turn it into df2. There are 2 considerations for this procedure:

1. I need to be able to set the number of rows to be sliced as a parameter (length).
2. I need to split date and time from the index, and use date in the reshape as the column names and keep time as the index.

Current df1

    2007-08-07 18:00:00    1
    2007-08-08 00:00:00    2
    2007-08-08 06:00:00    3
    2007-08-08 12:00:00    4
    2007-08-08 18:00:00    5
    2007-11-02 18:00:00    6
    2007-11-03 00:00:00    7
    2007-11
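
The second consideration above (dates as column names, times as the index) can be sketched with `DataFrame.pivot`. This is only an illustration under assumptions: the column name `value` is hypothetical, only the first five rows of df1 are used, and the `length` parameter is not handled because the excerpt does not show how it should behave.

```python
import pandas as pd

idx = pd.to_datetime(['2007-08-07 18:00:00', '2007-08-08 00:00:00',
                      '2007-08-08 06:00:00', '2007-08-08 12:00:00',
                      '2007-08-08 18:00:00'])
# 'value' is an assumed column name; the question does not show one.
df1 = pd.DataFrame({'value': [1, 2, 3, 4, 5]}, index=idx)

# Split the DatetimeIndex into separate date and time columns, then pivot:
# one row per time of day, one column per calendar date.
df2 = (df1.assign(date=df1.index.date, time=df1.index.time)
          .pivot(index='time', columns='date', values='value'))
print(df2)
```

Combinations of time and date that have no observation come out as NaN.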

Convert MultiIndex DataFrame to Series

放肆的年华 submitted on 2019-12-21 20:54:10
Question: I created a MultiIndex DataFrame by:

    df.set_index(['Field1', 'Field2'], inplace=True)

If this is not a MultiIndex DataFrame, please tell me how to make one. I want to:

1. Group by the same columns that are in the index
2. Aggregate a count of each group
3. Then return the whole thing as a Series with Field1 and Field2 as the index

How do I go about doing this?

ADDITIONAL INFO

I have a MultiIndex DataFrame that looks like this:

    Continent  Sector  Count
    Asia       1       4
               2       1
    Australia  1       1
    Europe     1       1
               2       3
               3       2
    North
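
The three steps listed above can be sketched by grouping on the index levels and calling `size()`. The sample rows and the `Value` column are made up for illustration; only `Field1` and `Field2` come from the question.

```python
import pandas as pd

# Hypothetical sample data; the question does not show the raw rows.
raw = pd.DataFrame({'Field1': ['x', 'x', 'x', 'y'],
                    'Field2': [1, 1, 2, 1],
                    'Value': [10, 20, 30, 40]})
df = raw.set_index(['Field1', 'Field2'])

# Group by the index levels and count the rows in each group.
# size() returns a Series indexed by (Field1, Field2).
counts = df.groupby(level=['Field1', 'Field2']).size()
print(counts)
```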

time slice on second level of multiindex

南笙酒味 submitted on 2019-12-21 17:03:10
Question: pandas allows for cool slicing on time indexes. For example, I can slice a dataframe df for the months from January 2012 to March 2012 by doing:

    df['2012-01':'2012-03']

However, I have a dataframe df with a MultiIndex where the time index is the second level. It looks like:

                         A         B         C         D         E
    a 2001-01-31  0.864841  0.789273  0.370031  0.448256  0.178515
      2001-02-28  0.991861  0.079215  0.900788  0.666178  0.693887
      2001-03-31  0.016674  0.855109  0.984115  0.436574  0.480339
      2001-04-30  0.120924  0.046013  0.659807  0
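
One way to apply the same partial-string slice to the second level is `pd.IndexSlice` with `.loc`. The sketch below uses a small made-up frame (the level names `group` and `date` and the column `A` are assumptions) and relies on the index being lexsorted, which MultiIndex slicing requires.

```python
import pandas as pd

dates = pd.to_datetime(['2012-01-31', '2012-02-29', '2012-03-31', '2012-04-30'])
# Hypothetical level names; from_product on sorted inputs gives a sorted index.
index = pd.MultiIndex.from_product([['a', 'b'], dates], names=['group', 'date'])
df = pd.DataFrame({'A': range(len(index))}, index=index)

# Take every label on level 0 and January through March 2012 on level 1.
idx = pd.IndexSlice
sliced = df.loc[idx[:, '2012-01':'2012-03'], :]
print(sliced)
```

If the real frame is not sorted, calling `df.sort_index()` first should make the slice valid.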

Set value multiindex Pandas

蹲街弑〆低调 submitted on 2019-12-21 09:19:28
Question: I'm a newbie to both Python and Pandas. I am trying to construct a dataframe, and then later populate it with values. I have constructed my dataframe

    from pandas import *
    ageMin = 21
    ageMax = 31
    ageStep = 2
    bins_sumins = [0, 10000, 20000]
    bins_age = list(range(ageMin, ageMax, ageStep))
    indeks_sex = ['M', 'F']
    indeks_age = ['[{0}-{1})'.format(bins_age[i-1], bins_age[i]) for i in range(1, len(bins_age))]
    indeks_sumins = ['[{0}-{1})'.format(bins_sumins[i-1], bins_sumins[i]) for i in range(1, len
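
The excerpt stops before the frame itself is built, so the layout below is an assumption: sex and age band form a two-level row index and the sum-insured bands are the columns. Under that assumption, a single cell can be set with `.loc` by passing the full row key as a tuple. The level names `sex` and `age` are hypothetical.

```python
import numpy as np
import pandas as pd

ageMin, ageMax, ageStep = 21, 31, 2
bins_sumins = [0, 10000, 20000]
bins_age = list(range(ageMin, ageMax, ageStep))

indeks_sex = ['M', 'F']
indeks_age = ['[{0}-{1})'.format(bins_age[i - 1], bins_age[i])
              for i in range(1, len(bins_age))]
indeks_sumins = ['[{0}-{1})'.format(bins_sumins[i - 1], bins_sumins[i])
                 for i in range(1, len(bins_sumins))]

# Assumed layout: rows are (sex, age band), columns are sum-insured bands.
index = pd.MultiIndex.from_product([indeks_sex, indeks_age], names=['sex', 'age'])
df = pd.DataFrame(np.nan, index=index, columns=indeks_sumins)

# Set one cell: the row key is a tuple with one label per index level.
df.loc[('M', '[21-23)'), '[0-10000)'] = 100.0
print(df)
```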