multi-index

Converting a pandas MultiIndex DataFrame from row-wise to column-wise

时光毁灭记忆、已成空白 posted on 2019-12-21 04:17:12
Question: I'm working in zipline and pandas and have converted a pandas.Panel to a pandas.DataFrame using the to_frame() method. This is the resulting pandas.DataFrame, which, as you can see, is multi-indexed:

                                      price
    major                     minor
    2008-01-03 00:00:00+00:00 SPY    129.93
                              KO      26.38
                              PEP     64.78
    2008-01-04 00:00:00+00:00 SPY    126.74
                              KO      26.43
                              PEP     64.59
    2008-01-07 00:00:00+00:00 SPY    126.63
                              KO      27.05
                              PEP     66.10
    2008-01-08 00:00:00+00:00 SPY    124.59
                              KO      27.16
                              PEP     66.63

I need to convert this frame to look like this:

    SPY KO PEP
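One possible approach, sketched here on the assumption that the goal is one column per ticker with the dates as the row index: unstack the inner index level. The small frame below is rebuilt by hand from the first two dates in the question; the variable names are illustrative only.

```python
import pandas as pd

# Rebuild a small frame shaped like the one in the question:
# rows indexed by (major=date, minor=ticker), with a single 'price' column.
dates = pd.to_datetime(["2008-01-03", "2008-01-04"]).tz_localize("UTC")
index = pd.MultiIndex.from_product(
    [dates, ["SPY", "KO", "PEP"]], names=["major", "minor"]
)
df = pd.DataFrame(
    {"price": [129.93, 26.38, 64.78, 126.74, 26.43, 64.59]}, index=index
)

# Move the inner level ('minor') from the rows to the columns.
wide = df["price"].unstack("minor")

# unstack sorts the new column labels, so restore the ticker order by hand.
wide = wide[["SPY", "KO", "PEP"]]
print(wide)
```

Selecting the `price` column first keeps the resulting columns single-level; unstacking the whole frame would instead produce a two-level column index with `price` on top.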

Add a field in pandas dataframe with MultiIndex columns

跟風遠走 posted on 2019-12-20 12:40:12
Question: I have looked for an answer to this question, as it seems pretty simple, but have not been able to find anything yet. Apologies if I missed something. I have pandas version 0.10.0 and I have been experimenting with data of the following form:

    import pandas
    import numpy as np
    import datetime

    start_date = datetime.datetime(2009,3,1,6,29,59)
    r = pandas.date_range(start_date, periods=12)
    cols_1 = ['AAPL', 'AAPL', 'GOOG', 'GOOG', 'GS', 'GS']
    cols_2 = ['close', 'rate', 'close', 'rate', 'close',
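The question is cut off before it says which field is wanted, so the following is only a sketch under assumptions: the columns are a two-level MultiIndex of (ticker, field), and the new field is a hypothetical `ratio` computed from `close` and `rate`. Assigning with a tuple key adds a column under an existing top-level label.

```python
import numpy as np
import pandas as pd

# Stand-in for the frame in the question: (ticker, field) column pairs.
cols = pd.MultiIndex.from_product([["AAPL", "GOOG", "GS"], ["close", "rate"]])
rows = pd.date_range("2009-03-01", periods=3)
df = pd.DataFrame(np.arange(1.0, 19.0).reshape(3, 6), index=rows, columns=cols)

# Add a hypothetical 'ratio' field under every ticker using tuple keys.
for ticker in df.columns.get_level_values(0).unique():
    df[(ticker, "ratio")] = df[(ticker, "close")] / df[(ticker, "rate")]

# New columns are appended at the end; sorting regroups them by ticker.
df = df.sort_index(axis=1)
print(df)
```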

Setting DataFrame column headers to a MultiIndex

六月ゝ 毕业季﹏ posted on 2019-12-18 19:39:11
Question: How do I convert an existing dataframe with single-level columns to have hierarchical index columns (MultiIndex)? Example dataframe:

    In [1]: import numpy as np
            import pandas as pd
            from pandas import Series, DataFrame
            df = DataFrame(np.arange(6).reshape((2,3)), index=['A','B'], columns=['one','two','three'])
            df
    Out[1]:
       one  two  three
    A    0    1      2
    B    3    4      5

I'd have thought that reindex() would work, but I get NaNs:

    In [2]: df.reindex(columns=[['odd','even','odd'],df.columns])
    Out[2]:
       odd  even  odd
       one  two  three
    A  NaN
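A minimal sketch of one way to do this, assuming the intent is to relabel the existing columns rather than select new ones: assign a MultiIndex directly to `df.columns`. `reindex` looks up the requested labels in the current columns, and since no `('odd', 'one')` label exists yet, it fills with NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(6).reshape((2, 3)),
    index=["A", "B"],
    columns=["one", "two", "three"],
)

# Relabel the columns in place: the outer level is new, the inner level
# reuses the existing column names. The data itself is untouched.
df.columns = pd.MultiIndex.from_arrays([["odd", "even", "odd"], df.columns])
print(df)
```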

Pandas style object with multi-index

ⅰ亾dé卋堺 posted on 2019-12-18 15:17:52
Question: I am formatting a pandas dataframe with Styler to highlight columns and format numbers. I also want to use a MultiIndex to make the table clearer and easier to read. Because I apply the Styler to a subset of columns, it does not work with the MultiIndex. Example:

    arrays = [np.hstack([['One']*2, ['Two']*2]) , ['A', 'B', 'C', 'D']]
    columns = pd.MultiIndex.from_arrays(arrays)
    data = pd.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))
    data.columns = columns
    import seaborn as sns
    cm = sns.light

Showing all index values when using multiIndexing in Pandas

浪尽此生 posted on 2019-12-18 09:17:04
Question: When viewing my DataFrame, I would like to see all values of the MultiIndex, including when subsequent rows have the same index value for one of the levels. Here is an example:

    arrays = [['20', '50', '20', '20'],['N/A', 'N/A', '10', '30']]
    tuples = list(zip(*arrays))
    index = pd.MultiIndex.from_tuples(tuples, names=['Jim', 'Betty'])
    pd.DataFrame([np.random.rand(1)]*4,index=index)

The output is:

                      0
    Jim Betty
    20  N/A    0.954973
    50  N/A    0.954973
    20  10     0.954973
        30     0.954973

I would like to have a 20
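A hedged sketch of two ways to turn off the blanking of repeated outer labels, assuming the goal is purely a display change: the `sparsify` argument of `DataFrame.to_string`, and the `display.multi_sparse` option. The `value` column and its numbers are placeholders, not data from the question.

```python
import pandas as pd

arrays = [["20", "50", "20", "20"], ["N/A", "N/A", "10", "30"]]
index = pd.MultiIndex.from_arrays(arrays, names=["Jim", "Betty"])
df = pd.DataFrame({"value": [0.1, 0.2, 0.3, 0.4]}, index=index)

# Default rendering: a repeated outer label is left blank on the next row.
sparse_text = df.to_string()

# sparsify=False repeats every label on every row.
dense_text = df.to_string(sparsify=False)
print(dense_text)

# The same effect for ordinary printing, scoped to a with-block.
with pd.option_context("display.multi_sparse", False):
    print(df)
```

If the repeated values are needed as data rather than for display, `df.reset_index()` turns both levels into ordinary columns.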

Pandas groupby(),agg() - how to return results without the multi index?

天大地大妈咪最大 posted on 2019-12-18 03:17:52
Question: I have a dataframe:

    pe_odds[ [ 'EVENT_ID', 'SELECTION_ID', 'ODDS' ] ]
    Out[67]:
        EVENT_ID  SELECTION_ID   ODDS
    0  100429300       5297529  18.00
    1  100429300       5297529  20.00
    2  100429300       5297529  21.00
    3  100429300       5297529  22.00
    4  100429300       5297529  23.00
    5  100429300       5297529  24.00
    6  100429300       5297529  25.00

When I use groupby and agg, I get results with a multi-index:

    pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ] )[ 'ODDS' ].agg( [ np.min, np.max ] )
    Out[68]:
                            amin  amax
    EVENT_ID  SELECTION_ID
    100428417 5490293       1
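One way to get flat output, sketched on a cut-down frame: call `reset_index()` on the aggregated result so the group keys become ordinary columns. The ODDS values for event 100428417 are made up for illustration, because the question's output is truncated. The string names `'min'` and `'max'` are used in place of `np.min` and `np.max`, so the result columns are named `min` and `max` rather than `amin` and `amax`.

```python
import pandas as pd

pe_odds = pd.DataFrame(
    {
        "EVENT_ID": [100429300] * 3 + [100428417] * 2,
        "SELECTION_ID": [5297529] * 3 + [5490293] * 2,
        # The last two values are illustrative placeholders.
        "ODDS": [18.0, 20.0, 21.0, 1.5, 2.5],
    }
)

result = (
    pe_odds.groupby(["EVENT_ID", "SELECTION_ID"])["ODDS"]
    .agg(["min", "max"])
    .reset_index()  # group keys move from the index back into columns
)
print(result)
```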

Reshaping dataframes in pandas based on column labels

為{幸葍}努か posted on 2019-12-17 22:42:25
Question: What is the best way to reshape the following dataframe in pandas? This DataFrame df has x, y values for each sample (s1 and s2 in this case) and looks like this:

    In [23]: df = pandas.DataFrame({"s1_x": scipy.randn(10), "s1_y": scipy.randn(10), "s2_x": scipy.randn(10), "s2_y": scipy.randn(10)})
    In [24]: df
    Out[24]:
           s1_x      s1_y      s2_x      s2_y
    0  0.913462  0.525590 -0.377640  0.700720
    1  0.723288 -0.691715  0.127153  0.180836
    2  0.181631 -1.090529 -1.392552  1.530669
    3  0.997414 -1.486094  1.207012  0.376120
    4
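The target shape is cut off in the question, so this is only a sketch of one plausible first step: split each `sample_coord` label on the underscore and use the two parts as a two-level column index. NumPy's random generator stands in for `scipy.randn`, and the level names `sample` and `coord` are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(
    {name: rng.standard_normal(4) for name in ["s1_x", "s1_y", "s2_x", "s2_y"]}
)

# 's1_x' -> ('s1', 'x'): the label text becomes a (sample, coord) pair.
df.columns = pd.MultiIndex.from_tuples(
    [tuple(label.split("_")) for label in df.columns],
    names=["sample", "coord"],
)

# All coordinates of one sample, or one coordinate across all samples.
s1 = df["s1"]
all_x = df.xs("x", axis=1, level="coord")
print(s1)
print(all_x)
```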

Multiindex and timezone - Frozen list error

泪湿孤枕 posted on 2019-12-17 20:27:05
Question: I am trying to change the timezone of a MultiIndex DataFrame, but I get a frozen list error. Does anyone have an idea how to proceed?

    >>> array = [('s001', d) for d in pd.date_range(start='01/01/2014', end='01/01/2015', freq='H')] + [('s002', d) for d in pd.date_range(start='01/01/2014', end='01/01/2015', freq='H')]
    >>> index = pd.MultiIndex.from_tuples(array, names=['sce', 'DATES'])
    >>> df = pd.DataFrame(np.random.randn(len(index)), index=index)
    >>> df.index.levels[1] = df.index.levels[1].tz_localize(
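The levels of a MultiIndex cannot be assigned item by item, which is what triggers the frozen list error. A minimal sketch of an alternative, assuming the target timezone is UTC (the question is truncated before the argument): build a new index with `set_levels` and assign it back. A short daily range replaces the year of hourly data to keep the example small.

```python
import numpy as np
import pandas as pd

dates = pd.date_range(start="2014-01-01", periods=3, freq="D")
index = pd.MultiIndex.from_product(
    [["s001", "s002"], dates], names=["sce", "DATES"]
)
df = pd.DataFrame(np.arange(len(index)), index=index)

# Localize the naive date level, then swap it in as a whole new level.
localized = df.index.levels[1].tz_localize("UTC")  # 'UTC' is an assumption
df.index = df.index.set_levels(localized, level="DATES")
print(df.index.levels[1])
```

For data that already carries a timezone, `tz_convert` would take the place of `tz_localize` in the same pattern.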

How to do group by on a multiindex in pandas?

人盡茶涼 posted on 2019-12-17 15:56:13
Question: Below is my dataframe. I made some transformations to create the category column and dropped the original column it was derived from. Now I need to do a group-by to remove the duplicates, e.g. Love and Fashion can be rolled up via a groupby sum.

    df.columns = array([category, clicks, revenue, date, impressions, size], dtype=object)
    df.values =
    [[Love 0 0.36823 2013-11-04 380 300x250]
     [Love 183 474.81522 2013-11-04 374242 300x250]
     [Fashion 0 0.19434 2013-11-04 197 300x250]
     [Fashion 9 18.26422 2013-11
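A sketch of the roll-up, using only the three complete rows visible in the question and assuming the six fields are ordinary columns: group on the descriptive columns and sum the numeric ones. Passing `as_index=False` keeps the group keys as columns instead of building a MultiIndex.

```python
import pandas as pd

# The three rows that appear in full in the question.
df = pd.DataFrame(
    {
        "category": ["Love", "Love", "Fashion"],
        "clicks": [0, 183, 0],
        "revenue": [0.36823, 474.81522, 0.19434],
        "date": ["2013-11-04"] * 3,
        "impressions": [380, 374242, 197],
        "size": ["300x250"] * 3,
    }
)

rolled = df.groupby(["category", "date", "size"], as_index=False)[
    ["clicks", "revenue", "impressions"]
].sum()
print(rolled)
```

If the keys were index levels instead of columns, `df.groupby(level=[...])` would group on them in the same way.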

Filling in date gaps in MultiIndex Pandas Dataframe

淺唱寂寞╮ posted on 2019-12-17 10:46:14
Question: I would like to modify a pandas MultiIndex DataFrame such that each index group includes dates within a specified range. I would like each group to fill in missing dates from 2013-06-11 to 2013-12-31 with the value 0 (or NaN).

    Group A  Group B  Date        Value
    loc_a    group_a  2013-06-11     22
                      2013-07-02     35
                      2013-07-09     14
                      2013-07-30      9
                      2013-08-06      4
                      2013-09-03     40
                      2013-10-01     18
             group_b  2013-07-09      4
                      2013-08-06      2
                      2013-09-03      5
             group_c  2013-07-09      1
                      2013-09-03      2
    loc_b    group_a  2013-10-01      3

I've seen a few
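A minimal sketch of one approach, assuming a daily frequency (the question does not state one) and using a four-row subset of the data: build the full index of every existing (Group A, Group B) pair crossed with the date range, then `reindex` with a fill value.

```python
import pandas as pd

df = pd.DataFrame(
    {"Value": [22, 35, 4, 3]},
    index=pd.MultiIndex.from_tuples(
        [
            ("loc_a", "group_a", pd.Timestamp("2013-06-11")),
            ("loc_a", "group_a", pd.Timestamp("2013-07-02")),
            ("loc_a", "group_b", pd.Timestamp("2013-07-09")),
            ("loc_b", "group_a", pd.Timestamp("2013-10-01")),
        ],
        names=["Group A", "Group B", "Date"],
    ),
)

# Daily frequency is an assumption; swap in another freq if needed.
all_dates = pd.date_range("2013-06-11", "2013-12-31", freq="D")

# Only the group pairs that actually occur, not the full cross product.
pairs = df.index.droplevel("Date").unique()
full_index = pd.MultiIndex.from_tuples(
    [(a, b, d) for a, b in pairs for d in all_dates],
    names=df.index.names,
)

# Missing (pair, date) rows are created and filled with 0.
filled = df.reindex(full_index, fill_value=0)
print(filled.head())
```

Dropping `fill_value` leaves the new rows as NaN instead of 0.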