How does one store a Pandas DataFrame as an HDF5 PyTables table (or CArray, EArray, etc.)?

Submitted by 北战南征 on 2019-12-07 09:11:54

Question


I have the following pandas dataframe:

import pandas as pd
df = pd.read_csv('filename.csv')

Now, I can use HDFStore to write the df object to file (like adding key-value pairs to a Python dictionary):

store = pd.HDFStore('store.h5')
store['df'] = df

http://pandas.pydata.org/pandas-docs/stable/io.html

When I look at the contents, this object is a frame.

store 

outputs

<class 'pandas.io.pytables.HDFStore'>
File path: store.h5
/df            frame        (shape->[552,23252])

However, in order to use indexing, one should store this as a table object.

My approach was to try HDFStore.put() i.e.

HDFStore.put(key="store.h", value=df, format=Table)

However, this fails with the error:

TypeError: put() missing 1 required positional argument: 'self'

How does one save Pandas Dataframes as PyTables tables?


Answer 1:


Common part: create a new HDFStore file or open an existing one:

store = pd.HDFStore('store.h5')

Try this if you want all columns to be indexed:

store.append('key_name', df, data_columns=True)

or this if you want only a subset of columns to be indexed:

store.append('key_name', df, data_columns=['colA','colC','colN'])

P.S. HDFStore.append() saves DataFrames in table format by default.
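To illustrate the approach above, here is a minimal sketch that appends a small frame and then queries it on disk. The sample data and the temporary file path are assumptions for illustration only (the column names are borrowed from the answer), and it assumes the PyTables package (`tables`) is installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the frame read from the CSV file.
df = pd.DataFrame({
    "colA": [1, 2, 3, 4],
    "colC": [0.5, 1.5, 2.5, 3.5],
    "colN": [10, 20, 30, 40],
})

# Write to a throwaway location so the example leaves nothing behind in the cwd.
path = os.path.join(tempfile.mkdtemp(), "store.h5")

store = pd.HDFStore(path)
try:
    # append() writes in table format; data_columns makes these columns queryable.
    store.append("key_name", df, data_columns=["colA", "colC"])

    # Check that the stored object really is a table rather than a fixed frame.
    is_table = store.get_storer("key_name").is_table

    # Filter on disk using one of the data columns.
    subset = store.select("key_name", where="colA > 2")
finally:
    store.close()
```

Only columns listed in data_columns can appear in a where expression, so colN could not be used as a filter in this sketch.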




Answer 2:


How does one save Pandas Dataframes as PyTables tables?

Adding to the accepted answer, you should always close the PyTables file. For convenience, Pandas lets you use HDFStore as a context manager:

with pd.HDFStore('/path/to/data.hdf') as hdf:
    hdf.put(key="store.h", value=df, format='table', data_columns=True)
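A self-contained sketch of the same pattern follows. Note that put() is called on an HDFStore instance (hdf) rather than on the HDFStore class itself; calling it on the class is what produces the "missing 1 required positional argument: 'self'" error in the question. The sample data, the key "df", and the temporary path are assumptions for illustration, and PyTables is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; any DataFrame would do.
df = pd.DataFrame({"colA": [1, 2, 3], "colC": [0.1, 0.2, 0.3]})

path = os.path.join(tempfile.mkdtemp(), "data.h5")

# The context manager closes the file even if an exception is raised inside.
with pd.HDFStore(path, mode="w") as hdf:
    # format must be the string 'table', not a bare name such as Table.
    hdf.put(key="df", value=df, format="table", data_columns=True)
    filtered = hdf.select("df", where="colA >= 2")

# After the block the store is closed; the data can be read back from disk.
restored = pd.read_hdf(path, key="df")
```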


Source: https://stackoverflow.com/questions/38460744/how-does-one-store-a-pandas-dataframe-as-an-hdf5-pytables-table-or-carray-earr
