Question
I have the following pandas dataframe:
import pandas as pd
df = pd.read_csv('filename.csv')
Now, I can use HDFStore to write the df object to file (like adding key-value pairs to a Python dictionary):
store = pd.HDFStore('store.h5')
store['df'] = df
http://pandas.pydata.org/pandas-docs/stable/io.html
When I look at the contents, this object is a frame. Evaluating store outputs:
<class 'pandas.io.pytables.HDFStore'>
File path: store.h5
/df frame (shape->[552,23252])
However, in order to use indexing, one should store this as a table object.
My approach was to try HDFStore.put(), i.e.:
HDFStore.put(key="store.h", value=df, format=Table)
However, this fails with the error:
TypeError: put() missing 1 required positional argument: 'self'
How does one save Pandas Dataframes as PyTables tables?
Answer 1:
Common part: create or open an existing HDFStore file:
store = pd.HDFStore('store.h5')
Try this if you want all columns to be indexed:
store.append('key_name', df, data_columns=True)
or this if you want only a subset of columns to be indexed:
store.append('key_name', df, data_columns=['colA','colC','colN'])
P.S. HDFStore.append() saves DataFrames in table format by default.
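As a rough sketch of what indexed data columns allow, the example below appends a small DataFrame and then filters it on disk with a where expression. The sample data, the temporary file path, and the numeric contents of colA, colC and colN are assumptions made for illustration; it also assumes the PyTables package (tables) is installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the answer above.
df = pd.DataFrame({
    "colA": np.arange(10),
    "colC": np.arange(10) * 0.5,
    "colN": np.arange(10) * 2.0,
})

# Write to a throwaway file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "store.h5")

store = pd.HDFStore(path)
try:
    # append() writes in table format; colA and colC become queryable columns.
    store.append("key_name", df, data_columns=["colA", "colC"])

    # The where clause is evaluated against the file, not in memory.
    subset = store.select("key_name", where="colA > 6")

    # The storer reports whether the node was written as a table.
    is_table = store.get_storer("key_name").is_table
finally:
    store.close()

print(subset)
print(is_table)
```

Only columns listed in data_columns can appear in a where expression, so colN cannot be queried this way in this sketch.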
Answer 2:
How does one save Pandas Dataframes as PyTables tables?
Adding to the accepted answer, you should always close the PyTables file. For convenience, pandas lets you use HDFStore as a context manager:
with pd.HDFStore('/path/to/data.hdf') as hdf:
    hdf.put(key="store.h", value=df, format='table', data_columns=True)
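Note that put() is called here on an open store instance (hdf), not on the HDFStore class itself, which is what raised the TypeError in the question, and that format is passed as the string 'table'. A minimal, self-contained sketch of the same pattern follows. The sample DataFrame, the key "df" and the temporary path are assumptions for illustration, and the PyTables package is assumed to be installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical sample data for the round trip.
df = pd.DataFrame({"x": np.arange(5), "y": np.arange(5) ** 2})
path = os.path.join(tempfile.mkdtemp(), "data.h5")

with pd.HDFStore(path) as hdf:
    # Called on the instance; format must be the string 'table'.
    hdf.put(key="df", value=df, format="table", data_columns=True)
    stored_as_table = hdf.get_storer("df").is_table

# The context manager closes the file on exit.
still_open = hdf.is_open

# Because the data is a table, it can be filtered while reading.
roundtrip = pd.read_hdf(path, key="df", where="x >= 3")

print(stored_as_table, still_open)
print(roundtrip)
```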
Source: https://stackoverflow.com/questions/38460744/how-does-one-store-a-pandas-dataframe-as-an-hdf5-pytables-table-or-carray-earr