Pandas has the following examples for how to store Series, DataFrames and Panels in HDF5 files:
The data is written as soon as the statement is executed, e.g. store['df'] = df. The close just closes the actual file (which will be closed for you if the process exits, but will print a warning message).
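As a minimal sketch of the write-then-close behaviour described above (the file name and the sample DataFrame are made up for illustration, and PyTables is assumed to be installed), opening the store in a with block closes the file for you:

```python
import os
import tempfile

import pandas as pd

# Hypothetical file name and data, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "store.h5")
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# The context manager closes the file when the block ends,
# so there is no explicit store.close() to forget.
with pd.HDFStore(path) as store:
    store["df"] = df        # the write happens on this statement
    keys = store.keys()     # node names, each with a leading slash

# Reopen read-only to confirm the node can be read back.
with pd.HDFStore(path, mode="r") as store:
    restored = store["df"]
```

If you prefer not to use with, calling store.close() yourself after the last write has the same effect.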
Read the section http://pandas.pydata.org/pandas-docs/dev/io.html#storing-in-table-format
It is generally not a good idea to put a LOT of nodes in an .h5 file. You probably want to append and create a smaller number of nodes.
You can just iterate through your .csv files and store/append them one by one. Something like:
for f in files:
    df = pd.read_csv(f)
    # to_hdf takes the file path and the node key; the frame is not passed again
    df.to_hdf('file.h5', key=f)
That would be one way (creating a separate node for each file).
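If you would rather keep the number of nodes small, the other option is to append every file to a single table node. The following is only a sketch under some assumptions: the helper name csvs_to_single_table, the node key "data" and the generated sample CSVs are all hypothetical, and every CSV is assumed to have the same columns.

```python
import glob
import os
import tempfile

import pandas as pd


def csvs_to_single_table(csv_paths, h5_path, key="data"):
    """Append each CSV to one table-format node (hypothetical helper)."""
    with pd.HDFStore(h5_path, mode="w") as store:
        for csv_path in csv_paths:
            df = pd.read_csv(csv_path)
            # HDFStore.append writes in table format, so the node can grow
            store.append(key, df)
    return h5_path


# Build two small sample CSV files so the sketch runs on its own.
tmp = tempfile.mkdtemp()
for i in range(2):
    sample = pd.DataFrame({"x": [i, i + 1], "y": [0.5, 1.5]})
    sample.to_csv(os.path.join(tmp, f"part{i}.csv"), index=False)

files = sorted(glob.glob(os.path.join(tmp, "*.csv")))
h5_path = csvs_to_single_table(files, os.path.join(tmp, "file.h5"))
combined = pd.read_hdf(h5_path, "data")
```

Note that the appended rows keep the index each CSV was read with, so index values repeat across files unless you set a meaningful index first. If the files contain string columns, you may also need the min_itemsize argument of append so that longer strings in later files still fit.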
A store (the default fixed format) is not appendable: once you write it, you can only retrieve it all at once, e.g. you cannot select a sub-section.
If you have a table, then you can do things like:
pd.read_hdf('my_store.h5', 'a_table_node', where='index>100')
which is like a database query, only getting part of the data
Thus, a store is not appendable, nor queryable, while a table is both.
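To see the difference side by side, here is a small sketch that writes the same frame once in each format. The node name fixed_node and the sample data are assumptions for illustration; a_table_node and my_store.h5 are reused from the query above.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "my_store.h5")
df = pd.DataFrame({"value": range(200)})

# Same data, written once per format into the same file.
df.to_hdf(path, key="fixed_node", format="fixed")
df.to_hdf(path, key="a_table_node", format="table")

# A table node accepts a where clause and returns only matching rows.
subset = pd.read_hdf(path, "a_table_node", where="index > 100")

# A table node can also be extended after the first write.
df.to_hdf(path, key="a_table_node", format="table", append=True)
total_rows = len(pd.read_hdf(path, "a_table_node"))

# A fixed node rejects the same query and must be read in full.
try:
    pd.read_hdf(path, "fixed_node", where="index > 100")
    fixed_is_queryable = True
except (TypeError, ValueError):
    fixed_is_queryable = False
```

The trade-off is that the table format is usually slower to write and read in full than the fixed format, so it is worth choosing only for nodes you actually need to append to or query.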