I have large pandas DataFrames with financial data. I have no problem appending and concatenating additional columns and DataFrames to my .h5 file.
The financial dat
tohlcv_candle.to_hdf('test.h5',key='this_is_a_key', append=True, mode='r+', format='t')
You need to pass another argument, append=True, to specify that the data is to be appended to existing data if found under that key, instead of overwriting it. Without this, the default is append=False, and if to_hdf encounters an existing table under 'this_is_a_key', it overwrites that table.
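To make the difference concrete, here is a minimal sketch, assuming PyTables is installed. The temporary file, the small `first`/`second` frames, and the `close` column are made up for illustration; only the key name comes from the call above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data standing in for the real candles.
path = os.path.join(tempfile.mkdtemp(), "test.h5")
first = pd.DataFrame({"close": [1.0, 2.0]})
second = pd.DataFrame({"close": [3.0, 4.0]})

# Create the table, then append to it under the same key.
first.to_hdf(path, key="this_is_a_key", format="table")
second.to_hdf(path, key="this_is_a_key", format="table", append=True)
appended = pd.read_hdf(path, "this_is_a_key")

# Same call without append=True: the existing table under the key is replaced.
second.to_hdf(path, key="this_is_a_key", format="table")
overwritten = pd.read_hdf(path, "this_is_a_key")

print(len(appended), len(overwritten))
```

After the appending call the table holds the rows of both frames; after the last call it holds only the rows of `second`.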
The mode= argument applies only at file level: it tells pandas whether the file as a whole is to be overwritten or appended to. One file can hold any number of keys, so a mode='a', append=False setting means only that one key gets overwritten while the other keys stay.
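A small sketch of the file-level versus key-level distinction, again assuming PyTables is installed. The key names `prices` and `volumes` and the data are hypothetical.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "test.h5")
prices = pd.DataFrame({"close": [1.0, 2.0]})
volumes = pd.DataFrame({"volume": [10, 20]})

# Two keys in one file.
prices.to_hdf(path, key="prices", format="table")
volumes.to_hdf(path, key="volumes", format="table")

# mode='a', append=False: only the 'prices' key is replaced.
prices.iloc[:1].to_hdf(path, key="prices", mode="a", format="table", append=False)
with pd.HDFStore(path, mode="r") as store:
    keys_after_a = sorted(store.keys())

# mode='w': the whole file is recreated, so 'volumes' is gone.
prices.to_hdf(path, key="prices", mode="w", format="table")
with pd.HDFStore(path, mode="r") as store:
    keys_after_w = sorted(store.keys())

print(keys_after_a, keys_after_w)
```

So mode decides what happens to the file, while append decides what happens to the one key being written.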
I had a similar experience to yours and found the additional append argument in the reference docs. After setting it, appending works properly for me.
Ref: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_hdf.html
Note: HDF5 won't do anything with the DataFrame's index for you. Appended rows are stored with whatever index values they already had, so you need to sort the index out either before putting the data in or after taking it out.
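One way to read that note, sketched under the assumption that both frames carry a default 0..n-1 index: the appended table then ends up with repeated index values, which can be fixed after reading. The key `candles` and the data are hypothetical, and resetting the index is just one possible cleanup, not necessarily the right one for time-indexed data.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "test.h5")
batch_1 = pd.DataFrame({"close": [1.0, 2.0]})  # index 0, 1
batch_2 = pd.DataFrame({"close": [3.0, 4.0]})  # index 0, 1 again

batch_1.to_hdf(path, key="candles", format="table")
batch_2.to_hdf(path, key="candles", format="table", append=True)

# The stored index is returned unchanged, duplicates included.
raw = pd.read_hdf(path, "candles")

# Cleanup on the way out: discard the old index and renumber the rows.
clean = raw.reset_index(drop=True)

print(raw.index.tolist(), clean.index.tolist())
```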