Python 2.7: Appending Data to Table in Pandas

假如想象 submitted on 2021-02-08 09:29:14

Question


I am reading data from image files and I want to append this data into a single HDF file. Here is my code:

datafile = pd.HDFStore(os.path.join(path, 'imageData.h5'))
for file in fileList:
    data = {'X Position': pd.Series(xpos, index=index1),
            'Y Position': pd.Series(ypos, index=index1),
            'Major Axis Length': pd.Series(major, index=index1),
            'Minor Axis Length': pd.Series(minor, index=index1),
            'X Velocity': pd.Series(xVelocity, index=index1),
            'Y Velocity': pd.Series(yVelocity, index=index1)}
    df = pd.DataFrame(data)
    datafile['df'] = df
datafile.close()

This is obviously incorrect as it overwrites each set of data with the new one each time the loop runs.

If instead of datafile['df'] = df, I use

datafile.append('df',df)    

OR

df.to_hdf(os.path.join(path,'imageData.h5'), 'df', append=True, format = 'table')

I get the error:

ValueError: Can only append to Tables

I have referred to the documentation and other SO questions, to no avail.

So, I am hoping someone can explain why this isn't working and how I can successfully append all the data to one file. I am willing to use a different method (perhaps pyTables) if necessary.

Any help would be greatly appreciated.


Answer 1:


This will work in pandas 0.11. Once you create a group (i.e. the label under which the data is stored, 'df' here), the format it was written in determines what you can do with it. If you stored it in fixed format, writing to it again overwrites it, and trying to append gives the error message above. If you wrote it in table format, you can append. Note that in 0.11, to_hdf does not correctly pass keywords through to the underlying function, so you can use it ONLY to write a fixed format.

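import os
import pandas as pd

# mode='w' starts from an empty file, so no old fixed-format 'df' is left over
datafile = pd.HDFStore(os.path.join(path, 'imageData.h5'), mode='w')
for file in fileList:
    # xpos, ypos, major, ... and index1 are computed from `file` (not shown)
    data = {'X Position': pd.Series(xpos, index=index1),
            'Y Position': pd.Series(ypos, index=index1),
            'Major Axis Length': pd.Series(major, index=index1),
            'Minor Axis Length': pd.Series(minor, index=index1),
            'X Velocity': pd.Series(xVelocity, index=index1),
            'Y Velocity': pd.Series(yVelocity, index=index1)}
    df = pd.DataFrame(data)
    # append() writes table format, so each iteration adds rows to 'df'
    datafile.append('df', df)
datafile.close()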
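
For anyone who wants to try the fixed-versus-table behaviour without the image data, here is a minimal self-contained sketch. It assumes a recent pandas with PyTables installed (not the 0.11 release discussed above). The helper names (make_frame, append_frames, append_with_to_hdf, append_to_fixed_fails) and the synthetic values are made up for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd


def make_frame(n_rows, start):
    """Synthetic stand-in for the measurements taken from one image."""
    index = np.arange(start, start + n_rows)
    return pd.DataFrame(
        {"X Position": np.arange(n_rows, dtype=float),
         "Y Position": np.arange(n_rows, dtype=float) * 2.0},
        index=index,
    )


def append_frames(h5_path, frames, key="df"):
    """Append every frame under one key with HDFStore.append."""
    # mode="w" starts from an empty file; the with-block closes the store
    with pd.HDFStore(h5_path, mode="w") as store:
        for frame in frames:
            store.append(key, frame)  # append() creates/extends a table node
    return pd.read_hdf(h5_path, key)


def append_with_to_hdf(h5_path, frames, key="df"):
    """Same idea through DataFrame.to_hdf (assumes h5_path does not exist yet)."""
    for frame in frames:
        frame.to_hdf(h5_path, key=key, mode="a", append=True, format="table")
    return pd.read_hdf(h5_path, key)


def append_to_fixed_fails(h5_path, frame, key="df"):
    """Store a fixed-format node, then try to append to it."""
    with pd.HDFStore(h5_path, mode="w") as store:
        store[key] = frame  # item assignment stores fixed format by default
        try:
            store.append(key, frame)
        except ValueError as exc:
            return str(exc)  # the error message from the question
    return None
```

Calling append_frames(path, [make_frame(3, 0), make_frame(2, 3)]) should give back a single five-row DataFrame, while append_to_fixed_fails returns the text of the ValueError instead of appending. If the file already contains a fixed-format 'df' from an earlier run, either delete the file or open it with mode='w' before appending.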


Source: https://stackoverflow.com/questions/22035360/python-2-7-appending-data-to-table-in-pandas
