Adding new data into HDF5 file results in an empty array

Posted by 点点圈 on 2021-01-28 19:00:59

Question


While playing with the HDF5 package for Python (h5py) I ran into some strange behavior. I want to append more data to an existing dataset, but I cannot get it to work properly. As you can see from the source code, I read the number of rows under key 'X' using fromRow = hf["X"].shape[0] and then write tempArray2 starting at that row. The result is that the appended rows stay empty (all zeros).

import h5py

tempArray1 = [[0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443]]
tempArray2 = [[3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14]]

with h5py.File('data.hdf5', 'w') as hf:
    # Add data to new file
    dset = hf.create_dataset("X", data=tempArray1, compression="gzip", chunks=True, maxshape=(None,3), dtype='f4') # Size is as the size of tempArray1
    print(hf["X"].shape[0])

    # Append data to the existing file
    hf["X"].resize((hf["X"].shape[0] + 10, 3)) # New size is the size of X + 10
    print(hf["X"].shape[0])
    fromRow = hf["X"].shape[0]
    hf["X"][fromRow:] = tempArray2

This is how it looks:

Key: X
Data:
 [[ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]]
Length of data: 20

Strangely enough, when I hard-code the value with fromRow = 10, which is the end of the existing data, it works.

Output:

Key: X
Data:
 [[ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]]
Length of data: 20

Any idea what I am doing wrong?


Answer 1:


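You are getting fromRow AFTER you resize the X dataset, so it is 20 instead of 10. The slice hf["X"][fromRow:] then selects no rows, and nothing is written. You need the value BEFORE the resize. See the code below.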

import h5py

# tempArray1 and tempArray2 are defined as in the question.
with h5py.File('data.hdf5', 'w') as hf:
    # Add data to a new file
    dset = hf.create_dataset("X", data=tempArray1, compression="gzip", chunks=True, maxshape=(None,3), dtype='f4') # Same size as tempArray1
    print(hf["X"].shape[0])

    # New location to get fromRow: read the row count BEFORE resizing
    fromRow = hf["X"].shape[0]

    # Append data to the existing dataset
    hf["X"].resize((hf["X"].shape[0] + 10, 3)) # Grow X by 10 rows
    print(hf["X"].shape[0])
    hf["X"][fromRow:] = tempArray2
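
To make the ordering harder to get wrong, the append step can be wrapped in a small helper that records the old row count before it resizes. The following is only a minimal sketch and not part of the original answer: append_rows, the temporary file path, and the placeholder values 1.0 and 2.0 are hypothetical, chosen for illustration. It assumes a chunked dataset created with maxshape=(None, 3), as in the question.

```python
import os
import tempfile

import h5py
import numpy as np


def append_rows(dset, rows):
    """Append rows to a resizable 2-D dataset; return the index of the first new row.

    Hypothetical helper for illustration, not an h5py API.
    """
    rows = np.asarray(rows, dtype=dset.dtype)
    start = dset.shape[0]                        # row count BEFORE the resize
    dset.resize(start + rows.shape[0], axis=0)   # grow along the first axis only
    dset[start:] = rows                          # fill exactly the new rows
    return start


# Placeholder data standing in for tempArray1 and tempArray2.
path = os.path.join(tempfile.mkdtemp(), "append_demo.hdf5")
with h5py.File(path, "w") as hf:
    dset = hf.create_dataset(
        "X", data=np.ones((10, 3)), compression="gzip",
        chunks=True, maxshape=(None, 3), dtype="f4",
    )
    first_new_row = append_rows(dset, np.full((10, 3), 2.0))
    result = dset[:]  # read everything back while the file is still open

print(first_new_row, result.shape)
```

Because the helper derives the new size from the incoming rows, the hard-coded + 10 is no longer needed, and the same call works for batches of any length.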


Source: https://stackoverflow.com/questions/60562311/adding-new-data-into-hdf5-file-results-an-empty-array
