How to copy a dataset object to a different hdf5 file using pytables or h5py?

Submitted by 僤鯓⒐⒋嵵緔 on 2020-08-09 07:59:51

Question


I have selected specific HDF5 datasets and want to copy them to a new HDF5 file. I found some tutorials on copying between two existing files, but what if you have just created a new file and want to copy datasets into it? I thought the approach below would work, but it doesn't. Is there a simple way to do this?

>>> dic_oldDataset['old_dataset']
<HDF5 dataset "old_dataset": shape (333217,), type "|V14">

>>> new_file = h5py.File('new_file.h5', 'a')
>>> new_file.create_group('new_group')

>>> new_file['new_group']['new_dataset'] = dic_oldDataset['old_dataset']


RuntimeError: Unable to create link (interfile hard links are not allowed)

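Answer 1 (using h5py):
The assignment in the question fails because assigning a dataset object to a name in a group creates a hard link, and HDF5 hard links cannot span files; the data has to be copied instead. This example creates a simple structured array and uses it to populate a dataset in the first file. The data is then read from that dataset via my_array and written to a new dataset in the second file.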

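import h5py
import numpy as np

# Small structured array used as sample data
arr = np.array([(1, 'a'), (2, 'b')],
               dtype=[('foo', int), ('bar', 'S1')])
print(arr.dtype)

# Write the array to a dataset in the first file
h5file1 = h5py.File('test1.h5', 'w')
h5file1.create_dataset('/ex_group1/ex_ds1', data=arr)
print(h5file1)

# Read the dataset's contents into an in-memory NumPy array
my_array = h5file1['/ex_group1/ex_ds1'][:]

# Write that array to a new dataset in the second file
h5file2 = h5py.File('test2.h5', 'w')
h5file2.create_dataset('/exgroup2/ex_ds2', data=my_array)
print(h5file2)

h5file1.close()
h5file2.close()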
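
As an alternative to reading the data into memory first, h5py also provides a copy method on files and groups that accepts a destination group in another file. The following is a minimal sketch under that assumption, not the original answer's method; the temporary directory and the variable names (src, dst, dst_group, copied) are hypothetical and chosen only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Sample structured array, same layout as in the answer above
arr = np.array([(1, b'a'), (2, b'b')],
               dtype=[('foo', int), ('bar', 'S1')])

# Hypothetical scratch directory so the sketch does not touch existing files
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'test1.h5')
dst_path = os.path.join(tmp_dir, 'test2.h5')

with h5py.File(src_path, 'w') as src, h5py.File(dst_path, 'w') as dst:
    src.create_dataset('/ex_group1/ex_ds1', data=arr)

    # Destination group in the second file
    dst_group = dst.create_group('exgroup2')

    # Copy the dataset (data and attributes) into the other file's group
    src.copy('/ex_group1/ex_ds1', dst_group, name='ex_ds2')

# Reopen the destination file and read the copy back
with h5py.File(dst_path, 'r') as dst:
    copied = dst['/exgroup2/ex_ds2'][:]

print(copied)
```

Because the library performs the copy, the dataset's dtype and attributes carry over without being rebuilt by hand.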



Answer 2 (using pytables):
This follows the same process as above using PyTables functions. It creates the same simple structured array and uses it to populate a table in the first file. The data is then read from that table into my_array and written to a new table in the second file.

import tables
import numpy as np

# Small structured array used as sample data
arr = np.array([(1, 'a'), (2, 'b')],
               dtype=[('foo', int), ('bar', 'S1')])
print(arr.dtype)

# Create a group and a table in the first file, populated from the array
h5file1 = tables.open_file('test1.h5', mode='w', title='Test file')
my_group = h5file1.create_group('/', 'ex_group1', 'Example Group')
my_table = h5file1.create_table(my_group, 'ex_ds1', None, 'Example dataset', obj=arr)
print(h5file1)

# Read the whole table into an in-memory NumPy array
my_array = my_table.read()

# Write that array to a new table in the second file;
# createparents=True creates /exgroup2 if it does not exist
h5file2 = tables.open_file('test2.h5', mode='w', title='Test file')
h5file2.create_table('/exgroup2', 'ex_ds2', createparents=True, obj=my_array)
print(h5file2)

h5file1.close()
h5file2.close()
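
PyTables can also copy a node directly, without an intermediate read, through File.copy_node, passing a Group instance from the second file as newparent. The sketch below assumes that cross-file usage and is not part of the original answer; the temporary directory and the names src, dst, dst_group and copied are hypothetical.

```python
import os
import tempfile

import numpy as np
import tables

# Sample structured array, same layout as in the answer above
arr = np.array([(1, b'a'), (2, b'b')],
               dtype=[('foo', int), ('bar', 'S1')])

# Hypothetical scratch directory so the sketch does not touch existing files
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'test1.h5')
dst_path = os.path.join(tmp_dir, 'test2.h5')

with tables.open_file(src_path, mode='w') as src, \
        tables.open_file(dst_path, mode='w') as dst:
    src.create_table('/ex_group1', 'ex_ds1', obj=arr, createparents=True)

    # Destination group in the second file
    dst_group = dst.create_group('/', 'exgroup2')

    # Copy the table node into the other file's group under a new name
    src.copy_node('/ex_group1/ex_ds1', newparent=dst_group, newname='ex_ds2')

# Reopen the destination file and read the copy back
with tables.open_file(dst_path, mode='r') as dst:
    copied = dst.root.exgroup2.ex_ds2.read()

print(copied)
```

Passing a Group object rather than a path string for newparent is what directs the copy into the second file, since a path string would be resolved in the source file.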


Source: https://stackoverflow.com/questions/53455713/how-to-copy-a-dataset-object-to-a-different-hdf5-file-using-pytables-or-h5py
