Question
After a lot of searching, I couldn't find a simple way to extract data from a .h5 file and load it into a DataFrame with NumPy or Pandas so that I can save it to a .txt or .csv file.
import h5py
import numpy as np
import pandas as pd
filename = r'D:\data.h5'  # raw string so the backslash is kept literally
f = h5py.File(filename, 'r')
# List all groups
print("Keys: %s" % f.keys())
a_group_key = list(f.keys())[0]
# Get the data
data = list(f[a_group_key])
pd.DataFrame(data).to_csv("hi.csv")
Keys: <KeysViewHDF5 ['dd48']>
When I print data, I see the following result:
print(data)
['axis0',
'axis1',
'block0_items',
'block0_values',
'block1_items',
'block1_values']
I would appreciate it if someone could explain what these are and how I can extract the complete data and save it to a .csv file. There doesn't seem to be a routine way to do this, and it is still rather challenging. So far I have only been able to see part of the data via:
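import numpy as np
import pandas as pd
# Reads the raw bytes of the file as floats; this does not parse the HDF5 structure
dfm = np.fromfile(r'D:\data.h5', dtype=float)
print(dfm.shape)
print(dfm[5:])
pd.DataFrame(dfm).to_csv('train.csv')
#dfm.to_csv('hi.csv', sep=',', header=None, index=None)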
What I want is to extract the time_stamps and measurements stored in the .h5 file.
Answer 1:
It looks like that data was written by Pandas, so use pd.read_hdf() to read it.
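A minimal sketch of that suggestion, assuming the file was written by Pandas and that the optional PyTables package (which pd.read_hdf() relies on) is installed. The helper name hdf_to_csv is hypothetical; the key 'dd48' is the one printed in the question.

```python
import pandas as pd


def hdf_to_csv(h5_path, key, csv_path):
    """Read one Pandas object from an HDF5 file and write it out as CSV.

    hdf_to_csv is a hypothetical helper name used only for illustration.
    """
    # read_hdf rebuilds the DataFrame (index and columns included) from the
    # axis*/block* datasets that Pandas stored under the given key.
    df = pd.read_hdf(h5_path, key=key)
    # The index is written as the first CSV column.
    df.to_csv(csv_path)
    return df
```

For the file in the question, the call would look like hdf_to_csv(r'D:\data.h5', 'dd48', 'hi.csv').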
Answer 2:
h5py will access HDF5 datasets as numpy arrays. Your call to get the keys returns a list of the dataset names. Now that you have them, it should be pretty simple to access each one as a numpy array and write it out. You need to get the dtype to know what is in each column so you can format it correctly.
Updated 5/22/2019 to reflect the content of data.h5 posted at the link in the comment.
The default format in np.savetxt() is '%.18e'. Very simple (crude) logic is provided below to modify the format based on dtype for these datasets. General use requires more robust dtype checking and formatting. You will also need to add logic to decode unicode strings.
import h5py
import numpy as np

filename = r'D:\data.h5'  # raw string so the backslash is kept literally
h5f = h5py.File(filename, 'r')
# get a list of the datasets in group 'dd48'
a_dset_keys = list(h5f['dd48'].keys())
# Get the data and write one CSV file per dataset
for dset in a_dset_keys:
    ds_data = h5f['dd48'][dset]
    print('dataset=', dset)
    print(ds_data.dtype)
    # choose a crude output format based on the dataset dtype
    if ds_data.dtype == 'float64':
        csvfmt = '%.18e'
    elif ds_data.dtype == 'int64':
        csvfmt = '%.10d'
    else:
        csvfmt = '%s'
    np.savetxt('output_' + dset + '.csv', ds_data, fmt=csvfmt, delimiter=',')
h5f.close()
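The string-decoding logic mentioned above could be sketched roughly as follows. This assumes the string datasets arrive either as fixed-width byte strings (dtype kind 'S') or as object arrays holding bytes, and that they are UTF-8 encoded; the helper name to_text_array is hypothetical.

```python
import numpy as np


def to_text_array(arr):
    """Decode byte strings to str; return other dtypes unchanged.

    to_text_array is a hypothetical helper; UTF-8 encoding is an assumption.
    """
    arr = np.asarray(arr)
    if arr.dtype.kind == 'S':
        # Fixed-width byte strings, e.g. dtype '|S12'
        return np.char.decode(arr, 'utf-8')
    if arr.dtype.kind == 'O':
        # Object arrays may mix bytes and str; decode only the bytes entries
        decoded = [x.decode('utf-8') if isinstance(x, bytes) else x
                   for x in arr.ravel()]
        return np.array(decoded, dtype=object).reshape(arr.shape)
    return arr
```

In the loop above, the else branch could then pass to_text_array(ds_data) to np.savetxt() with fmt='%s', so that the CSV contains plain text rather than b'...' representations.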
Source: https://stackoverflow.com/questions/56238200/how-can-extract-data-from-h5-file-and-save-it-in-txt-or-csv-properly