Question
I am getting the following PerformanceWarning:
"PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed-integer,key->block0_values] [items-> ['File1', 'File2', 'File3', 'File4', 'File5']]
warnings.warn(ws, PerformanceWarning)"
This is caused by the "file_attributes" DataFrame (see code below), which holds a mix of floats, integers, strings, and nested dictionaries.
Typical output for store.file_attributes (basically a bunch of dictionary key/value pairs, some of them nested dictionaries):
                          File1 \
duration                  0.2
linescan file             GluAlone-001_Cycle00001_LineProfileData.csv
primary                   {'unit': 'mV', 'divisor': 0.1}
sampling                  20000
secondary                 {'unit': 'pA', 'divisor': 0.0005}
voltage recording file    GluAlone-001_Cycle00001_VoltageRecording_001

                          File2 \
duration                  0.2
linescan file             GluAlone-001_Cycle00002_LineProfileData.csv
primary                   {'unit': 'mV', 'divisor': 0.1}
sampling                  20000
secondary                 {'unit': 'pA', 'divisor': 0.0005}
voltage recording file    GluAlone-001_Cycle00002_VoltageRecording_001

etc.
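To make the mix concrete, here is a minimal sketch that rebuilds the File1 column shown above as a plain dictionary and lists the Python type of each value. The literal types (float for duration, int for sampling, dict for primary and secondary) are read off the printed output and are an assumption, not something checked against the real data; the names file1, types_by_key, and distinct_types are made up for this illustration.

```python
# Values copied from the File1 column shown above; the exact Python types
# are assumed from how the values print, not verified against the real data.
file1 = {
    "duration": 0.2,
    "linescan file": "GluAlone-001_Cycle00001_LineProfileData.csv",
    "primary": {"unit": "mV", "divisor": 0.1},
    "sampling": 20000,
    "secondary": {"unit": "pA", "divisor": 0.0005},
    "voltage recording file": "GluAlone-001_Cycle00001_VoltageRecording_001",
}

# Map each attribute name to the name of its Python type.
types_by_key = {key: type(value).__name__ for key, value in file1.items()}

# Collect the distinct types that end up sharing a single column.
distinct_types = sorted(set(types_by_key.values()))

for key, type_name in types_by_key.items():
    print(f"{key}: {type_name}")
print(distinct_types)
```

If the assumed types are right, a single column carries four different Python types, which would fit the "mixed-integer" inferred type named in the warning.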
The function that writes the store pulls its data from another function, import_folder, which parses a folder of data files:
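import pandas as pd

def convert_folder_hdf5(folder, save_loc=None):
    # Default to writing the .h5 file into the source folder.
    if save_loc is None:
        save_loc = folder
    # Name the file after the last component of the (Windows-style) folder path.
    filename = save_loc + '\\' + folder.split('\\')[-1] + '.h5'
    store = pd.HDFStore(filename, format="table", complevel=9, complib='blosc')
    # import_folder is defined elsewhere; it returns the parsed data keyed by name.
    data = import_folder(folder)
    store['voltage_recording'] = data['voltage recording']
    store['linescan'] = data['linescan']
    # This is the DataFrame named in the PerformanceWarning.
    store['file_attributes'] = pd.DataFrame.from_dict(data['file attributes'])
    #store.close()
    return store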
I'm not sure how to deal with this warning, though. Searching around, I found two posts that looked relevant, but their solutions did not get me anywhere.
Any ideas?
Source: https://stackoverflow.com/questions/25232707/performancewarning-pandas-and-pytables-can-i-fix-this