PerformanceWarning - Pandas and Pytables, can I fix this?

Question

I am getting the following PerformanceWarning:

"PerformanceWarning: 
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed-integer,key->block0_values] [items->   ['File1', 'File2', 'File3', 'File4', 'File5']]

warnings.warn(ws, PerformanceWarning)"

The warning comes from the "file_attributes" DataFrame (see the code below), whose columns each mix floats, integers, strings, and nested dictionaries:

Typical output for store.file_attributes (basically a bunch of dictionary key/value pairs with some nested dictionaries):

File1 \
duration                  0.2
linescan file             GluAlone-001_Cycle00001_LineProfileData.csv
primary                   {'unit': 'mV', 'divisor': 0.1}
sampling                  20000
secondary                 {'unit': 'pA', 'divisor': 0.0005}
voltage recording file    GluAlone-001_Cycle00001_VoltageRecording_001

File2 \
duration                  0.2
linescan file             GluAlone-001_Cycle00002_LineProfileData.csv
primary                   {'unit': 'mV', 'divisor': 0.1}
sampling                  20000
secondary                 {'unit': 'pA', 'divisor': 0.0005}
voltage recording file    GluAlone-001_Cycle00002_VoltageRecording_001

etc.
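To make the mix concrete, here is a minimal sketch that rebuilds the File1 entry from the output above as a plain dict and lists the Python types it holds. The variable names (`file1`, `type_names`) are illustrative and not part of the original code. A DataFrame column built from values like these can only be stored with the generic object dtype, which is the kind of column the warning refers to.

```python
# Illustrative reconstruction of one entry of data['file attributes'],
# copied from the output shown above. Variable names are assumptions.
file1 = {
    "duration": 0.2,
    "linescan file": "GluAlone-001_Cycle00001_LineProfileData.csv",
    "primary": {"unit": "mV", "divisor": 0.1},
    "sampling": 20000,
    "secondary": {"unit": "pA", "divisor": 0.0005},
    "voltage recording file": "GluAlone-001_Cycle00001_VoltageRecording_001",
}

# Collect the distinct value types that would end up in a single column.
type_names = sorted({type(value).__name__ for value in file1.values()})
print(type_names)  # ['dict', 'float', 'int', 'str']
```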

The function below builds the store. It gets its data from another function of mine, import_folder, which parses a folder of data files:

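import os

import pandas as pd

def convert_folder_hdf5(folder, save_loc=None):
    # Default to writing the .h5 file into the folder being converted.
    if save_loc is None:
        save_loc = folder

    # Name the output file after the folder itself.
    folder_name = os.path.basename(os.path.normpath(folder))
    filename = os.path.join(save_loc, folder_name + '.h5')
    # format is a per-write option (store.put(..., format=...)), not an
    # HDFStore argument, so it is not passed here.
    store = pd.HDFStore(filename, complevel=9, complib='blosc')
    # import_folder is the folder parser mentioned above (not shown).
    data = import_folder(folder)

    store['voltage_recording'] = data['voltage recording']
    store['linescan'] = data['linescan']
    store['file_attributes'] = pd.DataFrame.from_dict(data['file attributes'])

    # store.close()

    return store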

I'm not sure how to deal with this warning, though. I searched around, but the two posts I found that seemed relevant offered solutions that did not get me anywhere.

Any ideas?
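One direction that might be worth trying, sketched under assumptions rather than tested against the real data: expand the nested dictionaries into flat, dotted keys so that no cell holds a dict. The helper `flatten` and the sample `file_attributes` dict below are hypothetical stand-ins for whatever `import_folder` returns.

```python
def flatten(attrs, prefix=""):
    """Return a new dict with nested dicts expanded into dotted keys."""
    flat = {}
    for key, value in attrs.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so 'primary': {'unit': 'mV'} becomes 'primary.unit': 'mV'.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical stand-in for data['file attributes'], using the values shown above.
file_attributes = {
    "File1": {
        "duration": 0.2,
        "primary": {"unit": "mV", "divisor": 0.1},
        "sampling": 20000,
        "secondary": {"unit": "pA", "divisor": 0.0005},
    },
    "File2": {
        "duration": 0.2,
        "primary": {"unit": "mV", "divisor": 0.1},
        "sampling": 20000,
        "secondary": {"unit": "pA", "divisor": 0.0005},
    },
}

flat_attributes = {name: flatten(attrs) for name, attrs in file_attributes.items()}
print(sorted(flat_attributes["File1"]))
```

Passing the result to `pd.DataFrame.from_dict(flat_attributes, orient='index')` should then give one row per file and one column per attribute, so numeric attributes such as duration and sampling no longer share a column with strings. Whether that is enough to silence the warning depends on the storage format and library versions, so treat it as a starting point.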

Source: https://stackoverflow.com/questions/25232707/performancewarning-pandas-and-pytables-can-i-fix-this

Tags

python

performance

pandas

pytables