I have a pandas DataFrame myDF with a few string columns (whose dtype is object) and many numeric columns. I tried the following:
This warning ONLY happens if you have mixed types IN a single column: not just strings, but strings AND numbers together.
In [2]: DataFrame({ 'A' : [1.0,'foo'] }).to_hdf('test.h5','df',mode='w')
pandas/io/pytables.py:2439: PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed,key->block0_values] [items->['A']]
warnings.warn(ws, PerformanceWarning)
In [3]: df = DataFrame({ 'A' : [1.0,'foo'] })
In [4]: df
Out[4]:
     A
0    1
1  foo
[2 rows x 1 columns]
In [5]: df.dtypes
Out[5]:
A object
dtype: object
In [6]: df['A']
Out[6]:
0 1
1 foo
Name: A, dtype: object
In [7]: df['A'].values
Out[7]: array([1.0, 'foo'], dtype=object)
So you need to ensure that you don't mix types WITHIN a column.
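To find which columns are affected, one option is to count the distinct Python types held in each object column. This is only a minimal sketch: the helper name `mixed_type_columns` and the sample frame are made up for illustration, and a missing value (NaN is a float) counts as its own type here.

```python
import pandas as pd

def mixed_type_columns(df):
    """List object-dtype columns whose values have more than one Python type."""
    mixed = []
    for name, dtype in df.dtypes.items():
        # Only object columns can hold arbitrary Python values.
        if dtype == object and df[name].map(type).nunique() > 1:
            mixed.append(name)
    return mixed

# Hypothetical sample: A mixes float and str, B is all strings, C is numeric.
sample = pd.DataFrame({"A": [1.0, "foo"], "B": ["x", "y"], "C": [1.5, 2.5]})
print(mixed_type_columns(sample))  # prints ['A']
```

Columns that come back from this check are the ones worth converting before calling to_hdf.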
If you have columns that need conversion, you can do this:
In [9]: columns = ['A']
In [10]: df.loc[:,columns] = df[columns].applymap(str)
In [11]: df
Out[11]:
     A
0  1.0
1  foo
[2 rows x 1 columns]
In [12]: df['A'].values
Out[12]: array(['1.0', 'foo'], dtype=object)
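On newer pandas versions, `DataFrame.applymap` is deprecated (since 2.1) in favor of `DataFrame.map`. A version-independent alternative is `astype(str)`, sketched below on a hypothetical frame; the column list is an assumption you would replace with your own.

```python
import pandas as pd

# Hypothetical frame: A mixes a float and a string, B is purely numeric.
df = pd.DataFrame({"A": [1.0, "foo"], "B": [1.5, 2.5]})
columns = ["A"]  # columns to store as text

# Cast only the listed columns to strings; B is left untouched.
df[columns] = df[columns].astype(str)

print(df["A"].tolist())
```

Casting only the listed columns keeps the numeric columns in their native dtypes, so they are not affected by the conversion.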