HDFStore with string columns gives issues

再見小時候 2020-12-31 18:26

I have a pandas DataFrame myDF with a few string columns (whose dtype is object) and many numeric columns. I tried the following:

1 answer
  • 2020-12-31 18:58

    This warning appears ONLY when a single column contains mixed types: not just strings, but strings AND numbers together.

    In [2]: DataFrame({ 'A' : [1.0,'foo'] }).to_hdf('test.h5','df',mode='w')
    pandas/io/pytables.py:2439: PerformanceWarning: 
    your performance may suffer as PyTables will pickle object types that it cannot
    map directly to c-types [inferred_type->mixed,key->block0_values] [items->['A']]
    
      warnings.warn(ws, PerformanceWarning)
    
    In [3]: df = DataFrame({ 'A' : [1.0,'foo'] })
    
    In [4]: df
    Out[4]: 
         A
    0    1
    1  foo
    
    [2 rows x 1 columns]
    
    In [5]: df.dtypes
    Out[5]: 
    A    object
    dtype: object
    
    In [6]: df['A']
    Out[6]: 
    0      1
    1    foo
    Name: A, dtype: object
    
    In [7]: df['A'].values
    Out[7]: array([1.0, 'foo'], dtype=object)
    

    So you need to make sure you don't mix types WITHIN a column.
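
    To find out which columns are affected before writing, something like the following might help. It is only a sketch: `find_mixed_columns` is a hypothetical helper name, not part of pandas, and it relies on `pandas.api.types.infer_dtype` reporting a type that starts with "mixed" for such columns, which matches the `inferred_type->mixed` shown in the warning.

```python
import pandas as pd
from pandas.api.types import infer_dtype


def find_mixed_columns(df):
    """Return names of object columns whose values are not all one type.

    Hypothetical helper for illustration; assumes unique column names.
    """
    mixed = []
    for name in df.columns:
        col = df[name]
        # Only object columns can hold a mix of Python types.
        if col.dtype == object and infer_dtype(col, skipna=True).startswith("mixed"):
            mixed.append(name)
    return mixed


# "A" mixes a float and a string; "B" is all strings; "C" is all floats.
df = pd.DataFrame({"A": [1.0, "foo"], "B": ["x", "y"], "C": [1.5, 2.5]})
mixed_cols = find_mixed_columns(df)
print(mixed_cols)
```

    Only the columns this returns should need converting; columns that hold nothing but strings can be left alone.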

    If you have columns that need conversion, you can do this:

    In [9]: columns = ['A']
    
    In [10]: df.loc[:,columns] = df[columns].applymap(str)
    
    In [11]: df
    Out[11]: 
         A
    0  1.0
    1  foo
    
    [2 rows x 1 columns]
    
    In [12]: df['A'].values
    Out[12]: array(['1.0', 'foo'], dtype=object)
    