Converting strings to floats in a DataFrame

无人及你 2020-11-27 12:30

How do I convert a DataFrame column containing strings and NaN values to floats? There is also another column whose values are a mix of strings and floats; how to convert th

6 answers
  • 2020-11-27 12:42

    Here is an example

                                GHI             Temp  Power Day_Type
    2016-03-15 06:00:00 -7.99999952505459e-7    18.3    0   NaN
    2016-03-15 06:01:00 -7.99999952505459e-7    18.2    0   NaN
    2016-03-15 06:02:00 -7.99999952505459e-7    18.3    0   NaN
    2016-03-15 06:03:00 -7.99999952505459e-7    18.3    0   NaN
    2016-03-15 06:04:00 -7.99999952505459e-7    18.3    0   NaN
    

    If these are all string values, as they were in my case, convert the desired columns to floats:

    df_inv_29['GHI'] = df_inv_29.GHI.astype(float)
    df_inv_29['Temp'] = df_inv_29.Temp.astype(float)
    df_inv_29['Power'] = df_inv_29.Power.astype(float)
    

    Your dataframe will now have float values :-)
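
A minimal sketch of the same conversion done in a single call, assuming every value in the three columns is a numeric string. The sample rows below are hypothetical and only mirror the shape of the table above; `astype` also accepts a dict mapping column names to dtypes.

```python
import pandas as pd

# Hypothetical sample mirroring the table above, with every value stored as a string.
df_inv_29 = pd.DataFrame(
    {
        "GHI": ["-7.99999952505459e-7", "-7.99999952505459e-7"],
        "Temp": ["18.3", "18.2"],
        "Power": ["0", "0"],
    },
    index=pd.to_datetime(["2016-03-15 06:00:00", "2016-03-15 06:01:00"]),
)

# A dict passed to astype converts several columns in one call.
df_inv_29 = df_inv_29.astype({"GHI": float, "Temp": float, "Power": float})
print(df_inv_29.dtypes)
```

Columns left out of the dict (such as Day_Type) keep their original dtype.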

  • 2020-11-27 12:46
    df['MyColumnName'] = df['MyColumnName'].astype('float64') 
    
  • 2020-11-27 12:48

    You can try df.column_name = df.column_name.astype(float). NaN values stay NaN after the conversion; if you want them replaced, use the .fillna method to specify the replacement value.

    Example:

    In [12]: df
    Out[12]: 
         a    b
    0  0.1  0.2
    1  NaN  0.3
    2  0.4  0.5
    
    In [13]: df.a.values
    Out[13]: array(['0.1', nan, '0.4'], dtype=object)
    
    In [14]: df.a = df.a.astype(float).fillna(0.0)
    
    In [15]: df
    Out[15]: 
         a    b
    0  0.1  0.2
    1  0.0  0.3
    2  0.4  0.5
    
    In [16]: df.a.values
    Out[16]: array([ 0.1,  0. ,  0.4])
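
The session above can be reproduced with the sketch below. It assumes a hypothetical column `a` holding numeric strings and one missing value, and it separates the two steps to show that `astype(float)` alone already handles NaN, while `fillna` is only needed if the missing values should be replaced.

```python
import numpy as np
import pandas as pd

# Hypothetical column holding numeric strings and one missing value.
df = pd.DataFrame({"a": ["0.1", np.nan, "0.4"]})

# NaN is itself a float, so it passes through the conversion untouched.
converted = df["a"].astype(float)

# Optional second step: replace the missing values with 0.0.
filled = converted.fillna(0.0)

print(converted)
print(filled)
```

Whether 0.0 is a sensible replacement depends on the data; leaving NaN in place is often the safer choice for later statistics.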
    
  • 2020-11-27 12:48

    You have to replace empty strings ('') with np.nan before converting to float, i.e.:

    import numpy as np
    df['a'] = df['a'].replace('', np.nan).astype(float)
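
A possible variation on this idea, assuming the column may also contain whitespace-only strings, which `astype(float)` would reject just like empty ones. The sample data is hypothetical; the regular expression `^\s*$` matches both empty and whitespace-only values.

```python
import numpy as np
import pandas as pd

# Hypothetical column with an empty string and a whitespace-only string.
df = pd.DataFrame({"a": ["0.1", "", "  ", "0.4"]})

# Turn blank entries into NaN first, then convert the column to float.
df["a"] = df["a"].replace(r"^\s*$", np.nan, regex=True).astype(float)

print(df)
```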
    
  • 2020-11-27 12:51

    In newer versions of pandas (0.17 and up), you can use the to_numeric function. It converts an individual column, and you can apply it to a whole DataFrame with df.apply(pd.to_numeric). It also lets you choose how to treat values that can't be converted to numbers:

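    import pandas as pd

    s = pd.Series(['1.0', '2', -3])
    pd.to_numeric(s)                      # every value parses, so the result is float64

    s = pd.Series(['apple', '1.0', '2', -3])
    pd.to_numeric(s, errors='ignore')     # returns the input unchanged (deprecated since pandas 2.2)
    pd.to_numeric(s, errors='coerce')     # values that cannot be parsed become NaN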
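
To make the difference between the error modes concrete, here is a small sketch using the same series as above. The variable names `coerced` and `raised` are illustrative only. With the default `errors='raise'`, an unparseable value raises a ValueError; with `errors='coerce'` it becomes NaN.

```python
import pandas as pd

s = pd.Series(["apple", "1.0", "2", -3])

# Default behaviour (errors='raise'): 'apple' cannot be parsed, so this fails.
try:
    pd.to_numeric(s)
    raised = False
except ValueError:
    raised = True

# errors='coerce': 'apple' becomes NaN and the rest are converted to floats.
coerced = pd.to_numeric(s, errors="coerce")

print(raised)
print(coerced)
```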
    
  • 2020-11-27 12:56

    NOTE: pd.convert_objects has now been deprecated. You should use pd.Series.astype(float) or pd.to_numeric as described in other answers.

    convert_objects is available from pandas 0.11. It forces the conversion, setting values that cannot be converted to NaN, so it works even when astype fails. It also works series by series, so it won't convert, say, a column made up entirely of strings.

    In [10]: df = DataFrame(dict(A = Series(['1.0','1']), B = Series(['1.0','foo'])))
    
    In [11]: df
    Out[11]: 
         A    B
    0  1.0  1.0
    1    1  foo
    
    In [12]: df.dtypes
    Out[12]: 
    A    object
    B    object
    dtype: object
    
    In [13]: df.convert_objects(convert_numeric=True)
    Out[13]: 
       A   B
    0  1   1
    1  1 NaN
    
    In [14]: df.convert_objects(convert_numeric=True).dtypes
    Out[14]: 
    A    float64
    B    float64
    dtype: object
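
Since convert_objects is deprecated, one way to get a comparable result on current pandas is to apply `pd.to_numeric` column by column. This is a sketch on the same two-column frame, not an exact drop-in replacement: with `errors='coerce'`, every column is converted, including one that holds no numeric strings at all (it would become all NaN).

```python
import pandas as pd

df = pd.DataFrame({"A": ["1.0", "1"], "B": ["1.0", "foo"]})

# apply forwards the errors keyword to pd.to_numeric for each column.
converted = df.apply(pd.to_numeric, errors="coerce")

print(converted)
print(converted.dtypes)
```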
    