Converting string objects to int/float using pandas

时光取名叫无心 2021-02-04 01:18
import pandas as pd

path1 = "/home/supertramp/Desktop/100&life_180_data.csv"

mydf = pd.read_csv(path1)

numcigar = {"Never":0 ,"1-5 Cigarettes/day" :1,"10-


        
2 Answers
  • 2021-02-04 01:54

    Try using this function for all problems of this kind:

    import numpy as np
    import pandas as pd

    def get_series_ids(x):
        '''Return a pandas Series of integer ids, one per distinct
           value in the input pandas Series x.
           Example:
           get_series_ids(pd.Series(['a','a','b','b','c']))
           returns Series([0,0,1,1,2], dtype=int)'''

        # Sorted distinct values; each one gets its position as its id.
        values = np.unique(x)
        values2nums = dict(zip(values, range(len(values))))
        # map() looks every element up in the dict and yields an integer Series.
        return x.map(values2nums)
    
  • 2021-02-04 02:02

    OK, the first problem is that you have embedded spaces, which cause the function to be applied incorrectly.

    Fix this using the vectorised str methods:

    mydf['Cigarettes'] = mydf['Cigarettes'].str.replace(' ', '')
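
One caveat about the replace call above: it removes every space, including the one inside a key such as "1-5 Cigarettes/day". If the stray spaces only sit at the start or end of each value (an assumption, since the original data is not shown), str.strip() may be the safer choice. A minimal sketch with made-up sample values:

```python
import pandas as pd

# Hypothetical sample data; the leading/trailing spaces are an assumption.
cigarettes = pd.Series([" Never", "1-5 Cigarettes/day ", "Never  "])

# str.strip() removes only the surrounding whitespace, so values that contain
# an internal space still match dict keys such as "1-5 Cigarettes/day".
stripped = cigarettes.str.strip()

# str.replace(" ", "") removes every space, including the internal one.
no_spaces = cigarettes.str.replace(" ", "", regex=False)

print(stripped.tolist())
print(no_spaces.tolist())
```

If you do remove all spaces, the dict keys need to be written without spaces as well.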
    

    Now creating your new column should just work:

    mydf['CigarNum'] = mydf['Cigarettes'].apply(numcigar.get).astype(float)
    

    UPDATE

    Thanks to @Jeff as always for pointing out superior ways to do things:

    So you can call replace instead of calling apply:

    mydf['CigarNum'] = mydf['Cigarettes'].replace(numcigar)
    # now convert the types
    mydf['CigarNum'] = mydf['CigarNum'].convert_objects(convert_numeric=True)
    

    You can also use the factorize method.
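
As a rough illustration of the factorize route, the sketch below uses pd.factorize on made-up sample values. Note that the codes follow the order in which values first appear, so they will not necessarily match the numbers chosen in numcigar.

```python
import pandas as pd

# Hypothetical sample column, for illustration only.
cigarettes = pd.Series(
    ["Never", "1-5 Cigarettes/day", "Never", "10-20 Cigarettes/day"]
)

# factorize returns an integer code per element plus the distinct values;
# codes are assigned in order of first appearance.
codes, uniques = pd.factorize(cigarettes)

print(codes)
print(list(uniques))
```

This is handy when any consistent numbering will do; when specific numbers matter, an explicit dict stays the clearer option.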

    Thinking about it, why not just set the dict values to be floats anyway, so that you avoid the type conversion?

    So:

    numcigar = {"Never":0.0 ,"1-5 Cigarettes/day" :1.0,"10-20 Cigarettes/day":4.0}
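
To show the effect of float values in the dict, here is a small sketch on a made-up DataFrame. It uses Series.map rather than the apply or replace calls shown earlier (a swapped-in alternative, not what the answer used); with map, any value missing from the dict becomes NaN.

```python
import pandas as pd

numcigar = {"Never": 0.0, "1-5 Cigarettes/day": 1.0, "10-20 Cigarettes/day": 4.0}

# Hypothetical sample data; "Unknown" is deliberately not a key in numcigar.
mydf = pd.DataFrame(
    {"Cigarettes": ["Never", "10-20 Cigarettes/day", "1-5 Cigarettes/day", "Unknown"]}
)

# Look each value up in the dict; the result is already float64,
# so no separate type conversion step is needed.
mydf["CigarNum"] = mydf["Cigarettes"].map(numcigar)

print(mydf["CigarNum"].dtype)
print(mydf)
```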
    

    Version 0.17.0 or newer

    convert_objects has been deprecated since 0.17.0 and has been replaced by to_numeric:

    mydf['CigarNum'] = pd.to_numeric(mydf['CigarNum'], errors='coerce')
    

    Here errors='coerce' returns NaN where a value cannot be converted to a numeric value; without it, an exception is raised.
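
The difference between the two behaviours can be sketched as follows, using a made-up Series that contains one value that cannot be parsed:

```python
import pandas as pd

# Hypothetical sample data with one unparseable entry.
raw = pd.Series(["0", "1", "4", "not a number"])

# errors="coerce" turns unparseable entries into NaN.
coerced = pd.to_numeric(raw, errors="coerce")

# The default (errors="raise") raises on the first unparseable entry.
try:
    pd.to_numeric(raw)
    raised = False
except ValueError:
    raised = True

print(coerced)
print("default raised an exception:", raised)
```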
