Converting string objects to int/float using pandas

时光取名叫无心 2021-02-04 01:18
import pandas as pd

path1 = "/home/supertramp/Desktop/100&life_180_data.csv"

mydf = pd.read_csv(path1)

numcigar = {"Never":0 ,"1-5 Cigarettes/day" :1,"10-


        
2 Answers
  •  礼貌的吻别
    2021-02-04 02:02

    OK, the first problem is that your values contain embedded spaces, so they do not match the dictionary keys and the function is applied incorrectly.

    Fix this using the vectorised str accessor:

    mydf['Cigarettes'] = mydf['Cigarettes'].str.replace(' ', '')
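
Note that removing every space also turns "1-5 Cigarettes/day" into "1-5Cigarettes/day", so the dictionary keys would need the same treatment. A minimal sketch of an alternative, assuming the stray whitespace sits at the ends of the values or is doubled between words (the sample values below are made up, not taken from the CSV):

```python
import pandas as pd

# Hypothetical sample data with leading, trailing and doubled spaces.
cigs = pd.Series(
    [" Never", "1-5  Cigarettes/day ", "10-20 Cigarettes/day"],
    dtype=object,
)

# Trim both ends, then collapse any run of whitespace to a single space,
# so the values still match keys such as "1-5 Cigarettes/day".
cleaned = cigs.str.strip().str.replace(r"\s+", " ", regex=True)

print(cleaned.tolist())
```

If the real data only has leading or trailing spaces, `.str.strip()` on its own is probably enough.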
    

    Now creating your new column should just work:

    mydf['CigarNum'] = mydf['Cigarettes'].apply(numcigar.get).astype(float)
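
To illustrate what this line does with a value that is missing from the dictionary, here is a small sketch on made-up data. The "Unknown" entry is an assumption added for the example; `Series.map` is shown alongside because it accepts the dictionary directly:

```python
import pandas as pd

numcigar = {"Never": 0, "1-5 Cigarettes/day": 1, "10-20 Cigarettes/day": 4}

# Made-up values; "Unknown" is deliberately not a key in numcigar.
cigs = pd.Series(["Never", "10-20 Cigarettes/day", "Unknown"], dtype=object)

# dict.get returns None for a missing key; astype(float) turns that into NaN.
cigar_num = cigs.apply(numcigar.get).astype(float)

# Series.map with a dict also yields NaN for values that are not keys.
mapped = cigs.map(numcigar)

print(cigar_num.tolist())
```

Either way, values that do not match a key end up as NaN rather than raising, so it is worth checking for unexpected NaNs afterwards.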
    

    UPDATE

    Thanks to @Jeff as always for pointing out superior ways to do things:

    So you can call replace instead of calling apply:

    mydf['CigarNum'] = mydf['Cigarettes'].replace(numcigar)
    # now convert the types
    mydf['CigarNum'] = mydf['CigarNum'].convert_objects(convert_numeric=True)
    

    You can also use the factorize method.
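
A brief sketch of what `pd.factorize` returns, on made-up values. Keep in mind that it numbers the categories in order of first appearance, so it cannot by itself produce a custom coding such as 4 for "10-20 Cigarettes/day":

```python
import pandas as pd

# Made-up sample values for illustration.
cigs = pd.Series(
    ["Never", "10-20 Cigarettes/day", "Never", "1-5 Cigarettes/day"],
    dtype=object,
)

# codes[i] is the integer label of cigs[i];
# uniques holds the distinct values in order of first appearance.
codes, uniques = pd.factorize(cigs)

print(codes.tolist())
print(list(uniques))
```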

    Thinking about it, why not just make the dict values floats in the first place, so that you avoid the type conversion altogether?

    So:

    numcigar = {"Never": 0.0, "1-5 Cigarettes/day": 1.0, "10-20 Cigarettes/day": 4.0}
    

    Version 0.17.0 or newer

    convert_objects has been deprecated since 0.17.0 (and was removed in pandas 1.0); it has been replaced by to_numeric:

    mydf['CigarNum'] = pd.to_numeric(mydf['CigarNum'], errors='coerce')
    

    Here errors='coerce' returns NaN wherever a value cannot be converted to a number; without it, to_numeric raises an exception.
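
The difference between the two behaviours can be seen in a short sketch on made-up strings (the sample values are assumptions for illustration):

```python
import pandas as pd

# Made-up values; the last one cannot be parsed as a number.
raw = pd.Series(["0", "1.5", "not a number"], dtype=object)

# With errors="coerce", unparseable entries become NaN.
coerced = pd.to_numeric(raw, errors="coerce")

# With the default errors="raise", the same input raises ValueError.
try:
    pd.to_numeric(raw)
    raised = False
except ValueError:
    raised = True

print(coerced.tolist(), raised)
```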
