Find all columns of dataframe in Pandas whose type is float, or a particular type?

旧街凉风 submitted on 2019-11-28 18:28:32

Question


I have a dataframe, df, that has some columns of type float64, while the others are of object. Due to the mixed nature, I cannot use

df.fillna('unknown') #getting error "ValueError: could not convert string to float:"

because the error is raised for the columns whose type is float64 (what a misleading error message!).

So I wish I could do something like

for col in df.columns[<dtype == object>]:
    df[col] = df[col].fillna("unknown")

So my question is whether there is a filter expression like this that I can use with df.columns.

I guess alternatively, less elegantly, I could do:

for col in df.columns:
    if df[col].dtype == object:  # object dtype
        df[col] = df[col].fillna('')
        # Still puzzled: only the empty string works as a replacement; 'unknown' fails for certain values with "ValueError: Error parsing datetime string "unknown" at position 0"

I would also like to know why, when '' is replaced with 'unknown' in the code above, it works for some cells but fails on one cell with the error "ValueError: Error parsing datetime string "unknown" at position 0".

Thanks a lot!

Yu


Answer 1:


You can see what the dtype is for all the columns using the dtypes attribute:

In [11]: df = pd.DataFrame([[1, 'a', 2.]])

In [12]: df
Out[12]: 
   0  1  2
0  1  a  2

In [13]: df.dtypes
Out[13]: 
0      int64
1     object
2    float64
dtype: object

In [14]: df.dtypes == object
Out[14]: 
0    False
1     True
2    False
dtype: bool

To access the object columns:

In [15]: df.loc[:, df.dtypes == object]
Out[15]: 
   1
0  a

I think it's most explicit to use (I'm not sure that inplace would work here):

In [16]: df.loc[:, df.dtypes == object] = df.loc[:, df.dtypes == object].fillna('')

That said, I recommend you use NaN for missing data.
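The boolean mask from df.dtypes == object can also be used to filter df.columns, which is close to the filter expression the question asks for. Below is a minimal sketch, assuming a small made-up frame; the column names "name" and "score" and the variable object_cols are illustrative and not from the original post. The string column is created with dtype=object explicitly, because newer pandas versions may infer a dedicated string dtype instead.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: one object column and one float64 column.
df = pd.DataFrame({
    "name": pd.Series(["a", None, "c"], dtype=object),
    "score": [1.0, np.nan, 3.0],
})

# df.dtypes == object is a boolean Series indexed by column name,
# so it can be used to filter df.columns.
object_cols = df.columns[df.dtypes == object]

# Fill missing values only in the object columns; float columns keep NaN.
for col in object_cols:
    df[col] = df[col].fillna("unknown")

print(df)
```

The float64 column is left untouched, so its missing value stays NaN.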




Answer 2:


This is more concise:

import numpy as np

# select the float columns (the np.float alias was removed in NumPy 1.24)
df_float = df.select_dtypes(include=['float64'])
# select non-numeric columns
df_non_num = df.select_dtypes(exclude=[np.number])
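To tie select_dtypes back to the original fillna problem, here is a minimal sketch, assuming a small made-up frame; the column names "label", "count", "ratio" and the variables float_cols and non_numeric_cols are illustrative only. It uses the string "float64" rather than np.float, since that alias was removed in NumPy 1.24.

```python
import numpy as np
import pandas as pd

# Hypothetical example data with object, integer and float columns.
df = pd.DataFrame({
    "label": pd.Series(["a", None], dtype=object),
    "count": [1, 2],
    "ratio": [0.5, np.nan],
})

# Column names of the float64 columns.
float_cols = df.select_dtypes(include=["float64"]).columns

# Column names of everything that is not numeric.
non_numeric_cols = df.select_dtypes(exclude=["number"]).columns

# Fill missing values in the non-numeric columns only.
df[non_numeric_cols] = df[non_numeric_cols].fillna("unknown")

print(list(float_cols))
print(list(non_numeric_cols))
print(df)
```

Taking .columns from the selected sub-frame gives the names, which can then be used to assign back into the original frame.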


Source: https://stackoverflow.com/questions/21720022/find-all-columns-of-dataframe-in-pandas-whose-type-is-float-or-a-particular-typ
