NumPy or Pandas: Keeping array type as integer while having a NaN value

粉色の甜心 2020-11-22 06:05

Is there a preferred way to keep the data type of a numpy array fixed as int (or int64 or whatever), while still having an element ins

8 Answers
  •  礼貌的吻别
    2020-11-22 06:17

    If the text data contains blanks, columns that would normally be integers are cast to float64, because the int64 dtype cannot represent nulls. This can produce an inconsistent schema when you load multiple files: those with blanks end up as float64, while those without end up as int64.

    The code below attempts to convert every numeric column to Int64 (capital "I", the pandas nullable integer dtype, as opposed to NumPy's int64), since Int64 can hold nulls.

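    import pandas as pd
    import numpy as np

    # mydf is assumed to be an existing DataFrame, e.g. loaded with pd.read_csv
    # show dtypes before the transformation
    print(mydf.dtypes)

    for c in mydf.select_dtypes(np.number).columns:
        try:
            mydf[c] = mydf[c].astype('Int64')
            print('cast {} to Int64'.format(c))
        except (TypeError, ValueError):
            # e.g. a float column holding non-integer values cannot be cast safely
            print('could not cast {} to Int64'.format(c))

    # show dtypes after the transformation
    print(mydf.dtypes)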
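
    To make the schema problem concrete, here is a minimal, self-contained sketch. The two CSV snippets and the names csv_full, csv_blank, df_full and df_blank are made up for illustration, and it assumes a pandas version with the nullable Int64 dtype (0.24 or later). It shows one blank turning a column into float64, and the cast to Int64 bringing both frames back to the same dtype.

```python
import io

import pandas as pd

# Two hypothetical "files": the second has a blank in column "b".
csv_full = "a,b\n1,2\n3,4\n"
csv_blank = "a,b\n1,\n3,4\n"

df_full = pd.read_csv(io.StringIO(csv_full))
df_blank = pd.read_csv(io.StringIO(csv_blank))

# Default inference: the blank forces "b" to float64 in one frame only.
dtypes_before = (str(df_full["b"].dtype), str(df_blank["b"].dtype))
print(dtypes_before)

# Casting to the nullable Int64 extension dtype aligns the two schemas;
# the blank becomes a missing value instead of forcing a float column.
df_full["b"] = df_full["b"].astype("Int64")
df_blank["b"] = df_blank["b"].astype("Int64")
dtypes_after = (str(df_full["b"].dtype), str(df_blank["b"].dtype))
print(dtypes_after)

# Which rows of "b" are missing in the frame that had the blank.
missing_mask = df_blank["b"].isna().tolist()
print(missing_mask)
```

    If you already know which columns should be integers, passing a dtype mapping such as dtype={"b": "Int64"} to pd.read_csv is another option that avoids the float64 detour entirely.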
    
