NumPy or Pandas: Keeping array type as integer while having a NaN value

粉色の甜心 2020-11-22 06:05

Is there a preferred way to keep the data type of a numpy array fixed as int (or int64 or whatever), while still having an element ins

8 Answers
  • 2020-11-22 06:33

    This capability has been added to pandas (beginning with version 0.24): https://pandas.pydata.org/pandas-docs/version/0.24/whatsnew/v0.24.0.html#optional-integer-na-support

    At this point, it requires the extension dtype Int64 (capitalized) rather than the default dtype int64 (lowercase).
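
    As a minimal sketch of the nullable integer dtype (assuming pandas 0.24 or later; the sample values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Nullable integer extension dtype: note the capital "I" in "Int64".
s = pd.Series([1, 2, None], dtype="Int64")
print(s.dtype)         # the extension dtype, not numpy's int64
print(s.isna().sum())  # number of missing entries

# A float column holding whole numbers plus NaN can be converted as well.
f = pd.Series([1.0, np.nan, 3.0])
converted = f.astype("Int64")
print(converted)
```

    The missing entry stays missing, and the remaining values are stored as integers instead of being upcast to float.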

  • 2020-11-22 06:38

    If performance is not the main issue, you can store strings instead.

    df['col'] = df['col'].dropna().apply(lambda x: str(int(x)))
    

    Then you can mix them with NaN as much as you want. If you really want to have integers, depending on your application, you can use -1, or 0, or 1234567890, or some other dedicated value to represent NaN.
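
    A rough sketch of the dedicated-value idea, assuming -1 never occurs as real data (the name SENTINEL and the sample frame are hypothetical):

```python
import numpy as np
import pandas as pd

SENTINEL = -1  # hypothetical marker; pick a value that cannot appear in real data

df = pd.DataFrame({"col": [1.0, np.nan, 3.0]})

# Replace NaN with the sentinel so the column can be a plain int64 array.
as_int = df["col"].fillna(SENTINEL).astype("int64")

# To get NaN back, mask the sentinel out again (this upcasts to float).
restored = as_int.where(as_int != SENTINEL)
```

    The drawback is that every consumer of the column has to know about the marker; an aggregate such as a sum or a mean will silently include it.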

    You can also temporarily duplicate the columns: one as you have it, with floats; the other experimental, with ints or strings. Then insert asserts in every reasonable place to check that the two are in sync. After enough testing you can let go of the floats.
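
    One way such a check might look, as a sketch only (the column name col_str and the helper assert_in_sync are made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col": [1.0, np.nan, 3.0]})

# Experimental copy: integers stored as strings, NaN left where values are missing.
df["col_str"] = df["col"].dropna().apply(lambda x: str(int(x)))

def assert_in_sync(frame):
    """Check that the float column and its string copy still agree."""
    # Missing values must sit in the same rows in both columns.
    assert frame["col"].isna().equals(frame["col_str"].isna())
    # Where a value is present, both columns must hold the same integer.
    present = frame["col"].notna()
    as_ints = frame.loc[present, "col_str"].astype(int)
    assert (as_ints == frame.loc[present, "col"]).all()

assert_in_sync(df)
```

    Calling assert_in_sync after each step that modifies either column makes any divergence show up close to where it was introduced.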
