I'm running into a strange issue (or is it intended?) where combine_first or update causes values stored as bool to be upcast to a different dtype.
Before updating, the DataFrame b is filled out by reindex_like, so that b becomes:
In [5]: b.reindex_like(a)
Out[5]:
     a    b  isBool  isBool2
0   45   45     NaN      NaN
1  NaN  NaN     NaN      NaN
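The reindexing step can be reproduced with a small sketch. The frames a and b below are hypothetical stand-ins, since the original frames are not shown here; only the column names and the value 45 are taken from the output above.

```python
import pandas as pd

# Hypothetical frames: `a` carries the boolean columns, `b` does not.
a = pd.DataFrame({
    "a": [1, 2],
    "b": [3, 4],
    "isBool": [True, False],
    "isBool2": [False, True],
})
b = pd.DataFrame({"a": [45], "b": [45]})

# reindex_like gives `b` the same index and columns as `a`;
# every cell that `b` did not have is filled with NaN.
filled = b.reindex_like(a)

print(filled)
print(filled.dtypes)
```

The columns that b never had (isBool and isBool2) come back entirely NaN, which is what sets up the upcast described next.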
update then uses numpy.where to update the data frame.
The trouble is that when the two inputs to numpy.where have different types, the result takes the more general one. For example:
In [20]: np.where(True, [True], [0])
Out[20]: array([1])
In [21]: np.where(True, [True], [1.0])
Out[21]: array([ 1.])
Since NaN in numpy is a floating-point value, numpy.where also returns a floating-point array in that case:
In [22]: np.where(True, [True], [np.nan])
Out[22]: array([ 1.])
Therefore, after updating, your 'isBool' and 'isBool2' columns become floating type.
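To check the promotion on whole arrays rather than one-element lists, here is a minimal sketch. The array names are made up for illustration, and it only demonstrates numpy's type promotion, not the internals of update.

```python
import numpy as np

# Illustrative arrays; the names are not from pandas itself.
bool_values = np.array([True, False])
int_fill = np.array([0, 0])
nan_fill = np.array([np.nan, np.nan])
mask = np.array([True, True])

# Even though the mask selects only the boolean values, the result
# dtype is the common type of both candidate arrays.
with_int = np.where(mask, bool_values, int_fill)
with_nan = np.where(mask, bool_values, nan_fill)

print(with_int.dtype)  # an integer dtype (width depends on platform)
print(with_nan.dtype)  # float64
print(np.result_type(np.bool_, np.float64))  # float64
```

The result dtype is decided by the two candidate arrays together, not by which values the mask ends up picking, so a NaN anywhere in the fill array is enough to turn a bool column into floats.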
I've added this issue to the pandas issue tracker.
This is a bug: update shouldn't touch unspecified columns. Fixed here: https://github.com/pydata/pandas/pull/3021