KeyError when using boolean filter on pandas data frame

Posted by ぐ巨炮叔叔 on 2019-11-29 12:46:00

Here

gametaxidf.loc[arrivemask, 'relevant'] = 1

you're trying to set DataFrame values with the .loc indexer. The pandas documentation on selecting rows says:

.loc is primarily label based, but may also be used with a boolean array. .loc will raise KeyError when the items are not found. Allowed inputs are:

  • A single label, e.g. 5 or 'a', (note that 5 is interpreted as a label of the index. This use is not an integer position along the index)
  • A list or array of labels ['a', 'b', 'c']
  • A slice object with labels 'a':'f', (note that contrary to usual python slices, both the start and the stop are included!)
  • A boolean array

You're trying to use the last type of input, but this

arrivemask = (arrivemin < row['dropoff_datetime']) and (row['dropoff_datetime'] < arrivemax)

is a scalar boolean, not an array, because it is computed from the values of a single row.
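To make the difference concrete, here is a minimal sketch. The sample timestamps and the name df are made up for illustration; only the column name dropoff_datetime and the bounds arrivemin and arrivemax come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name follows the question.
df = pd.DataFrame({
    "dropoff_datetime": pd.to_datetime([
        "2019-01-01 10:00",
        "2019-01-01 12:00",
        "2019-01-01 14:00",
    ])
})
arrivemin = pd.Timestamp("2019-01-01 11:00")
arrivemax = pd.Timestamp("2019-01-01 13:00")

# Comparing the value from ONE row gives a single True/False.
row = df.iloc[0]
scalar_mask = (arrivemin < row["dropoff_datetime"]) and (row["dropoff_datetime"] < arrivemax)

# Comparing the WHOLE column gives a boolean Series, one entry per row.
# Note the element-wise & operator instead of the keyword `and`.
array_mask = (arrivemin < df["dropoff_datetime"]) & (df["dropoff_datetime"] < arrivemax)

print(type(scalar_mask))
print(array_mask.tolist())
```

Only array_mask has one entry per row, which is the form .loc expects for boolean indexing.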

You don't need to iterate through the DataFrame. Pandas applies the comparison to the whole column at once. Just use:

gametaxidf.loc[
    (arrivemin < gametaxidf['dropoff_datetime'])
    & (gametaxidf['dropoff_datetime'] < arrivemax),
    'relevant'
] = 1
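
For reference, the statement above can be tried end to end with a small sketch like the following. The sample rows, the bounds, and the initial value 0 for relevant are assumptions made for illustration; the question does not show how the column is initialized.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data.
gametaxidf = pd.DataFrame({
    "dropoff_datetime": pd.to_datetime([
        "2019-01-01 10:00",
        "2019-01-01 12:00",
        "2019-01-01 14:00",
    ])
})
gametaxidf["relevant"] = 0  # assumed default; not shown in the question

arrivemin = pd.Timestamp("2019-01-01 11:00")
arrivemax = pd.Timestamp("2019-01-01 13:00")

# Boolean Series: True where the dropoff time lies strictly between the bounds.
mask = (arrivemin < gametaxidf["dropoff_datetime"]) & (gametaxidf["dropoff_datetime"] < arrivemax)

# Set 'relevant' to 1 on the matching rows only, with no explicit loop.
gametaxidf.loc[mask, "relevant"] = 1

print(gametaxidf["relevant"].tolist())  # [0, 1, 0]
```

Storing the mask in a variable first is optional, but it keeps the .loc line short and lets the same mask be reused.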