pandas df.loc[z,x]=y how to improve speed?

陌清茗 2020-12-31 08:23

I have identified one pandas command

timeseries.loc[z, x] = y

to be responsible for most of the time spent in an iteration. And now I am l

3 answers
  •  囚心锁ツ
    2020-12-31 09:16

    If you are adding rows inside a loop, consider these performance issues: for roughly the first 1000 to 2000 records, "my_df.loc" performance is better, but it gradually becomes slower as the number of records added in the loop grows.

    If you plan to do this inside a big loop (say 10M records or so), you are better off using a mixture of "iloc" and "append": fill a temporary dataframe with iloc until its size reaches around 1000, then append it to the original dataframe and empty the temporary dataframe. This would boost your performance around 10 times.
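
A minimal sketch of the batching idea from this answer, under some assumptions that are not in the original: the helper name `build_in_chunks`, the column names, and the sample data are all hypothetical. It also swaps two details. Rows are buffered in a plain Python list rather than a temporary dataframe filled with iloc, and the batches are joined with `pd.concat` because `DataFrame.append` was removed in pandas 2.0. The 10x figure above is the answerer's estimate and is not measured here.

```python
import pandas as pd


def build_in_chunks(rows, columns, chunk_size=1000):
    """Hypothetical helper: collect rows in a small buffer and join them in batches."""
    chunks = []   # finished batches, each one a DataFrame
    buffer = []   # rows waiting to be turned into the next batch
    for row in rows:
        buffer.append(row)
        if len(buffer) >= chunk_size:
            # Convert the full buffer into one DataFrame, then start a fresh buffer.
            chunks.append(pd.DataFrame(buffer, columns=columns))
            buffer = []
    if buffer:
        # Flush whatever is left after the loop ends.
        chunks.append(pd.DataFrame(buffer, columns=columns))
    if not chunks:
        return pd.DataFrame(columns=columns)
    # A single concat at the end avoids growing the big frame row by row.
    return pd.concat(chunks, ignore_index=True)


# Made-up sample data: 2500 rows with two columns.
rows = ((i, i * 0.5) for i in range(2500))
df = build_in_chunks(rows, columns=["z", "y"], chunk_size=1000)
```

The point of the design is that the large dataframe is never enlarged one row at a time; each row costs only a list append, and the expensive dataframe construction happens once per batch.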
