What is the time complexity of .at and .loc in pandas?

Submitted by 半世苍凉 on 2021-01-27 22:20:07

Question


I'm looking for the time complexity of these methods as a function of the number of rows in a dataframe, n.

Another way of asking this question: are dataframe indexes in pandas B-trees (with O(log n) lookups) or hash tables (with constant-time lookups)?

I'm asking because I'd like a way to do constant-time lookups of rows in a dataframe based on a custom index.


Answer 1:


Alright so it would appear that:

1) You can build your own index on a dataframe with .set_index in O(n) time, where n is the number of rows in the dataframe.

2) The index is lazily initialized: it is built (in O(n) time) the first time you try to access a row using that index. So the first row access through that index takes O(n) time.

3) All subsequent row accesses take constant time.

So it looks like the indexes are hash tables and not B-trees.
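
A rough way to check the three points above for yourself is to time the first lookup against a later one. This is only a sketch: the frame size, the column names ("key", "value"), the shuffled keys, and the helper time_lookup are all made up for illustration, and the timings you see will depend on your machine and pandas version.

```python
import time

import numpy as np
import pandas as pd


def time_lookup(df, key, column):
    """Return (value, elapsed seconds) for a single df.at[key, column] lookup."""
    start = time.perf_counter()
    value = df.at[key, column]
    return value, time.perf_counter() - start


n = 200_000  # arbitrary size, chosen only for illustration

# Unique keys in shuffled order, so the custom index is unsorted.
keys = np.random.default_rng(0).permutation(n)
frame = pd.DataFrame({"key": keys, "value": keys * 10})

# Point 1: build a custom index from an existing column.
indexed = frame.set_index("key")

first_key, later_key = 123, 456  # hypothetical keys, both present in the index

# Point 2: the first access through the index pays the one-off build cost.
first_value, first_time = time_lookup(indexed, first_key, "value")

# Point 3: later accesses reuse the structure that was already built.
later_value, later_time = time_lookup(indexed, later_key, "value")

# .loc with a single label returns the whole row as a Series.
row = indexed.loc[later_key]

print(f"first lookup: {first_time:.6f} s")
print(f"later lookup: {later_time:.6f} s")
```

If the answer's description holds, the first lookup should be noticeably slower than the later one, and the later one should stay roughly flat as you increase n. A single timing is noisy, so repeat the measurement a few times before drawing conclusions.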



Source: https://stackoverflow.com/questions/58876676/what-is-the-time-complexity-of-at-and-loc-in-pandas
