Getting the count of records in a data frame quickly

梦谈多话 2021-01-03 18:23

I have a dataframe with as many as 10 million records. How can I get a count quickly? df.count is taking a very long time.

1 Answer
  • 2021-01-03 19:03

    It's going to take that much time anyway, at least the first time.

    One way is to cache the dataframe, so you will be able to do more with it than just count.

    E.g.:

    df.cache()
    df.count()
    

    Subsequent operations don't take much time.
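To see the difference for yourself, here is a minimal sketch that times repeated counts after caching. It assumes `df` is a Spark DataFrame (the question doesn't say so, but `df.cache()` suggests it); the helper name `timed_counts` and its `repeats` parameter are made up for illustration. Note that `cache()` is lazy, so the first `count()` still pays the full cost of reading the data and is also what fills the cache.

```python
import time


def timed_counts(df, repeats=2):
    """Cache `df`, then count it `repeats` times.

    Returns a list of (row_count, elapsed_seconds) pairs, one per count.
    `df` is assumed to be a Spark DataFrame; only cache() and count() are used.
    """
    df.cache()  # lazy: only marks the dataframe for caching, nothing runs yet
    results = []
    for _ in range(repeats):
        start = time.perf_counter()
        n = df.count()  # first call materializes the cache; later calls can reuse it
        results.append((n, time.perf_counter() - start))
    return results
```

With a real session you would call something like `timed_counts(df)` and compare the first and second timings. How much faster the second count is depends on whether the cached data fits in executor memory, so treat the numbers as a rough guide rather than a guarantee.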
