pyspark - groupby multiple columns/count performance

说谎 2020-12-23 00:22

I have the following statement that is taking hours to execute on a large dataframe (billions of records). I read that groupby is expensive and should be avoided. Our spar
