pyspark - groupby multiple columns/count performance

醉酒成梦 2020-12-23 00:47

I have the following statement that is taking hours to execute on a large DataFrame (billions of records). I read that groupby is expensive and should be avoided. Our spar
