PySpark RDD isCheckpointed() is false

旧巷少年郎 2021-01-23 23:44

I was encountering StackOverflowErrors while iteratively adding over 500 columns to my PySpark DataFrame, so I added checkpoints. The checkpoints did not help. So, I cr

1 Answer
  • 2021-01-23 23:49

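    The checkpoint method returns a new checkpointed Dataset; it does not modify the current Dataset. Calling it without keeping the return value therefore leaves the original variable pointing at the un-checkpointed Dataset, lineage intact.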

    Change

    df4.checkpoint(eager=True)
    

    To

    df4 = df4.checkpoint(eager=True)
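
For the iterative case described in the question, the same reassignment can be folded into the loop. The following is only a minimal sketch under stated assumptions: the helper `apply_with_checkpoints`, the `demo` function, the `every` interval, and the column names are all hypothetical and not from the original post. It assumes a local Spark installation and uses a temporary directory as the checkpoint directory.

```python
import tempfile


def apply_with_checkpoints(df, steps, every=50):
    """Apply each step (a function DataFrame -> DataFrame) in order.

    Every `every` steps the DataFrame is checkpointed and the variable is
    rebound to the returned object, since checkpoint() does not modify
    the DataFrame it is called on.
    """
    for i, step in enumerate(steps, start=1):
        df = step(df)
        if i % every == 0:
            # Keep the returned DataFrame; discarding it leaves the long lineage.
            df = df.checkpoint(eager=True)
    return df


def demo():
    # Imported here so the helper above can be imported without Spark.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.master("local[1]").appName("ckpt-demo").getOrCreate()
    # A checkpoint directory must be set before checkpoint() is called.
    spark.sparkContext.setCheckpointDir(tempfile.mkdtemp())

    # Hypothetical workload: add 20 derived columns, one per step.
    steps = [
        (lambda d, i=i: d.withColumn(f"c{i}", F.col("id") + i))
        for i in range(20)
    ]
    result = apply_with_checkpoints(spark.range(5), steps, every=5)
    print(len(result.columns))
    spark.stop()
```

Calling `demo()` on a machine with PySpark installed runs the sketch end to end. A suitable value for `every` depends on how deep the plan can grow before the driver overflows its stack, so treat 50 as a placeholder rather than a recommendation.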
    