Pyspark 'NoneType' object has no attribute '_jvm' error

無奈伤痛 2020-11-27 23:52

I was trying to print the total number of elements in each partition of a DataFrame using Spark 2.2.

from pyspark.sql.functions import *
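
The rest of the snippet is cut off above, but based on the answer below it presumably defined a partition-counting function roughly like this sketch (the DataFrame creation line is a placeholder for illustration):

def count_elements(splitIndex, iterator):
    n = sum(1 for _ in iterator)   # `sum` here is shadowed by the star import above
    yield (splitIndex, n)

df = spark.range(100)   # placeholder; any DataFrame reproduces the error
df.rdd.mapPartitionsWithIndex(count_elements).collect()
# fails with: AttributeError: 'NoneType' object has no attribute '_jvm'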


        
1 Answer
  • 2020-11-28 00:16

    This is a great example of why you shouldn't use import *.

    The line

    from pyspark.sql.functions import *
    

    will bring all of the functions in the pyspark.sql.functions module into your namespace, including some that will shadow your builtins.

    The specific issue is in the count_elements function on the line:

    n = sum(1 for _ in iterator)
    #   ^^^ - this is now pyspark.sql.functions.sum
    

    You intended to call __builtin__.sum, but the import * shadowed the builtin.
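
    A quick way to see the shadowing (a minimal sketch; on Python 3 the module is named builtins rather than __builtin__):

    import builtins
    from pyspark.sql.functions import *

    print(sum is builtins.sum)   # False: `sum` now refers to pyspark.sql.functions.sum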

    Instead, do one of the following:

    import pyspark.sql.functions as f
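
    With the module alias nothing in your namespace is shadowed, so a function like count_elements can use the builtin again; a minimal sketch:

    import pyspark.sql.functions as f

    def count_elements(splitIndex, iterator):
        n = sum(1 for _ in iterator)   # builtin sum again
        yield (splitIndex, n)

    # Spark's aggregate stays reachable explicitly when you need it, e.g. f.sum("some_column")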
    

    Or

    from pyspark.sql.functions import sum as sum_
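
    With this alias the builtin sum keeps its name and the Spark function is reached as sum_; for example (the column name is illustrative):

    from pyspark.sql.functions import sum as sum_

    print(sum([1, 2, 3]))   # 6 — the builtin is untouched
    # sum_("value") builds the Spark aggregate Column instead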
    