How to list RDDs defined in Spark shell?


In both \"spark-shell\" or \"pyspark\" shells, I created many RDDs but I could not find any way through which I can list all the available RDDs in my current session of Spar

1 Answer

    In Python you can simply filter globals() by type:

    def list_rdds():
        from pyspark import RDD
        # keep only the names in the interactive namespace
        # whose values are RDD instances
        return [k for (k, v) in globals().items() if isinstance(v, RDD)]
    
    list_rdds()
    # []
    
    rdd = sc.parallelize([])
    list_rdds()
    # ['rdd']
    

    In the Scala REPL you should be able to use $intp.definedTerms / $intp.typeOfTerm in a similar way, as sketched below.
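
    A minimal sketch of that Scala-side approach, assuming a Scala 2 REPL where $intp is bound to the interpreter instance and typeOfTerm accepts the term name as a String (the exact interpreter API varies between Scala versions, so treat this as illustrative):

        // List every term defined in this REPL session whose inferred
        // type mentions org.apache.spark.rdd.RDD.
        $intp.definedTerms
          .map(_.toString)
          .filter(name => $intp.typeOfTerm(name).toString.contains("org.apache.spark.rdd.RDD"))

    After val rdd = sc.parallelize(Seq(1, 2, 3)), the resulting list should contain "rdd".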
