How to get the number of workers(executors) in PySpark?

面向向阳花 2021-02-07 08:16

I need to use this parameter, so how can I get the number of workers? In Scala I can call sc.getExecutorMemoryStatus to get the available number of workers, but I don't see a similar API in PySpark.

1 Answer
  •  长情又很酷
    2021-02-07 08:19

    In Scala, getExecutorStorageStatus and getExecutorMemoryStatus both return the number of executors including the driver, as in the example snippet below:

    import org.apache.spark.SparkContext

    /** Method that just returns the current active/registered executors,
      * excluding the driver.
      * @param sc The Spark context used to retrieve the registered executors.
      * @return A list of executors, each in the form host:port.
      */
    def currentActiveExecutors(sc: SparkContext): Seq[String] = {
      val allExecutors = sc.getExecutorMemoryStatus.map(_._1)
      val driverHost: String = sc.getConf.get("spark.driver.host")
      allExecutors.filter(! _.split(":")(0).equals(driverHost)).toList
    }
    

    But in the Python API this has not been implemented.

    @DanielDarabos's answer also confirms this.

    The equivalent of this in Python is:

    sc.getConf().get("spark.executor.instances")
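
    If spark.executor.instances was set when the application was submitted (for example via --num-executors), reading it back looks roughly like the sketch below; the app name and the "not set" default are illustrative assumptions, not part of the original answer.

    from pyspark.sql import SparkSession

    # Minimal sketch: read the configured executor count from the SparkConf.
    spark = SparkSession.builder.appName("executor-count").getOrCreate()
    sc = spark.sparkContext

    # getConf() returns a copy of the SparkConf; get() accepts a default
    # value for keys that were never set.
    num_executors = sc.getConf().get("spark.executor.instances", "not set")
    print(num_executors)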
    

    Edit (Python):

    which might be sc._conf.get('spark.executor.instances')
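
    When that property is not set at all (for example with dynamic allocation), a commonly used workaround is to reach the JVM SparkContext through the Py4J gateway and count the entries returned by getExecutorMemoryStatus, mirroring the Scala helper above. This relies on the private _jsc attribute, so treat it as a sketch rather than a supported API; it may break across Spark versions.

    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()

    # getExecutorMemoryStatus returns a Scala Map keyed by "host:port" and
    # includes the driver, so subtract one to approximate the worker count.
    status = sc._jsc.sc().getExecutorMemoryStatus()
    num_workers = max(status.size() - 1, 0)
    print(num_workers)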
