How to configure Apache Spark random worker ports for tight firewalls?

Submitted by 二次信任 on 2019-11-28 08:45:28

See https://spark.apache.org/docs/latest/configuration.html#networking

In the "Networking" section you can see that some of the ports default to random values. You can pin them to fixed ports of your choice, for example:

val conf = new SparkConf()
    .setMaster(master)
    .setAppName("namexxx")
    // Spark 1.x property names; several of these were removed in Spark 2.x
    .set("spark.driver.port", "51810")
    .set("spark.fileserver.port", "51811")
    .set("spark.broadcast.port", "51812")
    .set("spark.replClassServer.port", "51813")
    .set("spark.blockManager.port", "51814")
    .set("spark.executor.port", "51815")
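Alternatively, the same properties can be set outside the application code in conf/spark-defaults.conf — a sketch reusing the example port numbers above (Spark 1.x property names):

```
# conf/spark-defaults.conf -- Spark 1.x property names,
# reusing the example ports above
spark.driver.port           51810
spark.fileserver.port       51811
spark.broadcast.port        51812
spark.replClassServer.port  51813
spark.blockManager.port     51814
spark.executor.port         51815
```

Values passed programmatically via SparkConf take precedence over spark-defaults.conf, so pick one place to manage them.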

Update for Spark 2.x

Several subsystems have been rewritten from scratch, and many legacy *.port properties are now obsolete (cf. SPARK-10997 / SPARK-20605 / SPARK-12588 / SPARK-17678, etc.)

For Spark 2.1, for instance, the port ranges on which the driver listens for executor traffic are:

  • between spark.driver.port and spark.driver.port+spark.port.maxRetries
  • between spark.driver.blockManager.port and spark.driver.blockManager.port+spark.port.maxRetries

And the port range on which the executors listen for driver traffic and/or other executors' traffic is:

  • between spark.blockManager.port and spark.blockManager.port+spark.port.maxRetries
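Putting it together, a firewall-friendly Spark 2.x setup only needs those three base ports pinned, plus the retry range. A spark-defaults.conf sketch, with arbitrary example port numbers:

```
# Spark 2.x -- example port numbers, choose your own.
# Each base port implies a firewall range of
# [base, base + spark.port.maxRetries] (maxRetries defaults to 16).
spark.driver.port               51810
spark.driver.blockManager.port  51811
spark.blockManager.port         51812
spark.port.maxRetries           16
```

The first two ranges must be reachable on the driver host; the third on every executor host.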

The spark.port.maxRetries property allows several Spark jobs to run in parallel: if the base port is already in use, the new job tries the next port, and so on, until the whole range is exhausted.
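The retry behaviour can be illustrated with a small Scala sketch — a simplified illustration of the range arithmetic, not Spark's actual binding code (port numbers are the examples used above):

```scala
// Simplified illustration of Spark's port-retry arithmetic:
// a service tries basePort, basePort+1, ..., basePort+maxRetries.
def candidatePorts(basePort: Int, maxRetries: Int): Seq[Int] =
  (0 to maxRetries).map(basePort + _)

// With spark.driver.port=51810 and the default spark.port.maxRetries=16,
// the driver may end up bound anywhere in 51810..51826, so the firewall
// must allow that whole range, not just the base port.
val driverRange = candidatePorts(51810, 16)
```

In other words, opening only the single base port is not enough when several jobs may run concurrently; the whole [base, base + maxRetries] range must be reachable.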

Sources:
   https://spark.apache.org/docs/2.1.1/configuration.html#networking
   https://spark.apache.org/docs/2.1.1/security.html under "Configuring ports"
