How to resolve external packages with spark-shell when behind a corporate proxy?

Submitted by 和自甴很熟 on 2019-11-30 09:02:43

Found the correct settings:

bin/spark-shell --conf "spark.driver.extraJavaOptions=-Dhttp.proxyHost=<proxyHost> -Dhttp.proxyPort=<proxyPort> -Dhttps.proxyHost=<proxyHost> -Dhttps.proxyPort=<proxyPort>" --packages <somePackage>

Both http and https proxies have to be set as extra driver options. JAVA_OPTS does not seem to do anything.
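Since all four properties have to be present, a small check before launching can catch a forgotten one. The following is only a sketch: has_all_proxy_props is a hypothetical helper name, and proxy.example.com and 8080 are placeholder values, none of which come from the original answer.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the option string sets all four
# proxy properties (http and https, host and port).
has_all_proxy_props() {
    opts="$1"
    for prop in http.proxyHost http.proxyPort https.proxyHost https.proxyPort; do
        case "$opts" in
            *"-D$prop="*) ;;                               # property present
            *) echo "missing -D$prop" >&2; return 1 ;;     # report the first gap
        esac
    done
}

# Placeholder values; replace with your own proxy host and port.
OPTS="-Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8080 -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=8080"
if has_all_proxy_props "$OPTS"; then
    echo "proxy options look complete"
fi
```

If the check passes, the same string can be passed as --conf "spark.driver.extraJavaOptions=$OPTS".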

This worked for me in Spark 1.6.1:

bin\spark-shell --driver-java-options "-Dhttp.proxyHost=<proxyHost> -Dhttp.proxyPort=<proxyPort> -Dhttps.proxyHost=<proxyHost> -Dhttps.proxyPort=<proxyPort>" --packages <package>

If the proxy is correctly configured on your OS, you can use the Java property java.net.useSystemProxies:

--conf "spark.driver.extraJavaOptions=-Djava.net.useSystemProxies=true"

so that the proxy host, port, and no-proxy hosts are taken from the system settings.

Adding

spark.driver.extraJavaOptions=-Dhttp.proxyHost=<proxyHost> -Dhttp.proxyPort=<proxyPort> -Dhttps.proxyHost=<proxyHost> -Dhttps.proxyPort=<proxyPort>

to $SPARK_HOME/conf/spark-defaults.conf works for me.

Tao Huang
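For anyone who wants to script the spark-defaults.conf change rather than edit the file by hand, here is a rough sketch. add_proxy_conf is a hypothetical helper, not part of Spark, and the host and port in the example are placeholders. It refuses to append when the file already defines spark.driver.extraJavaOptions, on the assumption that a second definition would not be merged with the first.

```shell
#!/bin/sh
# Hypothetical helper: append the proxy setting to a spark-defaults.conf file
# unless the file already defines spark.driver.extraJavaOptions.
add_proxy_conf() {
    conf_file="$1"
    host="$2"
    port="$3"
    if [ -f "$conf_file" ] && grep -q '^spark\.driver\.extraJavaOptions' "$conf_file"; then
        echo "spark.driver.extraJavaOptions already set in $conf_file; edit it by hand" >&2
        return 1
    fi
    # One line, same key=value form as in the answer above.
    printf '%s\n' "spark.driver.extraJavaOptions=-Dhttp.proxyHost=$host -Dhttp.proxyPort=$port -Dhttps.proxyHost=$host -Dhttps.proxyPort=$port" >> "$conf_file"
}

# Example call (not executed here); host and port are placeholders:
# add_proxy_conf "$SPARK_HOME/conf/spark-defaults.conf" proxy.example.com 8080
```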

If the proxy requires authentication, you can use the following in the default conf file:

spark.driver.extraJavaOptions  -Dhttp.proxyHost= -Dhttp.proxyPort= -Dhttps.proxyHost= -Dhttps.proxyPort= -Dhttp.proxyUsername= -Dhttp.proxyPassword= -Dhttps.proxyUsername= -Dhttps.proxyPassword= 
chaooder

Was struggling with pyspark till I found this:

Adding on to @Tao Huang's answer:

bin/pyspark --driver-java-options="-Dhttp.proxyUser=user -Dhttp.proxyPassword=password -Dhttps.proxyUser=user -Dhttps.proxyPassword=password -Dhttp.proxyHost=proxy -Dhttp.proxyPort=port -Dhttps.proxyHost=proxy -Dhttps.proxyPort=port" --packages [groupId:artifactId]

That is, the property should be -Dhttp(s).proxyUser rather than ...proxyUsername.
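To avoid typing the eight flags by hand (and mistyping a property name), the string can be generated. This is just a sketch: build_auth_proxy_opts is a hypothetical helper, the values in the example are placeholders, and it assumes none of the values contain spaces.

```shell
#!/bin/sh
# Hypothetical helper: build host, port, user and password flags for both
# http and https, using the proxyUser spelling from the answer above.
build_auth_proxy_opts() {
    host="$1"
    port="$2"
    user="$3"
    password="$4"
    opts=""
    for scheme in http https; do
        opts="$opts -D$scheme.proxyHost=$host -D$scheme.proxyPort=$port"
        opts="$opts -D$scheme.proxyUser=$user -D$scheme.proxyPassword=$password"
    done
    # Strip the leading space added by the first loop iteration.
    printf '%s' "${opts# }"
}

# Placeholder values; replace with your own.
AUTH_OPTS="$(build_auth_proxy_opts proxy 8080 user password)"
echo "$AUTH_OPTS"

# Then (not executed here):
# bin/pyspark --driver-java-options="$AUTH_OPTS" --packages [groupId:artifactId]
```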

On Windows 7 with spark-2.0.0-bin-hadoop2.7, I set spark.driver.extraJavaOptions in %SPARK_HOME%\spark-2.0.0-bin-hadoop2.7\conf\spark-defaults.conf like this:

spark.driver.extraJavaOptions -Dhttp.proxyHost=hostname -Dhttp.proxyPort=port -Dhttps.proxyHost=host -Dhttps.proxyPort=port