How to resolve external packages with spark-shell when behind a corporate proxy?

有刺的猬 2020-12-15 07:20

I would like to run spark-shell with an external package behind a corporate proxy. Unfortunately, external packages passed via the --packages option are not resolved.

7 answers
  • 2020-12-15 07:51

    Found the correct settings:

    bin/spark-shell --conf "spark.driver.extraJavaOptions=-Dhttp.proxyHost=<proxyHost> -Dhttp.proxyPort=<proxyPort> -Dhttps.proxyHost=<proxyHost> -Dhttps.proxyPort=<proxyPort>" --packages <somePackage>
    

    Both http and https proxies have to be set as extra driver options. JAVA_OPTS does not seem to do anything.
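
A minimal sketch of the command above, assuming the same host and port serve both HTTP and HTTPS. The helper name `proxy_java_opts` and the default host and port are hypothetical placeholders, not part of Spark. The script only prints the command (a dry run) so the quoting can be checked before running it.

```shell
# Hypothetical defaults; replace with your proxy's host and port.
PROXY_HOST="${PROXY_HOST:-proxy.example.com}"
PROXY_PORT="${PROXY_PORT:-8080}"

# Print the JVM flags that set both the HTTP and the HTTPS proxy.
# Arguments: <proxy host> <proxy port>
proxy_java_opts() {
  printf -- '-Dhttp.proxyHost=%s -Dhttp.proxyPort=%s -Dhttps.proxyHost=%s -Dhttps.proxyPort=%s' \
    "$1" "$2" "$1" "$2"
}

# Dry run: print the full command instead of executing it.
printf 'bin/spark-shell --conf "spark.driver.extraJavaOptions=%s" --packages <somePackage>\n' \
  "$(proxy_java_opts "$PROXY_HOST" "$PROXY_PORT")"
```

Keeping all four flags inside one quoted --conf value matters, because the value contains spaces.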

  • 2020-12-15 07:51

    Adding

    spark.driver.extraJavaOptions=-Dhttp.proxyHost=<proxyHost> -Dhttp.proxyPort=<proxyPort> -Dhttps.proxyHost=<proxyHost> -Dhttps.proxyPort=<proxyPort>
    

    to $SPARK_HOME/conf/spark-defaults.conf works for me.
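
A sketch of how that line could be appended from a script, assuming the file does not already define spark.driver.extraJavaOptions. The function name `add_proxy_conf` and the demo host and port are hypothetical. The demo writes to a scratch directory standing in for $SPARK_HOME rather than touching a real installation.

```shell
# Append the proxy setting to a Spark defaults file unless the key is already present.
# Arguments: <conf file> <proxy host> <proxy port>
add_proxy_conf() {
  conf_file="$1"; host="$2"; port="$3"
  mkdir -p "$(dirname "$conf_file")"
  touch "$conf_file"
  # Refuse to add a second definition of the same key.
  if grep -q '^spark\.driver\.extraJavaOptions' "$conf_file"; then
    echo "spark.driver.extraJavaOptions already set in $conf_file; edit it by hand" >&2
    return 1
  fi
  printf 'spark.driver.extraJavaOptions=-Dhttp.proxyHost=%s -Dhttp.proxyPort=%s -Dhttps.proxyHost=%s -Dhttps.proxyPort=%s\n' \
    "$host" "$port" "$host" "$port" >> "$conf_file"
}

# Demo against a scratch directory (hypothetical host and port).
demo_home="$(mktemp -d)"
add_proxy_conf "$demo_home/conf/spark-defaults.conf" proxy.example.com 8080
cat "$demo_home/conf/spark-defaults.conf"
```

For a real installation, the first argument would be "$SPARK_HOME/conf/spark-defaults.conf".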

  • 2020-12-15 08:01

    This worked for me in Spark 1.6.1:

    bin\spark-shell --driver-java-options "-Dhttp.proxyHost=<proxyHost> -Dhttp.proxyPort=<proxyPort> -Dhttps.proxyHost=<proxyHost> -Dhttps.proxyPort=<proxyPort>" --packages <package>
    
  • 2020-12-15 08:03

    If your proxy requires authentication, you can use the following in the default conf file:

    spark.driver.extraJavaOptions  -Dhttp.proxyHost= -Dhttp.proxyPort= -Dhttps.proxyHost= -Dhttps.proxyPort= -Dhttp.proxyUsername= -Dhttp.proxyPassword= -Dhttps.proxyUsername= -Dhttps.proxyPassword= 
    
  • 2020-12-15 08:05

    If the proxy is correctly configured in your OS, you can set the Java system property java.net.useSystemProxies:

    --conf "spark.driver.extraJavaOptions=-Djava.net.useSystemProxies=true"

    so that the proxy host, port, and no-proxy hosts are taken from the system settings.

  • 2020-12-15 08:05

    I was struggling with pyspark until I found this.

    Adding on to @Tao Huang's answer:

    bin/pyspark --driver-java-options="-Dhttp.proxyUser=user -Dhttp.proxyPassword=password -Dhttps.proxyUser=user -Dhttps.proxyPassword=password -Dhttp.proxyHost=proxy -Dhttp.proxyPort=port -Dhttps.proxyHost=proxy -Dhttps.proxyPort=port" --packages [groupId:artifactId]

    I.e., the properties should be -Dhttp(s).proxyUser instead of ...proxyUsername.
