Using Typesafe Config with Spark on YARN

[愿得一人] 2021-02-06 11:20

I have a Spark job that reads data from a configuration file. This file is a Typesafe Config file.

The code that reads the config looks like this (only a fragment of the snippet survived here; a minimal sketch of the usual ConfigFactory pattern in Scala):

    import com.typesafe.config.{Config, ConfigFactory}

    // Loads application.conf from the classpath, merged with reference.conf
    val config: Config = ConfigFactory.load()

2 Answers
  •  梦毁少年i  2021-02-06 11:41

    Even though this question is from a year ago, I had a similar issue with ConfigFactory. To be able to read an application.conf file, you have to do two things:

    • Submit the file to the driver. This is done by passing --files /path/to/file/application.conf to spark-submit. Note that you can read it from HDFS if you wish.
    • Submit the com.typesafe.config package. This is done with --packages com.typesafe:config:<version>.

    Since the application.conf file ends up in the same temporary directory as the main application jar, your code can simply assume it is in the working directory, as sketched below.
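
    A minimal sketch of that assumption (Scala; the bare file name works because --files ships the file into each container's working directory on YARN):

    import java.io.File
    import com.typesafe.config.{Config, ConfigFactory}

    // --files places application.conf in the YARN container's working
    // directory, so a relative path resolves correctly in cluster mode.
    val shipped = ConfigFactory.parseFile(new File("application.conf"))
    // Merge with reference.conf defaults and resolve substitutions.
    val config: Config = ConfigFactory.load(shipped)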

    Using the answer given above (https://stackoverflow.com/a/40586476/6615465), the full spark-submit command for this question looks like the following:

    # Placeholder paths from the original answer
    LOG4J_FULL_PATH=/log4j-path
    ROOT_DIR=/application.conf-path

    /opt/deploy/spark/bin/spark-submit \
    --packages com.typesafe:config:1.3.2 \
    --class com.mycompany.Main \
    --master yarn \
    --deploy-mode cluster \
    --files "$ROOT_DIR/application.conf,$LOG4J_FULL_PATH/log4j.xml" \
    --conf spark.driver.extraJavaOptions=-Dconfig.file=application.conf \
    --conf spark.executor.extraJavaOptions=-Dconfig.file=application.conf \
    --verbose \
    /opt/deploy/lal-ml.jar
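
    With config.file set as a JVM option (the two --conf lines above), the application code itself can stay minimal; a sketch, with a hypothetical key name:

    import com.typesafe.config.{Config, ConfigFactory}

    // ConfigFactory.load() honours the config.file / config.resource /
    // config.url system properties before falling back to application.conf
    // on the classpath, so no explicit path handling is needed here.
    val config: Config = ConfigFactory.load()
    val inputPath = config.getString("myapp.input.path")  // hypothetical key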
    
