How to get path to the uploaded file

时光说笑 2020-11-29 10:05

I am running a Spark cluster on Google Cloud and I upload a configuration file with each job. What is the path to a file that is uploaded with a submit command?

In

1 Answer
  • 2020-11-29 10:14

    The local path to a file distributed through the SparkFiles mechanism (the --files argument or the SparkContext.addFile method) can be obtained using SparkFiles.get:

    org.apache.spark.SparkFiles.get(fileName)
    

    You can also get the path to the root directory using SparkFiles.getRootDirectory:

    org.apache.spark.SparkFiles.getRootDirectory
    

    You can use these combined with standard IO utilities to read the files.
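
As a minimal sketch of that last step, the Java snippet below reads a distributed file once its local path is known. The class name DistributedFileReader and the method readLines are hypothetical and not part of Spark. In a real job the path argument would be the string returned by SparkFiles.get(fileName); that call appears only as a comment here so the snippet has no Spark dependency. UTF-8 text is assumed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper for illustration; not part of Spark.
final class DistributedFileReader {

    // Reads every line of the file at localPath, assuming UTF-8 text.
    // In a Spark job, localPath would typically be the value of:
    //   org.apache.spark.SparkFiles.get(fileName)
    static List<String> readLines(String localPath) throws IOException {
        return Files.readAllLines(Paths.get(localPath), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Expects the local path as the first program argument.
        for (String line : readLines(args[0])) {
            System.out.println(line);
        }
    }
}
```

Taking the path as a plain string keeps the helper independent of how the path was resolved, so it can be tried locally before it is wired into a job.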

    How can I read the file Configuration.properties before the SparkContext has been initialized?

    SparkFiles are distributed by the driver, so they cannot be accessed before the context has been initialized. To be distributed in the first place, the files have to be accessible from the driver node. This part of the question therefore depends solely on the type of storage you use to expose the file to the driver node.
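
Since the file has to be readable from the driver anyway, one option is to load it there with plain Java IO before any context exists. The sketch below assumes the file sits on the driver's local filesystem; DriverConfig and load are hypothetical names, and other storage types would need their own client instead of java.nio.file.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical driver-side loader; it does not use SparkFiles at all.
final class DriverConfig {

    // Loads a .properties file from a path on the driver's local filesystem
    // (an assumption of this sketch), reading it as UTF-8.
    static Properties load(String driverLocalPath) throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(
                Paths.get(driverLocalPath), StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Example argument: the driver-local path to Configuration.properties.
        Properties props = load(args[0]);
        // A SparkContext would be created only after this point,
        // once the settings are already in memory.
        System.out.println(props.stringPropertyNames());
    }
}
```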
