Oozie job won't run if using PySpark in SparkAction

Asked by 闹比i on 2021-02-11 09:23

I've encountered several examples of SparkAction jobs in Oozie, and most of them are in Java. I edited one a little and ran the example in Cloudera CDH Quickstart 5.4.0 (with Spark v

4 Answers
Answered by 无人及你 on 2021-02-11 10:08

    I was able to "fix" this issue, although it leads to another one. Nonetheless, I will still post it.

    In the stderr of the Oozie container logs, it shows:

    Error: Only local python files are supported
    

    And I found a solution here

    This is my initial workflow.xml:

        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>local[2]</master>
            <mode>client</mode>
            <name>${name}</name>
            <jar>my_pyspark_job.py</jar>
        </spark>
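    The parameters ${resourceManager}, ${nameNode}, and ${name} come from the job configuration. A minimal job.properties sketch is shown below, assuming a single-node CDH Quickstart setup; the host names, port numbers, and application path are hypothetical placeholders, not values from the original post:

        # job.properties -- hypothetical values; adjust to your cluster
        nameNode=hdfs://quickstart.cloudera:8020
        resourceManager=quickstart.cloudera:8032
        name=my_pyspark_job
        oozie.use.system.libpath=true
        oozie.wf.application.path=${nameNode}/user/cloudera/apps/pyspark-workflow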

    What I did initially was to copy the Python script I wanted to run as a spark-submit job to HDFS. It turns out that it expects the .py script to be on the local file system, so what I did was to refer to the absolute local file system path of my script.

    //my_pyspark_job.py
    
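    For illustration, the fix amounts to pointing the <jar> element at an absolute path on the local file system of the node that runs the action. The path below is a hypothetical placeholder, not the one used in the original post:

        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>local[2]</master>
            <mode>client</mode>
            <name>${name}</name>
            <!-- hypothetical absolute local path -->
            <jar>/home/cloudera/my_pyspark_job.py</jar>
        </spark>

    Note that the script then has to be present at that path on whichever node actually launches spark-submit.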
