I've encountered several examples of SparkAction jobs in Oozie, and most of them are in Java. I edited the example a little and ran it in Cloudera CDH Quickstart 5.4.0 (with Spark v
I was able to "fix" this issue, although the fix leads to another issue. Nonetheless, I will still post it.
The stderr of the Oozie container logs shows:
Error: Only local python files are supported
And I found a solution here
This is my initial workflow.xml:
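<job-tracker>${resourceManager}</job-tracker>
<name-node>${nameNode}</name-node>
<master>local[2]</master>
<mode>client</mode>
<name>${name}</name>
<jar>my_pyspark_job.py</jar>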
What I did initially was copy the Python script I wanted to run as a spark-submit job to HDFS. It turns out that spark-submit expects the .py script on the local file system, so I changed the workflow to refer to the script by its absolute local file system path:
//my_pyspark_job.py
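
As a rough illustration of the local-versus-HDFS distinction behind the error, here is a minimal sketch of a pre-flight check you could run before submitting the workflow. The helper name `is_local_python_file` and both example paths are hypothetical, and the rule it applies (a path counts as local when it has no URI scheme or a `file` scheme) is my assumption about what "local" means here, not the actual check inside spark-submit.

```python
from urllib.parse import urlparse


def is_local_python_file(path: str) -> bool:
    """Return True when `path` has no URI scheme or uses the file: scheme.

    Assumption: this approximates what counts as a "local" script;
    it is not the exact logic spark-submit uses.
    """
    scheme = urlparse(path).scheme
    return scheme in ("", "file")


# Hypothetical example paths, for illustration only.
candidates = [
    "hdfs://namenode:8020/user/me/my_pyspark_job.py",
    "/home/me/my_pyspark_job.py",
]

for candidate in candidates:
    kind = "local" if is_local_python_file(candidate) else "not local"
    print(f"{candidate}: {kind}")
```

A check like this only inspects the path string. It does not confirm that the file actually exists on the node where the Oozie launcher runs, which is a separate concern.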