I am running PySpark scripts to write a dataframe to a CSV file in a Jupyter notebook, as below:
df.coalesce(1).write.csv('Data1.csv', header='true')
Judging by the output, if your application is not finishing with a FAILED status, this sounds like a Livy timeout error: your application is likely taking longer than the defined timeout for a Livy session (which defaults to 1h). Even if the Spark app itself succeeds, the notebook still receives this error once the session outlives that timeout.
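To confirm this, you can query Livy's REST API on the cluster's master node (Livy listens on port 8998 by default); a session killed by the timeout shows up with state 'dead'. Here is a minimal sketch using Python's requests library, where MASTER_DNS is a placeholder for your master node's address:

    import requests

    # List Livy sessions and their current states.
    # Port 8998 is Livy's default; MASTER_DNS is a placeholder.
    resp = requests.get("http://MASTER_DNS:8998/sessions")
    resp.raise_for_status()
    for session in resp.json()["sessions"]:
        # A session terminated by the timeout typically reports 'dead'.
        print(session["id"], session["state"])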
If that's the case, here's how to address it:
1. In the /etc/livy/conf/livy.conf file (in the cluster's master node), set livy.server.session.timeout to a higher value, like 8h (or larger, depending on your app).
2. Run sudo restart livy-server in the cluster's master node.
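Concretely, that amounts to one line in the config file, followed by the restart (the 8h value is just the example from step 1; on newer Amazon Linux images the restart command may be sudo systemctl restart livy-server instead, which you should verify for your release):

    # 1. In /etc/livy/conf/livy.conf on the master node, set:
    livy.server.session.timeout = 8h

    # 2. Then restart Livy so the new timeout takes effect:
    sudo restart livy-server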