Question
I'm using a Zeppelin v0.7.3 notebook to run PySpark scripts. In one paragraph, I run a script that writes data from a DataFrame to a Parquet file in a Blob folder. The file is partitioned by country, and the DataFrame has 99,452,829 rows. Once the script has been running for 1 hour, it fails with this error:
Error with 400 StatusCode: "requirement failed: Session isn't active."
The default interpreter for the notebook is jdbc. I read about TimeoutLifecycleManager, added zeppelin.interpreter.lifecyclemanager.timeout.threshold to the interpreter settings, and set it to 7200000, but I still hit the error after 1 hour of runtime, at 33% processing completion.
I checked the Blob folder after the 1-hour timeout, and the Parquet files had been written to Blob successfully, partitioned by country as expected.
The script I am running to write the DataFrame to Parquet in Blob is below:
trdpn_cntry_fct_denom_df.write.format("parquet").partitionBy("CNTRY_ID").mode("overwrite").save("wasbs://tradepanelpoc@blobasbackupx2066561.blob.core.windows.net/cbls/hdi/trdpn_cntry_fct_denom_df.parquet")
Is this a Zeppelin timeout issue? How can the timeout be extended beyond 1 hour of runtime? Thanks for the help.
Answer 1:
The timeout lifecycle manager has only been available since version 0.8, so that setting has no effect in v0.7.3.
This looks like a problem with PySpark. Try the solution in "Pyspark socket timeout exception after application running for a while".
Answer 2:
From the answer to this Stack Overflow question, which worked for me:
Judging by the output, if your application is not finishing with a FAILED status, that sounds like a Livy timeout error. Your application is likely taking longer than the timeout defined for a Livy session (which defaults to 1h), so even though the Spark app succeeds, the notebook receives this error.
If that's the case, here's how to address it:
1. Edit the /etc/livy/conf/livy.conf file (on the cluster's master node).
2. Set livy.server.session.timeout to a higher value, like 8h (or larger, depending on your app).
3. Restart Livy to apply the setting: run sudo restart livy-server on the cluster's master node.
4. Test your code again.
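Steps 1 and 2 can also be scripted. The following is only a minimal sketch, assuming livy.conf uses plain `key = value` lines with `#` for comments; the helper name `set_conf_value` and the sample text are hypothetical and not part of Livy. It works on a string, so it can be tried without touching the real file.

```python
import re


def set_conf_value(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with `key = value` set exactly once.

    An existing entry for `key` (active or commented out with '#') is
    replaced in place; if none exists, the entry is appended at the end.
    Assumes a simple `key = value` line format (an assumption, see above).
    """
    # Match "key = ..." or "# key = ...", but not longer keys that merely
    # start with the same prefix (e.g. "key-check = ...").
    pattern = re.compile(r"^\s*#?\s*" + re.escape(key) + r"\s*=.*$")
    new_line = f"{key} = {value}"
    out = []
    replaced = False
    for line in conf_text.splitlines():
        if pattern.match(line):
            if not replaced:
                out.append(new_line)
                replaced = True
            # Any further duplicate entries for the same key are dropped.
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Illustrative sample text, not copied from a real cluster.
sample = (
    "# livy.server.session.timeout = 1h\n"
    "livy.server.session.timeout-check = true\n"
)
print(set_conf_value(sample, "livy.server.session.timeout", "8h"))
```

On a real cluster you would read /etc/livy/conf/livy.conf, pass its contents through the helper, write the result back with root privileges, and then restart Livy as in step 3.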
Source: https://stackoverflow.com/questions/53275693/timeout-error-error-with-400-statuscode-requirement-failed-session-isnt-act