I am creating an EMR cluster and using a Jupyter notebook to run some Spark tasks. My tasks die after approximately 1 hour of execution, and the error is:
An error was encountered:
Invalid status code '400' from https://xxx.xx.x.xxx:18888/sessions/0/statements/20 with error payload: "requirement failed: Session isn't active."
My understanding is that this is related to the Livy config livy.server.session.timeout, but I don't know how to set it in the bootstrap of the cluster (I need to do it in the bootstrap because the cluster is created with no SSH access).
Thanks a lot in advance
On EMR, livy-conf is the classification for the properties in Livy's livy.conf file. When creating an EMR cluster, choose advanced options, select Livy as an application to install, and pass this EMR configuration in the Enter Configuration field:
[{"Classification": "livy-conf", "Properties": {"livy.server.session.timeout": "5h"}}]
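If the cluster is created from a script rather than the console, the same configuration can be generated programmatically. The following is a minimal sketch, not an official recipe: the helper name livy_conf_classification is hypothetical, and it only builds and prints the JSON shown above.

```python
import json


def livy_conf_classification(session_timeout: str) -> list:
    """Build the EMR configuration list that sets the Livy session timeout.

    The helper name is hypothetical; the classification and property
    names are the ones given in the answer above.
    """
    return [
        {
            "Classification": "livy-conf",
            "Properties": {"livy.server.session.timeout": session_timeout},
        }
    ]


if __name__ == "__main__":
    # Print JSON that can be pasted into the configuration field.
    print(json.dumps(livy_conf_classification("5h"), indent=2))
```

The same list should also be usable as the Configurations argument of boto3's run_job_flow, or saved to a file and passed to aws emr create-cluster via --configurations, which avoids needing SSH access to the cluster.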
- On EMR, Livy's configuration lives under /etc/livy/, so the config file is at /etc/livy/conf/livy.conf
To verify this:
- Create an EMR cluster with a known EC2 key pair, Livy, and the above config.
- Using the EC2 key pair, log in to the EC2 master node associated with the cluster:
  ssh -i some-ec2-key-pair.pem hadoop@ec2-00-00-00-0.ca-region-n.compute.amazonaws.com
- Navigate to /etc/livy/conf, run vim livy.conf, and see the updated value of livy.server.session.timeout.
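The check in the last step can also be scripted instead of opening the file in an editor. This is a rough sketch under assumptions: the function name read_property is made up for illustration, and it assumes livy.conf uses Java-properties-style lines where the key and value are separated by "=", ":" or whitespace.

```python
import re
from typing import Optional


def read_property(conf_text: str, key: str) -> Optional[str]:
    """Return the last value assigned to `key`, or None if it is not set.

    Assumes properties-style lines: "key = value", "key: value" or "key value".
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Split on the first '=', ':' or run of whitespace.
        parts = re.split(r"\s*[=:]\s*|\s+", line, maxsplit=1)
        if len(parts) == 2 and parts[0] == key:
            value = parts[1].strip()
    return value


# On the master node, the text would come from the real file, e.g.:
#   with open("/etc/livy/conf/livy.conf") as f:
#       print(read_property(f.read(), "livy.server.session.timeout"))
```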
Another way to do this, if you don't want to recreate the cluster: open /etc/livy/conf/livy.conf and set the livy.server.session.timeout property to the value you would like.
After that, run sudo restart livy-server to apply the configuration. On newer EMR releases that use systemd, the equivalent is sudo systemctl restart livy-server.
If you don't want the Livy session to time out at all, set the property livy.server.session.timeout-check to false in /etc/livy/conf/livy.conf.
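For the manual edits to livy.conf described in these answers, a small helper can make the change repeatable. This is only a sketch: set_property is a hypothetical name, it works on the file's text rather than the file itself, and it assumes properties-style "key = value" lines. Livy still has to be restarted afterwards for the change to take effect.

```python
import re


def set_property(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with `key` set to `value`.

    Replaces the first uncommented line that sets `key`, drops later
    duplicates, and appends the property if it was not present.
    """
    new_line = f"{key} = {value}"
    out = []
    replaced = False
    for raw in conf_text.splitlines():
        stripped = raw.strip()
        # Compare the exact key so "timeout" does not match "timeout-check".
        first_token = re.split(r"[=:\s]", stripped, maxsplit=1)[0] if stripped else ""
        if not stripped.startswith("#") and first_token == key:
            if not replaced:
                out.append(new_line)
                replaced = True
            continue
        out.append(raw)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Example: raise the timeout, then disable the timeout check entirely.
conf = "livy.server.port 8998\nlivy.server.session.timeout 1h\n"
conf = set_property(conf, "livy.server.session.timeout", "5h")
conf = set_property(conf, "livy.server.session.timeout-check", "false")
```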
Source: https://stackoverflow.com/questions/54220381/how-to-set-livy-server-session-timeout-on-emr-cluster-boostrap