Apache Livy cURL not working for spark-submit command

Submitted by 白昼怎懂夜的黑 on 2019-12-08 20:05:39
hdfs://localhost:9001/jar/project.jar.

Livy expects your jar file to be located on HDFS.

If the file is local, specify the protocol in the path, or just upload the jar to HDFS:

 "file": "file:///absolute_path/jar/project.jar",

You have to build a fat jar containing your codebase plus the necessary dependency jars (with sbt assembly, or the Maven assembly/shade plugin), upload that jar to HDFS, and then run spark-submit against the jar on HDFS; you can use cURL for the submission as well.

Steps with Scala/Java:

  1. Make fat jar with SBT/Maven or whatever.
  2. Upload the fat jar to HDFS.
  3. Use cURL for submitting jobs:

curl -X POST --data '{"file": "hdfs://.../project.jar", "className": "..."}' -H "Content-Type: application/json" your_ip:8998/batches
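For reference, the request body can be built programmatically before passing it to cURL. A minimal sketch in Python; the jar path, class name, and arguments below are placeholders, not values from the original question:

```python
import json

# Hypothetical values: replace with your own jar path and main class.
payload = {
    "file": "hdfs://localhost:9001/jar/project.jar",  # jar uploaded to HDFS
    "className": "com.example.Main",                  # assumed main class name
    "args": ["arg1", "arg2"],                         # optional program arguments
}

# This JSON string is what goes into curl's --data. Note that JSON itself
# does not allow comments, so the body must be pure JSON.
body = json.dumps(payload)
print(body)
```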

If you don't want to build a fat jar file and upload it to HDFS, you can consider Python scripts: plain code can be submitted as text, without any jar file.

An example with plain Python code:

curl your_ip:8998/sessions/0/statements -X POST -H 'Content-Type: application/json' -d '{"code":"print(\"asdf\")"}'
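Note that the `/sessions/0/statements` endpoint assumes an interactive session with id 0 already exists; you create one first with a POST to `/sessions`. A sketch of the two request bodies in Python (the endpoint paths come from the Livy REST API; the host is a placeholder):

```python
import json

host = "http://your_ip:8998"  # placeholder host, as in the curl examples above

# 1) Create an interactive session; "kind" selects the interpreter.
#    curl -X POST -H 'Content-Type: application/json' \
#         -d '{"kind": "pyspark"}' http://your_ip:8998/sessions
session_body = json.dumps({"kind": "pyspark"})

# 2) Once the session state is "idle", submit a statement to it.
#    curl http://your_ip:8998/sessions/0/statements -X POST \
#         -H 'Content-Type: application/json' -d '{"code": "print(\"asdf\")"}'
statement_body = json.dumps({"code": 'print("asdf")'})

print(session_body)
print(statement_body)
```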

In the data body, you have to send valid Python code. This is the way tools like Jupyter Notebook/Torch work.

Also, here is one more example with Livy and Python, for checking results:

curl your_ip:8998/sessions/0/statements/1
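That GET returns the statement as JSON. The sketch below shows the typical shape of such a response (the values are illustrative, not captured from a real server); the printed result sits under `output.data`:

```python
import json

# Illustrative Livy statement response; real responses follow this shape.
raw = '''{
  "id": 1,
  "state": "available",
  "output": {
    "status": "ok",
    "execution_count": 1,
    "data": {"text/plain": "asdf"}
  }
}'''

stmt = json.loads(raw)
# A statement is finished when its state is "available"; the result of
# the submitted code is then available under output.data.
if stmt["state"] == "available" and stmt["output"]["status"] == "ok":
    print(stmt["output"]["data"]["text/plain"])
```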

As I mentioned above, for Scala/Java a fat jar and uploading it to HDFS are required.

Kanan Totawar

To use local files for Livy batch jobs, you need to add the local folder to the livy.file.local-dir-whitelist property in livy.conf.

Description from livy.conf.template:

List of local directories from where files are allowed to be added to user sessions. By default it's empty, meaning users can only reference remote URIs when starting their sessions.
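A minimal sketch of the two pieces that have to line up (the directory is a placeholder):

```
# livy.conf — whitelist the local directory that holds the jar
livy.file.local-dir-whitelist = /absolute_path/jar

# The batch request can then reference a local file under that directory:
# "file": "file:///absolute_path/jar/project.jar"
```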
