Submitting spark Jobs over livy using curl

旧时模样 提交于 2020-01-11 12:57:47

问题


I'm submitting spark jobs on a livy (0.6.0) session through Curl

The jobs are a big jar file that extends the Job interface just exactly like this : https://stackoverflow.com/a/49220879/8557851

Actually when running this code using this curl command :

curl -X POST -d '{"kind": "spark","files":["/config.json"],"jars":["/myjar.jar"],"driverMemory":"512M","executorMemory":"512M"}' -H "Content-Type: application/json" localhost:8998/sessions/

When it comes to the code it is exactly like the answer shown above :

package com.mycompany.test
import org.apache.livy.{Job, JobContext}
import org.apache.spark._
import org.apache.livy.scalaapi._

object Test extends Job[Boolean]{
  override def call(jc: JobContext): Boolean = {
  val sc = jc.sc
  sc.getConf.getAll.foreach(println)
  return true
}

As for the error it is a java Nullpointer exception as shown below

Exception in thread "main" java.lang.NullPointerException
    at org.apache.livy.rsc.driver.JobWrapper.cancel(JobWrapper.java:90)
    at org.apache.livy.rsc.driver.RSCDriver.shutdown(RSCDriver.java:127)
    at org.apache.livy.rsc.driver.RSCDriver.run(RSCDriver.java:356)
    at org.apache.livy.rsc.driver.RSCDriverBootstrapper.main(RSCDriverBootstrapper.java:93)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

as the output excepted is to start running the job in the jar


回答1:


I've used livy REST apis and with respect to that there are 2 approaches to submit spark job. Please refer rest api docs, you will get fair understanding livy rest requests.:

1. Batch (/batches) :
You submit request, you get job id. Based on job id you poll for status of spark job. Here you have option to execute uber jar as well as code file but I've never used latter

2. Session (/sessions and /sessions/{sessionId}/statements):
You submit spark job as code, no need to create uber jar. Here, you first create a Session and in this session you execute Statement/s (actual code)

For both the approaches, if you check documentation it has nice explanation about corresponding rest requests and request body/parameters.

Examples/Samples are here, here

Correction to your code, would be:

Batch

curl \
  -X POST \
  -d '{
    "kind": "spark",
    "files": [
      "<use-absolute-path>"
    ],
    "file": "absolute-path-to-your-application-jar",
    "className": "fully-qualified-spark-class-name",
    "driverMemory": "512M",
    "executorMemory": "512M",
    "conf": {<any-other-configs-as-key-val>}
  }' \
  -H "Content-Type: application/json" \
  localhost:8998/batches/

Session and Statement

// Create a session
curl \
  -X POST \
  -d '{
    "kind": "spark",
    "files": [
      "<use-absolute-path>"
    ],
    "driverMemory": "512M",
    "executorMemory": "512M",
    "conf": {<any-other-configs-as-key-val>}
  }' \
  -H "Content-Type: application/json" \
  localhost:8998/sessions/

// Run code/statement in session created above
curl \
  -X POST \
  -d '{
    "kind": "spark",
    "code": "spark-code"
  }' \
  -H "Content-Type: application/json" \
  localhost:8998/sessions/{sessionId}/statements


来源:https://stackoverflow.com/questions/57754668/submitting-spark-jobs-over-livy-using-curl

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!