How to run a Zeppelin notebook from the command line (automatically)

别那么骄傲 2021-01-12 23:50
  1. How do we run the notebook from the command line?

  2. Further to 1, how would I pass command line arguments into the notebook? I.e. access the command line

2 answers
  • 2021-01-13 00:43

    I had the same issue and worked out how to use the REST API to run a notebook with curl. As for passing in command-line arguments, I think there is simply no way to do that; you will have to use some sort of shared state on the server (e.g. have the notebook read from a file, and modify that file before each run).
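A minimal sketch of the shared-file idea, assuming the script runs on the Zeppelin host (or writes to storage the interpreter can read). The function name `write_params` and the `key=value` file layout are hypothetical; the notebook would need a paragraph of its own that reads and parses the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: store key=value command-line arguments in a file
# that the notebook reads when it runs.
write_params() {
    local params_file="$1"
    shift
    : > "${params_file}"    # start from an empty file on every run
    local arg
    for arg in "$@"; do
        case ${arg} in
            *=*) printf '%s\n' "${arg}" >> "${params_file}" ;;
            *)   echo "skipping malformed argument: ${arg}" >&2 ;;
        esac
    done
}
```

For example, `write_params /tmp/notebook_params.txt day=2021-01-12 mode=full` (the path is only an illustration) would be called before triggering the notebook run.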

    Anyway, this is how I managed to run a notebook. It assumes jq is installed. Pretty involved :(

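    # ip and notebook_id must be set before running these commands.

    # Optional: list the interpreter setting ids to inspect them
    curl -s -XGET "http://${ip}:8080/api/interpreter/setting" | jq '.body[] | .id'

    # Collect the same ids as a compact JSON array, e.g. ["id1","id2"]
    id_array=$(curl -s -XGET "http://${ip}:8080/api/interpreter/setting" | jq -c '[.body[] | .id]')

    # Bind those interpreters to the notebook (quoted so the JSON is passed as one argument)
    curl -XPUT -d "${id_array}" "http://${ip}:8080/api/notebook/interpreter/bind/${notebook_id}"

    # Run all paragraphs of the notebook
    curl -XPOST "http://${ip}:8080/api/notebook/job/${notebook_id}"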
    

    If someone has manually clicked the "save" button for the interpreter binding then only the last command is required.

    UPDATE:

    I think you can poll the status of the running notebook in a loop to determine whether it finished or failed; see: https://github.com/eBay/Zeppelin/blob/master/docs/rest-api/rest-notebook.md

    For example:

    # Print the status of every paragraph, one per line
    function paragraph_statuses {
        curl -s -XGET "http://${ip}:8080/api/notebook/job/${notebook_id}" | jq -r '.body[] | .status'
    }

    # Succeeds when at least one status was returned and all of them are FINISHED
    function job_success {
        local statuses
        statuses=$(paragraph_statuses)
        [ -n "${statuses}" ] && ! grep -qv '^FINISHED$' <<< "${statuses}"
    }

    # Succeeds when any paragraph reports ERROR
    function job_fail {
        paragraph_statuses | grep -q '^ERROR$'
    }

    # Poll every 10 seconds until the notebook has finished or failed
    until job_success || job_fail
    do
        sleep 10
    done
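
The polling loop above waits forever if the notebook hangs or the server stops answering. A possible variant with a poll limit is sketched below. The function names, the limit of 60 polls, and the exit codes are hypothetical, and treating `ABORT` as a failure is an assumption about the statuses the server reports. It expects `ip` to be set and uses `jq -r` so that statuses arrive without quotes.

```shell
#!/usr/bin/env bash
# Reads one paragraph status per line on stdin and prints
# "failed", "finished" or "running".
classify_statuses() {
    local status total=0 finished=0
    while IFS= read -r status; do
        [ -n "${status}" ] || continue
        total=$((total + 1))
        case ${status} in
            ERROR|ABORT) echo failed; return ;;   # ABORT handling is an assumption
            FINISHED)    finished=$((finished + 1)) ;;
        esac
    done
    # An empty status list (e.g. the request failed) counts as still running.
    if [ "${total}" -gt 0 ] && [ "${total}" -eq "${finished}" ]; then
        echo finished
    else
        echo running
    fi
}

# Returns 0 on success, 1 on failure, 2 if the poll limit is reached.
wait_for_notebook() {
    local notebook_id="$1" max_polls="${2:-60}" polls=0 state
    while [ "${polls}" -lt "${max_polls}" ]; do
        state=$(curl -s -XGET "http://${ip}:8080/api/notebook/job/${notebook_id}" |
            jq -r '.body[] | .status' | classify_statuses)
        case ${state} in
            finished) return 0 ;;
            failed)   return 1 ;;
        esac
        polls=$((polls + 1))
        sleep 10
    done
    return 2
}
```

Keeping the classification separate from the HTTP call means it can be checked with canned input, without a running server.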
    
  • 2021-01-13 00:44

    As of version 0.7.3 and perhaps earlier, Zeppelin has a REST API that lets you run notebooks. Your shell script can use curl to access the API.

    The API includes methods to delete a paragraph and to insert a paragraph at a particular index. This allows you to express all your "parameters" as variables in paragraph 0 and then use them in later paragraphs. Make 3 calls to the REST API in this order:

    1. Delete the notebook's current paragraph 0.
    2. Insert a new paragraph containing variable assignments at index 0.
    3. Run the notebook.
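
The three calls above might look roughly like the sketch below. The endpoint paths and the `index` field reflect the Zeppelin notebook REST API as I understand it and should be checked against the documentation for your server version. The function names and the `parameters` title are hypothetical. The sketch expects `ip` to be set, needs jq to read the first paragraph's id, and assumes the notebook already reserves paragraph 0 for parameters, since that paragraph is deleted on every run.

```shell
#!/usr/bin/env bash
base_url="http://${ip:-localhost}:8080/api/notebook"

# Minimal JSON string escaping: backslashes, double quotes and newlines only.
# Kept dependency-free here; building the payload with jq would be more robust.
json_escape() {
    printf '%s' "$1" |
        sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' |
        awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build the request body for inserting a paragraph at index 0.
param_paragraph_payload() {
    printf '{"title":"parameters","text":"%s","index":0}' "$(json_escape "$1")"
}

# Replace paragraph 0 with the given text, then run the whole notebook.
run_with_params() {
    local notebook_id="$1" text="$2" first_id
    first_id=$(curl -s -XGET "${base_url}/${notebook_id}" | jq -r '.body.paragraphs[0].id')
    if [ -z "${first_id}" ] || [ "${first_id}" = null ]; then
        echo "could not find paragraph 0" >&2
        return 1
    fi
    # 1. Delete the current paragraph 0.
    curl -s -XDELETE "${base_url}/${notebook_id}/paragraph/${first_id}" &&
    # 2. Insert the new parameter paragraph at index 0.
    curl -s -XPOST -H 'Content-Type: application/json' \
        -d "$(param_paragraph_payload "${text}")" \
        "${base_url}/${notebook_id}/paragraph" &&
    # 3. Run the notebook.
    curl -s -XPOST "${base_url}/job/${notebook_id}"
}
```

The paragraph text is whatever variable assignments suit the notebook's interpreter; for a Scala paragraph it could be something like `val day = "2021-01-12"`, passed as the second argument to `run_with_params`.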