Question
I have a job created in Spoon and imported into the DI repository. Without scheduling it with the PDI job scheduler, how can I run a PDI job on a Data Integration Server using REST web services, so that I can call it whenever I want?
Answer 1:
Before beginning these steps, please make sure that your Carte server (or Carte server embedded in the DI server) is configured to connect to the repository for REST calls. The process and description can be found on the wiki page. Note that the repositories.xml needs to be defined and in the appropriate location for the DI Server as well.
Method 1 (run the job and continue, no status checks):
Start a PDI Job (/home/admin/Job 1):
curl -L "http://admin:password@localhost:9080/pentaho-di/kettle/runJob?job=/home/admin/Job%201" 2> /dev/null | xmllint --format -
Method 2 (run the job and poll its status regularly):
Generate a login cookie:
curl -d "j_username=admin&j_password=password&locale=en_US" -c cookies.txt http://localhost:9080/pentaho-di/j_spring_security_check
Check DI Server status:
curl -L -b cookies.txt "http://localhost:9080/pentaho-di/kettle/status?xml=Y" | xmllint --format -
Result:
<?xml version="1.0" encoding="UTF-8"?>
<serverstatus>
<statusdesc>Online</statusdesc>
<memory_free>850268568</memory_free>
<memory_total>1310720000</memory_total>
<cpu_cores>4</cpu_cores>
<cpu_process_time>22822946300</cpu_process_time>
<uptime>100204</uptime>
<thread_count>59</thread_count>
<load_avg>-1.0</load_avg>
<os_name>Windows 7</os_name>
<os_version>6.1</os_version>
<os_arch>amd64</os_arch>
<transstatuslist>
<transstatus>
<transname>Row generator test</transname>
<id>de44a94e-3bf7-4369-9db1-1630640e97e2</id>
<status_desc>Waiting</status_desc>
<error_desc/>
<paused>N</paused>
<stepstatuslist>
</stepstatuslist>
<first_log_line_nr>0</first_log_line_nr>
<last_log_line_nr>0</last_log_line_nr>
<logging_string><![CDATA[]]></logging_string>
</transstatus>
</transstatuslist>
<jobstatuslist>
</jobstatuslist>
</serverstatus>
Start a PDI Job (/home/admin/Job 1):
curl -L -b cookies.txt "http://localhost:9080/pentaho-di/kettle/runJob?job=/home/admin/Job%201" | xmllint --format -
Result:
<webresult>
<result>OK</result>
<message>Job started</message>
<id>dd419628-3547-423f-9468-2cb5ffd826b2</id>
</webresult>
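The id in this response is what the later jobStatus call needs, so it can be handy to capture it in a variable rather than copying it by hand. A minimal sketch, assuming the response has the webresult shape shown above; the function name extract_job_id is hypothetical, not part of any Pentaho tooling:

```shell
# Read a <webresult> document on stdin and print the text of its <id> element.
# extract_job_id is a hypothetical helper name chosen for this example.
extract_job_id() {
  xmllint --xpath "string(/webresult/id)" -
}
```

It could then be used as: JOB_ID=$(curl -s -L -b cookies.txt "http://localhost:9080/pentaho-di/kettle/runJob?job=/home/admin/Job%201" | extract_job_id)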
Check the job's status:
curl -L -b cookies.txt "http://localhost:9080/pentaho-di/kettle/jobStatus?name=/home/admin/Job%201&id=dd419628-3547-423f-9468-2cb5ffd826b2&xml=Y" | xmllint --format -
Result:
<?xml version="1.0" encoding="UTF-8"?>
<jobstatus>
<jobname>Job 1</jobname>
<id>dd419628-3547-423f-9468-2cb5ffd826b2</id>
<status_desc>Finished</status_desc>
<error_desc/>
<logging_string><![CDATA[H4sIAAAAAAAAADMyMDTRNzDUNzJSMDSxMjawMrZQ0FXwyk9SMATSwSWJRSUK+WkKWUCB1IrU5NKSzPw8LiPCmjLz0hVS80qKKhWiXUJ9fSNjSdQUXJqcnFpcTEibW2ZeZnFGagrEgahaFTSKUotLc0pso0uKSlNjNckwCuJ0Eg3yQg4rhTSosVwABykpF2oBAAA=]]></logging_string>
<first_log_line_nr>0</first_log_line_nr>
<last_log_line_nr>13</last_log_line_nr>
<result>
<lines_input>0</lines_input>
<lines_output>0</lines_output>
<lines_read>0</lines_read>
<lines_written>0</lines_written>
<lines_updated>0</lines_updated>
<lines_rejected>0</lines_rejected>
<lines_deleted>0</lines_deleted>
<nr_errors>0</nr_errors>
<nr_files_retrieved>0</nr_files_retrieved>
<entry_nr>0</entry_nr>
<result>Y</result>
<exit_status>0</exit_status>
<is_stopped>N</is_stopped>
<log_channel_id/>
<log_text>null</log_text>
<result-file/>
<result-rows/>
</result>
</jobstatus>
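The logging_string value above starts with "H4sI", which is how gzip data looks once it is Base64-encoded, so the log text appears to be gzip-compressed and then Base64-encoded. A minimal sketch for reading it, under that assumption; the function name decode_carte_log is hypothetical:

```shell
# Read a Base64 string on stdin, decode it, and decompress the gzip payload.
# Assumption: logging_string is gzip data that was Base64-encoded.
# decode_carte_log is a hypothetical helper name chosen for this example.
decode_carte_log() {
  base64 --decode | gunzip -c
}
```

For example, piping the jobStatus response through xmllint --xpath "string(/jobstatus/logging_string)" - and then through decode_carte_log should print the job log as plain text.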
Get the status description from the jobStatus API:
curl -L -b cookies.txt "http://localhost:9080/pentaho-di/kettle/jobStatus?name=/home/admin/Job%201&id=dd419628-3547-423f-9468-2cb5ffd826b2&xml=Y" 2> /dev/null | xmllint --xpath "string(/jobstatus/status_desc)" -
Result:
Finished
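To poll regularly, the status check can be wrapped in a loop. The following is a sketch rather than a definitive implementation: the function names fetch_status and wait_for_job are hypothetical, the URL, cookie file, and job path are the ones used in the steps above, and it assumes that a status beginning with "Finished" or "Stopped" means the job is no longer running.

```shell
# Values taken from the steps above; adjust for your server.
BASE_URL="http://localhost:9080/pentaho-di/kettle"
JOB_PATH="/home/admin/Job%201"

# Print the current status_desc for the job id given as $1.
fetch_status() {
  curl -s -L -b cookies.txt "${BASE_URL}/jobStatus?name=${JOB_PATH}&id=$1&xml=Y" |
    xmllint --xpath "string(/jobstatus/status_desc)" -
}

# Poll the job with id $1 every $2 seconds (default 5), at most $3 times
# (default 60). Prints the final status and returns 0 once the job ends.
# Assumption: a status starting with "Finished" or "Stopped" is terminal.
wait_for_job() {
  local id interval max_tries status i
  id="$1"
  interval="${2:-5}"
  max_tries="${3:-60}"
  status=""
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    status=$(fetch_status "$id")
    case "$status" in
      Finished*|Stopped*)
        echo "$status"
        return 0
        ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "Timed out; last status: ${status}" >&2
  return 1
}
```

A call such as wait_for_job dd419628-3547-423f-9468-2cb5ffd826b2 10 30 would check every 10 seconds for up to 30 attempts. The attempt limit is there so that a job that never reaches a terminal status does not keep the script waiting forever.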
PS: curl and libxml2-utils were installed via apt-get. The libxml2-utils package is optional, used solely for formatting XML output from the DI Server. This shows how to start a PDI job using a Bash shell.
Supported in version 5.3 and later.
Source: https://stackoverflow.com/questions/29437154/run-pdi-jobs-using-web-services