Question
I want to run a workflow based on the availability of control files for the previous day. My directory layout is ${basePath}/YYYYMMdd/00/_Complete, and I want to check for the _Complete file inside the 00 directory. My job will run daily on the previous day's data. I tried the options suggested in similar questions, but it is still not working. When I test against same-day data with the instance value below, it works, but not with the (-1) option. Is there any restriction on uri-template formats, i.e. does the path need a fixed form such as path/${YEAR}${MONTH}${DAY}/Complete? Please help.
<instance>${coord:current(0)}</instance>
Here is the dry-run output for my coordinator job.
***coordJob after parsing: ***
<coordinator-app xmlns="uri:oozie:coordinator:0.1" name="my_Scheduler_5f" frequency="1" start="2016-08-17T23:40Z" end="2016-08-19T23:45Z" timezone="America/Los_Angeles" freq_timeunit="DAY" end_of_duration="NONE">
<controls>
<timeout>30</timeout>
</controls>
<input-events>
<data-in name="coordInput_1" dataset="input1">
<dataset name="input1" frequency="1" initial-instance="2016-08-17T00:00Z" timezone="America/Los_Angeles" freq_timeunit="DAY" end_of_duration="NONE">
<uri-template>${nameNode}/myHdfsPath/Finalpath1/${YEAR}${MONTH}${DAY}/00/</uri-template>
<done-flag>_Complete</done-flag>
</dataset>
<instance>${coord:current(-1)}</instance>
</data-in>
<data-in name="coordInput_2" dataset="input2">
<dataset name="input2" frequency="1" initial-instance="2016-08-17T23:00Z" timezone="America/Los_Angeles" freq_timeunit="DAY" end_of_duration="NONE">
<uri-template>${nameNode}/myHdfsPath/Finalpath2/${YEAR}${MONTH}${DAY}/00/</uri-template>
<done-flag>_Complete</done-flag>
</dataset>
<instance>${coord:current(-1)}</instance>
</data-in>
</input-events>
<action>
<workflow>
<app-path>${nameNode}/myHdfsPath/My_POC/wf-app-dir</app-path>
<configuration>
<property>
<name>date</name>
<value>${coord:formatTime(coord:dateOffset(coord:actualTime(),-1,'DAY'), "yyyyMMdd")}</value>
</property>
</workflow>
</action>
</coordinator-app>
***actions for instance***
Answer 1:
I was able to get my job to look for the right _Complete flag by using separate <datasets> and <input-events> elements.
<datasets>
<dataset name="input1" frequency="1" initial-instance="2016-08-17T00:00Z" timezone="America/Los_Angeles" freq_timeunit="DAY" end_of_duration="NONE">
<uri-template>${nameNode}/myHdfsPath/Finalpath1/${YEAR}${MONTH}${DAY}/00/</uri-template>
<done-flag>_Complete</done-flag>
</dataset>
... input2 ...
</datasets>
<input-events>
<data-in name="coordInput_1" dataset="input1">
<instance>${coord:current(-1)}</instance>
</data-in>
... coordInput_2 ...
</input-events>
current(-1) is the part which specifies yesterday (for a daily dataset). In my case, the problem was that I'd copied an example with current(0).
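For reference, here is a minimal sketch of how those pieces might fit together in one coordinator file. The names, paths, and times are copied from the question. Writing the frequency as ${coord:days(1)} is an assumption (the dry-run output only shows the parsed form, frequency="1" with freq_timeunit="DAY"). The inputDir property is a hypothetical name, added only to show how the resolved dataset path could be passed to the workflow through coord:dataIn. Only input1 is shown; input2 would follow the same pattern.

```xml
<!-- Sketch only: values come from the question; "inputDir" is a hypothetical property name. -->
<coordinator-app xmlns="uri:oozie:coordinator:0.1" name="my_Scheduler_5f"
                 frequency="${coord:days(1)}"
                 start="2016-08-17T23:40Z" end="2016-08-19T23:45Z"
                 timezone="America/Los_Angeles">
  <controls>
    <timeout>30</timeout>
  </controls>
  <!-- Datasets are declared once here, outside of input-events. -->
  <datasets>
    <dataset name="input1" frequency="${coord:days(1)}"
             initial-instance="2016-08-17T00:00Z" timezone="America/Los_Angeles">
      <uri-template>${nameNode}/myHdfsPath/Finalpath1/${YEAR}${MONTH}${DAY}/00/</uri-template>
      <done-flag>_Complete</done-flag>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="coordInput_1" dataset="input1">
      <!-- current(-1): one dataset period (one day) before the instance that current(0) selects. -->
      <instance>${coord:current(-1)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${nameNode}/myHdfsPath/My_POC/wf-app-dir</app-path>
      <configuration>
        <property>
          <!-- Assumption: the workflow reads its input directory from this property. -->
          <name>inputDir</name>
          <value>${coord:dataIn('coordInput_1')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

With this layout, the action whose nominal time is 2016-08-18T23:40Z should wait for the _Complete flag under Finalpath1/20160817/00/, since current(0) would select the 2016-08-18 instance and current(-1) the one before it. As far as I understand, an instance that falls before the dataset's initial-instance (as current(-1) does for the very first action here) is not resolved to any path.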
Source: https://stackoverflow.com/questions/39008770/how-to-configure-oozie-coordinator-dataset-for-previous-day