I have a difficult problem.
I am iterating through a set of URLs parameterized by date and fetching each of them. Here is an example of one:
somewebserv
Have you read the Pipelines Getting Started docs? Pipelines can create other pipelines and wait on them, so doing what you want is fairly straightforward:
class RecursivePipeline(pipeline.Pipeline):
    def run(self, param):
        if some_condition:  # Too big to process in one
            p1 = yield RecursivePipeline(param1)
            p2 = yield RecursivePipeline(param2)
            yield RecursiveCombiningPipeline(p1, p2)
Here, RecursiveCombiningPipeline simply acts as a receiver for the values of the two sub-pipelines.
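To illustrate the shape of that recursion without the Pipeline library, here is a minimal plain-Python sketch. The names `process`, `fetch_range`, and `MAX_DAYS` are hypothetical and the fetch is stubbed out; it only mirrors the split-then-combine structure and is not the Pipeline API itself.

```python
from datetime import date, timedelta

MAX_DAYS = 7  # hypothetical threshold for "too big to process in one"


def fetch_range(start, end):
    """Stub standing in for a real fetch; returns the number of days covered."""
    return (end - start).days


def process(start, end):
    """Recursively split [start, end) until each piece is small enough."""
    span = (end - start).days
    if span > MAX_DAYS:  # too big to process in one
        mid = start + timedelta(days=span // 2)
        left = process(start, mid)   # plays the role of the first sub-pipeline
        right = process(mid, end)    # plays the role of the second sub-pipeline
        return left + right          # plays the role of the combining pipeline
    return fetch_range(start, end)


total = process(date(2012, 1, 1), date(2012, 1, 31))
```

In the real Pipeline version each recursive call would be yielded as a child pipeline rather than called directly, so the halves can run independently and the combining step receives their results once both finish.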
Here is an example using the Java Pipeline API:
package com.example;

import com.google.appengine.tools.pipeline.FutureValue;
import com.google.appengine.tools.pipeline.Job1;
import com.google.appengine.tools.pipeline.Job2;
import com.google.appengine.tools.pipeline.Value;

public class PipelineRecursionDemo {

    /**
     * A Job to count the number of letters in a word
     * using recursion
     */
    public static class LetterCountJob extends Job1<Integer, String> {
        public Value<Integer> run(String word) {
            int length = word.length();
            if (length < 2) {
                return immediate(word.length());
            } else {
                int mid = length / 2;
                FutureValue<Integer> first = futureCall(new LetterCountJob(),
                        immediate(word.substring(0, mid)));
                FutureValue<Integer> second = futureCall(new LetterCountJob(),
                        immediate(word.substring(mid, length)));
                return futureCall(new SumJob(), first, second);
            }
        }
    }

    /**
     * An immediate Job to add two integers
     */
    public static class SumJob extends Job2<Integer, Integer, Integer> {
        public Value<Integer> run(Integer x, Integer y) {
            return immediate(x + y);
        }
    }
}
All right, so here's what I did. I had to modify Mitch's solution just a bit, but he definitely got me in the right direction with the advice to return the future value instead of an immediate one.
I had to create an intermediate DummyJob that takes the output of the recursion:
public static class DummyJob extends Job1<Void, List<Void>> {
    @Override
    public Value<Void> run(List<Void> dummies) {
        return null;
    }
}
Then I pass the output of the DummyJob to the Blob Finalizer (the DataWriter job below) inside a waitFor, so that it runs only after all the fetches have completed:
List<FutureValue<Void>> dummies = new ArrayList<FutureValue<Void>>();
for (Interval in : ins) {
    dummies.add(futureCall(new DataFetcher(), immediate(file), immediate(in.getStart()),
            immediate(in.getEnd())));
}
FutureValue<Void> fv = futureCall(new DummyJob(), futureList(dummies));
return futureCall(new DataWriter(), immediate(file), waitFor(fv));
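For readers who want to see the fan-out, barrier, finalize pattern in isolation, here is a rough plain-Python analogy using the standard `concurrent.futures` module. It is not the Pipeline API: `fetch`, `finalize`, `run`, and the interval tuples are hypothetical stand-ins, and `wait` plays the role that futureList, DummyJob, and waitFor play together in the Java code above.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Side effects of the fetch step, standing in for data written to a file.
fetched = []


def fetch(interval):
    """Stub for one fetch; like DataFetcher, it returns nothing useful."""
    fetched.append(interval)


def finalize():
    """Stub for the finalizing step; must only run after every fetch is done."""
    return sorted(fetched)


def run(intervals):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Fan out: one task per interval.
        futures = [pool.submit(fetch, iv) for iv in intervals]
        # Barrier: block until every fetch has completed.
        wait(futures)
        for f in futures:
            f.result()  # re-raise any exception raised inside a fetch
    return finalize()


result = run([(1, 5), (6, 10), (11, 15)])
```

The point of the barrier is the same in both versions: the fetch tasks return no value the finalizer needs, so the only dependency to express is "do not start until all of them are done".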
Thank you Mitch and Nick!!