Simple counter example using mapreduce in Google App Engine

小蘑菇 2021-02-06 02:31

I'm somewhat confused about the current state of MapReduce support in GAE. According to the docs at http://code.google.com/p/appengine-mapreduce/ the reduce phase isn't supported yet,

2 answers
  •  后悔当初
    2021-02-06 03:05

    You don't really need a reduce phase. You can accomplish this with a linear task chain, more or less as follows:

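    from google.appengine.ext import db, deferred

    # Car and CarsByColor are assumed to be db.Model classes defined elsewhere.
    def count_colors(limit=100, totals=None, cursor=None):
      # Default to None rather than {}: a mutable default is shared between
      # calls, so stale tallies could leak into a later run.
      if totals is None:
        totals = {}
      query = Car.all()
      if cursor:
        query.with_cursor(cursor)  # resume where the previous task stopped
      cars = query.fetch(limit)
      for car in cars:
        totals[car.color] = totals.get(car.color, 0) + 1
      if len(cars) == limit:
        # A full batch means there may be more cars: chain the next task.
        return deferred.defer(count_colors, limit, totals, query.cursor())
      # Last batch: store one CarsByColor entity per color.
      entities = []
      for color, cars_num in totals.items():
        entity = CarsByColor(key_name=color)
        entity.cars_num = cars_num
        entities.append(entity)
      db.put(entities)

    # Start the chain.
    deferred.defer(count_colors)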
    

    This should iterate over all your cars, pass a query cursor and a running tally to a series of ad-hoc tasks, and store the totals at the end.

    A reduce phase might make sense if you had to merge data from multiple datastores, multiple models, or multiple indexes in a single model. As it stands, I don't think it would buy you anything.
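For comparison, here is a minimal sketch of what such a merge step could look like if you did have several partial tallies to combine. It is plain Python, not the appengine-mapreduce API; the `partials` data and the `merge_tallies` helper are made up for illustration.

```python
from collections import Counter
from functools import reduce

# Hypothetical partial tallies, one per source (made-up data).
partials = [
    Counter({"red": 2, "blue": 1}),
    Counter({"red": 1, "green": 4}),
]

def merge_tallies(left, right):
    """Combine two per-color tallies by summing the counts for each color."""
    return left + right

# The "reduce" step: fold all partial tallies into one set of totals.
totals = reduce(merge_tallies, partials, Counter())
```

With a single model and a single query, as in this question, there is only one tally, so this step has nothing to merge.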

    Another option: use the task queue to maintain live counters for each color. When you create a car, kick off a task to increment the total for that color. When you update a car, kick off one task to decrement the old color and another to increment the new color. Update counters transactionally to avoid race conditions.
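The bookkeeping for that live-counter option can be sketched roughly as follows. This is an in-memory stand-in, not App Engine code: the `counts` dictionary takes the place of the `CarsByColor` entities, the lock takes the place of a datastore transaction, and the function names (`adjust_count`, `on_car_created`, `on_car_recolored`) are hypothetical.

```python
import threading
from collections import Counter

# In-memory stand-ins (assumptions): on App Engine, `counts` would be
# CarsByColor entities and the lock would be a datastore transaction.
counts = Counter()
_lock = threading.Lock()

def adjust_count(color, delta):
    """Apply one counter change atomically (the body of one task)."""
    with _lock:
        counts[color] += delta

def on_car_created(color):
    # One task: increment the total for the new car's color.
    adjust_count(color, 1)

def on_car_recolored(old_color, new_color):
    # Two tasks: decrement the old color, then increment the new one.
    if old_color != new_color:
        adjust_count(old_color, -1)
        adjust_count(new_color, 1)

on_car_created("red")
on_car_created("red")
on_car_created("blue")
on_car_recolored("red", "blue")
```

On App Engine, each `adjust_count` call would presumably be enqueued with `deferred.defer` and do its read-modify-write inside `db.run_in_transaction`, so that concurrent tasks for the same color cannot overwrite each other's updates.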
