Akka for REST polling

Submitted by 与世无争的帅哥 on 2019-12-03 16:34:54

Question


I'm trying to interface a large Scala + Akka + PlayMini application with an external REST API. The idea is to periodically poll (basically every 1 to 10 minutes) a root URL and then crawl through sub-level URLs to extract data which is then sent to a message queue.

I have come up with two ways to do this:

1st way

Create a hierarchy of actors to match the resource path structure of the API. In the Google Latitude case, that would mean, for example:

  • Actor 'latitude/v1/currentLocation' polls https://www.googleapis.com/latitude/v1/currentLocation
  • Actor 'latitude/v1/location' polls https://www.googleapis.com/latitude/v1/location
  • Actor 'latitude/v1/location/1' polls https://www.googleapis.com/latitude/v1/location/1
  • Actor 'latitude/v1/location/2' polls https://www.googleapis.com/latitude/v1/location/2
  • Actor 'latitude/v1/location/3' polls https://www.googleapis.com/latitude/v1/location/3
  • etc.

In this case, each actor is responsible for polling its associated resource periodically, as well as creating / deleting child actors for next-level path resources (i.e. actor 'latitude/v1/location' creates actors 1, 2, 3, etc. for all locations it learns about through polling of https://www.googleapis.com/latitude/v1/location).
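The create/delete bookkeeping described above comes down to a set difference between the children an actor already has and the resource IDs returned by its latest poll. Below is a minimal sketch of that step in plain Scala, without Akka. `ChildSync`, `Plan`, and `plan` are hypothetical names chosen for illustration, not part of any library.

```scala
// Hypothetical helper: decides which child actors to start and which to stop
// after one poll of a parent resource such as 'latitude/v1/location'.
object ChildSync {
  // toCreate: IDs seen in the poll that have no child actor yet.
  // toStop:   IDs with a running child actor that no longer appear in the poll.
  final case class Plan(toCreate: Set[String], toStop: Set[String])

  def plan(running: Set[String], polled: Set[String]): Plan =
    Plan(toCreate = polled.diff(running), toStop = running.diff(polled))
}
```

Inside an actor, `running` could presumably be derived from `context.children.map(_.path.name).toSet`. For example, `ChildSync.plan(Set("1", "2", "3"), Set("2", "3", "4"))` yields a plan that creates child 4 and stops child 1.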

2nd way

Create a pool of identical polling actors which receive polling requests (containing the resource path) load-balanced by a router, poll the URL once, do some processing, and schedule polling requests (both for next-level resources and for the polled URL). In Google Latitude, that would mean for instance:

1 router, n poller actors. Initial polling request for https://www.googleapis.com/latitude/v1/location leads to several new (immediate) polling requests for https://www.googleapis.com/latitude/v1/location/1, https://www.googleapis.com/latitude/v1/location/2, etc. and one (delayed) polling request for the same resource, i.e. https://www.googleapis.com/latitude/v1/location.
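The fan-out described above can be written as a pure function from one completed poll to the follow-up requests it produces. The sketch below is plain Scala without Akka. `PollPlanner`, `PollRequest`, and `followUps` are hypothetical names, and the sketch assumes that the IDs of the next-level resources have already been extracted from the response.

```scala
import scala.concurrent.duration._

// Hypothetical model of the polling requests that a router would distribute.
object PollPlanner {
  // delay == Duration.Zero means "poll immediately".
  final case class PollRequest(path: String, delay: FiniteDuration)

  // After polling `path` and finding `childIds`, emit one immediate request per
  // child resource plus one delayed request that re-polls `path` itself.
  def followUps(path: String, childIds: Seq[String], interval: FiniteDuration): Seq[PollRequest] = {
    val children = childIds.map(id => PollRequest(s"$path/$id", Duration.Zero))
    children :+ PollRequest(path, interval)
  }
}
```

In a poller actor, the immediate requests would be sent straight back to the router, and the delayed one would be handed to `scheduleOnce(...)` with its `delay`.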

I have implemented both solutions and cannot immediately observe any relevant difference in performance, at least not for the API and polling frequencies I am interested in. I find the first approach somewhat easier to reason about, and perhaps easier to use with system.scheduler.schedule(...) than the second approach (where I need scheduleOnce(...)). Also, assuming resources are nested through several levels and somewhat short-lived (e.g. several resources may be added or removed between polls), Akka's lifecycle management makes it easy to kill off a whole branch in the 1st case. The second approach should (theoretically) be faster, and the code is somewhat easier to write.

My questions are:

  1. Which approach seems best (in terms of performance, extensibility, code complexity, etc.)?
  2. Do you see anything wrong with the design of either approach (esp. the 1st one)?
  3. Has anyone tried to implement anything similar? How was it done?

Thanks!


Answer 1:


Why not create a master poller, which then kicks off async resource requests on a schedule?

I'm no expert using Akka, but I gave this a shot:

The poller object that iterates through the list of resources to fetch:

import akka.util.duration._
import akka.actor._
import play.api.Play.current
import play.api.libs.concurrent.Akka

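object Poller {
  // Master poller: for every resource name it receives, it spawns a fresh
  // ActingSpider and forwards the name to it.
  // Note: actorOf throws InvalidActorNameException if a spider with the same
  // name is still alive when the next tick arrives.
  val poller = Akka.system.actorOf(Props(new Actor {
    def receive = {
      case x: String => Akka.system.actorOf(Props[ActingSpider], name = x.filter(_.isLetterOrDigit)) ! x
    }
  }))

  // Schedule one repeating message per resource: first after 3 seconds, then every 3 seconds.
  def start(l: List[String]): List[Cancellable] =
    l.map(Akka.system.scheduler.schedule(3 seconds, 3 seconds, poller, _))

  // Cancel a single scheduled poll.
  def stop(c: Cancellable): Unit = c.cancel()
}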

The actor that reads the resource asynchronously and triggers more async reads. If that would be kinder to the API, you could put the message dispatch on a schedule rather than sending immediately:

import akka.actor.{Props, Actor}
import java.io.File

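class ActingSpider extends Actor {
  import context._
  def receive = {
    case name: String => {
      println("reading " + name)
      // A local file stands in for the remote resource in this example.
      new File(name) match {
        case f if f.exists() => spider(f)
        case _ => println("File not found")
      }
      // Caution: stopping this actor also stops the children created in
      // spider(), possibly before they have handled their message.
      context.stop(self)
    }
  }

  // Treat each line of the file as a sub-resource and spawn a child spider for it.
  def spider(file: File): Unit = {
    val source = io.Source.fromFile(file)
    try {
      source.getLines().foreach { l =>
        val k = actorOf(Props[ActingSpider], name = l.filter(_.isLetterOrDigit))
        k ! l
      }
    } finally source.close() // release the file handle
  }
}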
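One thing to watch in the code above: deriving actor names with `filter(_.isLetterOrDigit)` drops the path separators, so two different resources can end up with the same actor name. The sketch below shows the collision and one possible alternative, URL-encoding the resource, which keeps distinct paths distinct. `ActorNames`, `sanitize`, and `encode` are hypothetical names. Whether URL-encoded strings are accepted as actor names should be verified against the Akka version in use.

```scala
import java.net.URLEncoder

// Hypothetical helpers for turning a resource path into an actor name.
object ActorNames {
  // Same sanitization as in the answer: keeps only letters and digits.
  def sanitize(resource: String): String = resource.filter(_.isLetterOrDigit)

  // Alternative (an assumption, not from the answer): URL-encode the path so
  // that separators are escaped rather than dropped.
  def encode(resource: String): String = URLEncoder.encode(resource, "UTF-8")
}
```

For instance, `latitude/v1/location/11` and `latitude/v1/location1/1` sanitize to the same string, while their URL-encoded forms differ.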


Source: https://stackoverflow.com/questions/10654631/akka-for-rest-polling
