Question
I'm trying to use Akka Streams to build a pub-sub bus in the following way:
A publisher adds a source stream for a topic, and subscribers specify a topic and receive everything published to it. However, a topic may be published by multiple publishers, and both publishers and subscribers can join at any point.
What I had in mind is to combine all sources and then return the filtered source to a subscriber.
However, since publishers may join at any point, sources may be added after a subscription has been made, and the subscriber needs to receive data from them like any other data published for the topic.
Is there a way to manage the merging of streams into a source dynamically, such that the following holds:
publish(topic: String, messages: Source[T])
subscribe(topic: String): Source[T]
That is, regardless of when a publisher is added, a subscriber to a topic will get all messages published to any source related to the topic after the subscription is made.
Happy to hear about alternative approaches too.
Thanks, Z
Answer 1:
You might want to take a look at this Akka doc re: building a dynamic pub-sub service using MergeHub and BroadcastHub.
Here's sample code for using a MergeHub and a BroadcastHub as dynamic fan-in and fan-out junctions, respectively.
The idea is to connect a MergeHub with a BroadcastHub to form a pub-sub channel in the form of a Flow via Flow.fromSinkAndSource:
val (bfSink, bfSource) = MergeHub.source[String](perProducerBufferSize)
  .toMat(BroadcastHub.sink[String](bufferSize))(Keep.both)
  .run()
val busFlow: Flow[String, String, NotUsed] = Flow.fromSinkAndSource(bfSink, bfSource)
Note that Keep.both in the above snippet produces a tuple of materialized values (Sink[T, NotUsed], Source[T, NotUsed]) from MergeHub.source[T] and BroadcastHub.sink[T], which have the following method signatures:
object MergeHub {
  def source[T](perProducerBufferSize: Int): Source[T, Sink[T, NotUsed]] = // ...
  // ...
}
object BroadcastHub {
  def sink[T](bufferSize: Int): Sink[T, Source[T, NotUsed]] = // ...
  // ...
}
Below is sample code for a simple pub-sub channel busFlow (similar to the example in the Akka doc):
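import akka.actor.ActorSystem
import akka.stream._
import akka.stream.scaladsl._
import akka.NotUsed
import scala.concurrent.duration._  // needed for 2.seconds etc. in the snippets that follow

implicit val system = ActorSystem("system")
// Akka 2.5 style; on Akka 2.6+ the implicit ActorSystem provides the materializer
// and ActorMaterializer is deprecated
implicit val materializer = ActorMaterializer()
implicit val ec = system.dispatcher

// Materialize the hub pair: producers attach to bfSink, consumers to bfSource
val (bfSink, bfSource) = MergeHub.source[String](perProducerBufferSize = 32)
  .toMat(BroadcastHub.sink[String](bufferSize = 256))(Keep.both)
  .run()

// Optional: avoid building up backpressure when there are no subscribers
bfSource.runWith(Sink.ignore)

val busFlow: Flow[String, String, NotUsed] = Flow.fromSinkAndSource(bfSink, bfSource)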
Testing busFlow:
// Each stream below is both a producer (its Source) and a consumer (its Sink) of the bus
Source(101 to 103).map(i => s"Batch(A)-$i")
  .delay(2.seconds, DelayOverflowStrategy.backpressure)
  .viaMat(busFlow)(Keep.right)
  .to(Sink.foreach { case s: String => println("Consumer(1)-" + s) })
  .run()
Source(104 to 105).map(i => s"Batch(B)-$i")
  .viaMat(busFlow)(Keep.right)
  .to(Sink.foreach { case s: String => println("Consumer(2)-" + s) })
  .run()
// Consumer(2)-Batch(B)-104
// Consumer(2)-Batch(B)-105
// Consumer(1)-Batch(B)-104
// Consumer(1)-Batch(B)-105
// Consumer(1)-Batch(A)-101
// Consumer(1)-Batch(A)-102
// Consumer(2)-Batch(A)-101
// Consumer(1)-Batch(A)-103
// Consumer(2)-Batch(A)-102
// Consumer(2)-Batch(A)-103
Serving as a pub-sub channel, busFlow publishes its input via bfSink to all subscribers, while its output streams all the published elements through bfSource. For example:
val p1 = Source.tick[Int](0.seconds, 5.seconds, 5).map(_.toString)
p1.runWith(bfSink)
val p2 = Source.tick[Int](2.seconds, 10.seconds, 10).map(_.toString)
p2.runWith(bfSink)
val s1 = bfSource
s1.runForeach(x => println(s"s1 --> $x"))
val s2 = bfSource
s2.runForeach(x => println(s"s2 --> $x"))
// s1 --> 5
// s2 --> 5
// s1 --> 10
// s2 --> 10
// s2 --> 5
// s1 --> 5
// s2 --> 5
// s1 --> 5
// s1 --> 10
// s2 --> 10
// s2 --> 5
// s1 --> 5
// ...
Other relevant topics that might be of interest include KillSwitch for stream completion control and PartitionHub for routing stream elements from a given producer to a dynamic set of consumers.
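To connect the hub pair back to the publish/subscribe signatures in the question, here is a minimal sketch that keeps one MergeHub/BroadcastHub pair per topic. TopicBus, its channel helper and the buffer sizes are my own hypothetical names and values, not part of Akka, and the class takes an implicit Materializer so it should fit both the ActorMaterializer setup above and newer Akka versions. Treat it as an illustration under those assumptions rather than a finished design.

```scala
import akka.NotUsed
import akka.stream.Materializer
import akka.stream.scaladsl.{BroadcastHub, Keep, MergeHub, Sink, Source}
import scala.collection.mutable

// Hypothetical helper (not an Akka API): one MergeHub/BroadcastHub pair per topic.
final class TopicBus[T](implicit mat: Materializer) {
  private type Channel = (Sink[T, NotUsed], Source[T, NotUsed])
  private val channels = mutable.Map.empty[String, Channel]

  // Look up the hub pair for a topic, materializing it on first use.
  private def channel(topic: String): Channel = channels.synchronized {
    channels.getOrElseUpdate(topic, {
      val (sink, source) = MergeHub.source[T](perProducerBufferSize = 32)
        .toMat(BroadcastHub.sink[T](bufferSize = 256))(Keep.both)
        .run()
      // Drain the hub so publishers are not backpressured while nobody subscribes.
      source.runWith(Sink.ignore)
      (sink, source)
    })
  }

  // Attach a publisher's stream to the topic's fan-in side.
  def publish(topic: String, messages: Source[T, Any]): Unit = {
    messages.runWith(channel(topic)._1)
    ()
  }

  // Each materialization of the returned Source is an independent subscriber.
  def subscribe(topic: String): Source[T, NotUsed] = channel(topic)._2
}
```

With this sketch, a subscriber only sees elements emitted after its Source has been materialized, which matches the "after the subscription is made" requirement; anything published earlier goes to the draining Sink.ignore. Publishers and subscribers can be added in any order, because the hub pair for a topic is created by whichever side asks for it first.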
Answer 2:
Here's what I ended up doing. Both publishers and subscribers can come and go, and regardless of when a subscriber or a publisher joins, the subscriber should see all messages published for its subscription (by topic), irrespective of which publishers were active when the subscription was made. Comments are welcome.
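import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ActorMaterializerSettings, OverflowStrategy}
import akka.stream.scaladsl.{Keep, Sink, Source}
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._

def main(args: Array[String]): Unit = {
  val actorSystem = ActorSystem("test")
  val materializerSettings = ActorMaterializerSettings(actorSystem)
  implicit val materializer = ActorMaterializer(materializerSettings)(actorSystem)
  implicit val ec: ExecutionContext = actorSystem.dispatcher
  // Shared queue feeding a fan-out Publisher; the oldest buffered element is dropped on overflow
  val (queue, pub) = Source.queue[Int](100, OverflowStrategy.dropHead)
    .toMat(Sink.asPublisher(fanout = true))(Keep.both)
    .run()
  // Publishers: forward every element into the shared queue
  val p1 = Source.tick[Int](0.seconds, 5.seconds, 5)
  p1.runForeach(x => queue.offer(x))
  val p2 = Source.tick[Int](2.seconds, 10.seconds, 10)
  p2.runForeach(x => queue.offer(x))
  // Subscribers: each one attaches its own Source to the fan-out Publisher
  val s1 = Source.fromPublisher(pub)
  s1.runForeach(x => println(s"s1 =======> ${x}"))
  val s2 = Source.fromPublisher(pub)
  s2.runForeach(x => println(s"s2 =======> ${x}"))
}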
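The code above shows a single untagged stream, so the "by topic" part is not visible in it. One way it might be layered on the same queue-plus-fan-out-publisher idea is to tag each element with its topic and filter on the subscriber side. In the sketch below, Envelope and QueueBus are hypothetical names I am introducing for illustration, and the buffer size of 100 and the dropHead strategy are simply carried over from the code above.

```scala
import akka.{Done, NotUsed}
import akka.stream.{Materializer, OverflowStrategy}
import akka.stream.scaladsl.{Keep, Sink, Source}
import scala.concurrent.Future

// Hypothetical envelope that tags each message with its topic.
final case class Envelope[T](topic: String, payload: T)

// Hypothetical wrapper around the queue + fan-out publisher approach.
final class QueueBus[T](implicit mat: Materializer) {
  private val (queue, publisher) =
    Source.queue[Envelope[T]](100, OverflowStrategy.dropHead)
      .toMat(Sink.asPublisher[Envelope[T]](fanout = true))(Keep.both)
      .run()

  // Tag every element with the topic and offer it to the shared queue.
  def publish(topic: String, messages: Source[T, Any]): Future[Done] =
    messages
      .map(payload => Envelope(topic, payload))
      .mapAsync(parallelism = 1)(envelope => queue.offer(envelope))
      .runWith(Sink.ignore)

  // Keep only the elements tagged with the requested topic.
  def subscribe(topic: String): Source[T, NotUsed] =
    Source.fromPublisher(publisher).collect {
      case Envelope(`topic`, payload) => payload
    }
}
```

The trade-off compared with one channel per topic is that every subscriber receives and discards the traffic of all other topics, so a slow subscriber on one topic can affect the others through the shared buffer.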
Source: https://stackoverflow.com/questions/56903332/dynamically-merge-akka-streams