A Client Walks Into a Server And Asks “What's New?” – Problems With Sequence Numbers

谁都会走 提交于 2019-12-24 07:25:14

问题


I'm looking for a solution to an edge case scenario where a client continually asking the server for what's new will fail.

In this example, I'm not using timestamps because of another edge case problem. That's handled in this question: A Client Walks Into a Server And Asks "What's New?" – Problems With Timestamps

Assume we're using sequence numbers. There's a single sequence number that is atomically updated every time the table is changed. When any row is updated it records the current sequence. Then it's just a matter of the client asking for what's new since the last sequence it asked for. Simple? Yes, but...

Failure scenario:

Sequence starts at 1
1) Client A starts update. Updates sequence to 2
2) Client B starts update. Updates sequence to 3
3) Client B updates rows with sequence 3
4) Client C requests changes >1.  Gets B's changes. Good.
5) Client A updates rows with sequence 2. 
6) Client C requests changes >3.  Gets nothing. Doesn’t get Client A’s changes.

Because we're using a MongoDB, I don't believe we can easily lock the sequence during an update. And, if we could, I worry about performance.

Having clients ask "what's new" repeatedly seems like a common use case and I find it surprising not to find a better wealth of best practices on this.

Any ideas on solving this scenario or recommending a better, preferably platform agnostic, solution for asking for changes?


回答1:


One thing you can do is maintain a heap of sequence numbers that are in use as well as the "next sequence number to assign". Try the following:

  • When you grab a sequence number, put it in an "in-use" map.
  • When you are done making changes with that sequence number, remove it from the "in-use" std::set
  • Keep track of the minimum in the set. Whenever the minimum changes from "x" to "y", have Client C request values from x to y, but no greater than y.

So, in your example, when you update the sequence to 2, 1 is put in the in-use set. Then, when you update to 3, 2 is put in there, and the set contains 1 and 2. When the work for 2 is done, 2 is removed from the set, but client C does not pick up any changes because the minimum, 1, is unchanged. When client A is done with 1, the minumum changes from 1 to 3, and client C may read the changes from 1 to 3.

For a more complex example, suppose you have 6 clients use sequence numbers 11, 12, 13, 14, 15, and 16, yet finish in the following order: 12, 13, 11, 15, 14, 16 (which is the order they are removed from the "in-use" set. In this example, after 11 is gone, client C may read 11 through 13, because the minumum changes from 11 to 14. Then, after 14 is gone, client C may read 14 and 15, as the minimum changes from 14 to 16. Then, when 16 is gone, client C can read 16.

This is basically the algorithm we use in TokuMX replication that decides what oplog entries may be replicated to secondaries. Clients A and B would be threads doing writes to the oplog and Client C would be a tailable cursor from the secondary pulling oplog data.




回答2:


This is clarifying and providing some examples of Zardosht Kasheff's correct answer above.

  • At the start of an update we add the sequence number to an in-use list
  • Any client requesting “what’s new” can’t ask for anything >= the lowest number on the list.
  • Thus if a couple clients were updating with sequences 10, 11, 12, a client would ask for everything from > sequenceOfLastItemReceivedByClient and < 10 (the lowest number on the in use list.)

Here's my original example solved:

Sequence starts at 1
Client A starts update. Uses sequence 2.  In use table now has “2"
Client B starts update. Uses sequence 3. In use table now has “2,3"
Client B updates rows with sequence 3. In use table now has “2"
Client C wants to request >1 .  However, min in-use table is 2.  So it requests >1 and <2 which is none.
Client A updates rows with sequence 2. In use table now empty.
Client C wants to request changes >1. Gets A and B.

And Zardosht's example

Clients start updates with sequences 11, 12, 13, 14, 15, and 16. In use table is "11, 12, 13, 14, 15, 16”
Updates 12, 13, 11 finish.  In use table is “14, 15, 16”
Client C wants to request >1.  Is allowed >1 and <14.  Gets sequences 11, 12, 13
Updates 15, 14, 16 finish.  In use table is empty.
Client C requests >13. Gets 15, 14, 16.

And a final example:

Clients start updates with sequences 11, 12, 13, 14, 15, and 16. In use table is "11, 12, 13, 14, 15, 16”
Updates 12, 16, 11 finish.  In use table is “13, 14, 15”
Client C wants to request >1.  Is allowed >1 and <13.  Gets sequences 11, 12
Updates 13, 14, 15 finish.  In use table is empty.
Client C requests >12. Gets 13, 14, 15, 16

All above examples pass. This seems to be a workable solution.



来源:https://stackoverflow.com/questions/21586779/a-client-walks-into-a-server-and-asks-whats-new-problems-with-sequence-num

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!