How does Google Docs deal with editing collisions?

前端 未结 1 1203
予麋鹿
予麋鹿 2021-01-30 14:08

I\'ve been toying around with writing my own Javascript editor, with functionality similar to Google Docs (allowing multiple people to work on it at the same time). One thing I

1条回答
  •  星月不相逢
    2021-01-30 14:42

    There are different possibilities for realizing concurrent changing of replicas, depending on the scenario's topology and with different trade-offs.

    Using a central server

    The most common scenario is a central server that all clients have to communicate with.

    The server could keep track of how the document of each participant looks. Both A and B then send a diff with their changes to the server. The server would then apply the changes to the respective tracking documents. Then it would perform a three-way-merge and apply the changes to the master document. It would then send the diff between the master document and the tracking documents to the respective clients. This is called differential synchronization.

    A different approach is called operation(al) transformation, which is similar to rebasing in traditional version control systems. It doesn't require a central server, but having one makes things much easier if you have more than 2 participants (see the OT FAQ). The gist is that you transform the changes in one edit so that the edit assumes that the changes of another edit already happened. E.g. A would transform B's edit insert(3, hello) against its edit insert(0, abc) with the result insert(6, hello).

    The difference between rebasing and OT is that rebasing doesn't guarantee consistency if you apply edits in different orders (e.g. if B were to rebase A's edit against theirs the other way around, this can lead to diverging document states). The promise of OT on the other hand is to allow any order if you do the right transformations.

    No central server

    OT algorithms exist that can deal with peer-to-peer scenarios (with the trade-off of increased implementation complexity on the control layer and increased memory usage). Instead of a simple timestamp, a Version vector can be used to keep track of the state an edit is based on. Then (depending on the capability of your OT algorithm, specifically transform property 2), incoming edits can be transformed to fit in the order they are received, or the version vector can be used to impose a partial order on the edits -- in this case history needs to be "rewritten", by undoing and transforming edits, so that they adhere to the order imposed by the version vectors.

    Lastly, there are a group of algorithms based on CRDT, called WOOT, Treedoc or Logoot, which try to solve the problem with specially designed data types that allow operations to commute, so the order in which they are applied doesn't matter (this is similar to your idea of an ID for each character). The trade-offs here are memory consumption and overhead in operation construction.

    0 讨论(0)
提交回复
热议问题