What is the most clever and easy approach to sync data between multiple entities?

甜味超标 2021-01-29 18:21

In today’s world where a lot of computers, mobile devices or web services share data or act like hubs, syncing gets more important. As we all know solutions that sync aren’t the

3 answers
  •  温柔的废话
    2021-01-29 18:46

    Where I work we have developed an "offline" version of our main (web) application for users to be able to work on their laptops in locations where they do not have internet access (I'm not sure how many of these places actually exist these days, but I've been told they do ;)). When the user comes back to the main site they need to synchronise the data they entered offline with our main application.

    So, to answer your questions:

    • What is the most recent data? How do I want to represent it?

    We have a LAST_UPDATED_DATE column on every table. The server keeps track of when each synchronisation takes place, so when the offline application requests a synchronisation the server says "hey, only give me data changed since this date".
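
    As a rough illustration of the "changed since this date" idea, here is a minimal Python sketch. The in-memory `records` list, the field names and the `changed_since` helper are all made up for the example; in a real application this would more likely be a `WHERE LAST_UPDATED_DATE > ?` query against each table.

```python
from datetime import datetime

# Hypothetical in-memory stand-in for a table with a LAST_UPDATED_DATE column.
records = [
    {"id": 1, "name": "alpha", "last_updated_date": datetime(2021, 1, 10, 9, 0)},
    {"id": 2, "name": "beta", "last_updated_date": datetime(2021, 1, 20, 14, 30)},
    {"id": 3, "name": "gamma", "last_updated_date": datetime(2021, 1, 25, 8, 15)},
]


def changed_since(rows, last_sync):
    """Return only the rows modified strictly after the last successful sync."""
    return [row for row in rows if row["last_updated_date"] > last_sync]


# The date of the last successful synchronisation, as tracked by the server.
last_synchronisation_date = datetime(2021, 1, 15)

to_send = changed_since(records, last_synchronisation_date)
print([row["id"] for row in to_send])  # prints [2, 3]
```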

    • What do I do in case of a conflict? Merge? Do I prompt and ask the user what to do?

    In our case the offline application is only able to update a relatively small subset of all the data. As each record is synchronised we check whether it belongs to that subset, and if so we compare the record's LAST_UPDATED_DATE online and offline. If the dates differ, we also compare the values (because it's not a conflict if both sides were updated to the same value). If there is a conflict we record the difference, set a flag to say there is at least one conflict, and carry on checking the rest of the details. Once the process has finished, if the "isConflict" flag is set, the user can go to a special page which displays the differences and decide which data is the "correct" version. That version is then saved on the host and the "isConflict" flag is reset.
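
    The check described above could be sketched roughly as follows. This is not our actual code: the `find_conflicts` helper, the dictionary layout and the sample fields (`phone`, `notes`) are assumptions made up for the example.

```python
from datetime import datetime


def find_conflicts(online, offline, syncable_fields):
    """Compare the online and offline copies of one record.

    Returns a list of (field, online_value, offline_value) differences;
    an empty list means there is no conflict.
    """
    # Same timestamp on both sides: treat the record as unchanged.
    if online["last_updated_date"] == offline["last_updated_date"]:
        return []
    # Different timestamps: only fields whose values differ are conflicts.
    return [
        (field, online[field], offline[field])
        for field in syncable_fields
        if online[field] != offline[field]
    ]


# Hypothetical sample data keyed by record id.
online_rows = {
    1: {"phone": "555-0100", "notes": "call back", "last_updated_date": datetime(2021, 1, 20)},
    2: {"phone": "555-0200", "notes": "none", "last_updated_date": datetime(2021, 1, 21)},
}
offline_rows = {
    1: {"phone": "555-0199", "notes": "call back", "last_updated_date": datetime(2021, 1, 22)},
    2: {"phone": "555-0200", "notes": "none", "last_updated_date": datetime(2021, 1, 23)},
}

differences = {}
is_conflict = False
for record_id, offline in offline_rows.items():
    diffs = find_conflicts(online_rows[record_id], offline, ["phone", "notes"])
    if diffs:
        differences[record_id] = diffs
        is_conflict = True  # remember it, but keep checking the other records

print(is_conflict, differences)
```

    Record 2 has different dates but identical values, so only record 1 ends up in `differences` for the user to resolve.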

    • What do I have to do when I don’t want to get into an inconsistent state?
    • How do I resume a current sync that got interrupted?

    Well, we try to avoid getting into an inconsistent state in the first place. If a synchronisation is interrupted for any reason then the last_synchronisation_date is not updated, so the next synchronisation starts from the same date as the previous (interrupted) one did.
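
    A minimal sketch of that "only move the date forward on success" rule, assuming a hypothetical `SyncState` object and a `transfer` callable that raises `ConnectionError` when the link drops. Note that replaying from the old date is only safe if applying the same record twice is harmless, which is an assumption of this sketch.

```python
from datetime import datetime


class SyncState:
    """Holds the date of the last *successful* synchronisation."""

    def __init__(self, last_synchronisation_date):
        self.last_synchronisation_date = last_synchronisation_date


def run_sync(state, transfer, now):
    """Run one sync; advance the checkpoint only if the transfer succeeds."""
    try:
        transfer(state.last_synchronisation_date)
    except ConnectionError:
        # Checkpoint untouched: the next run starts from the same date.
        return False
    state.last_synchronisation_date = now
    return True


def flaky_transfer(since):
    raise ConnectionError("link dropped")  # simulated interruption


def working_transfer(since):
    pass  # pretend everything changed since `since` was exchanged


state = SyncState(datetime(2021, 1, 15))
first = run_sync(state, flaky_transfer, now=datetime(2021, 1, 29, 18, 0))
after_failure = state.last_synchronisation_date
second = run_sync(state, working_transfer, now=datetime(2021, 1, 29, 19, 0))
```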

    • How do I handle data storage (e.g. MySQL database on a web service, Core Data on an iPhone; and how do I merge/sync the data without a lot of glue code)?

    We use standard databases on both applications, and Java objects in between. The objects are serialised to XML (and gzipped to speed up the transfer) for the actual synchronisation process, then decompressed/deserialised at each end.
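
    Our implementation is in Java, but the serialise-to-XML-then-gzip round trip can be illustrated with a short Python sketch using only the standard library. The element names (`records`, `record`) and the `serialise`/`deserialise` helpers are invented for the example, and it assumes every field name is a valid XML tag name.

```python
import gzip
import xml.etree.ElementTree as ET


def serialise(rows):
    """Turn a list of flat dicts into gzipped XML bytes."""
    root = ET.Element("records")
    for row in rows:
        record = ET.SubElement(root, "record")
        for field, value in row.items():
            ET.SubElement(record, field).text = str(value)
    return gzip.compress(ET.tostring(root, encoding="utf-8"))


def deserialise(payload):
    """Inverse of serialise(); every value comes back as a string."""
    root = ET.fromstring(gzip.decompress(payload))
    return [{child.tag: child.text for child in record} for record in root]


rows = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
payload = serialise(rows)
restored = deserialise(payload)
print(restored)
```

    Type information is lost in this naive version (the integer ids come back as strings), so a real format would need to carry types or rely on a schema at each end.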

    • How should I handle edits from the user that happen during the sync (which runs in the background, so the UI isn’t blocked)?

    These edits would take place after the synchronisation start date, and so would not be picked up on the other side until the next synchronisation.
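
    One way to picture this is as a time window: each sync only sends rows changed after the previous sync and up to the moment the current sync started. The sketch below is an assumption about how that could look, with a made-up `select_window` helper and sample rows.

```python
from datetime import datetime


def select_window(rows, last_sync, sync_start):
    """Rows changed after the previous sync, up to the start of this one."""
    return [
        row for row in rows
        if last_sync < row["last_updated_date"] <= sync_start
    ]


rows = [
    {"id": 1, "last_updated_date": datetime(2021, 1, 29, 17, 50)},
    # Edited while the 18:00 sync was already running in the background.
    {"id": 2, "last_updated_date": datetime(2021, 1, 29, 18, 5)},
]

first_round = select_window(
    rows, datetime(2021, 1, 29, 12, 0), datetime(2021, 1, 29, 18, 0)
)
second_round = select_window(
    rows, datetime(2021, 1, 29, 18, 0), datetime(2021, 1, 29, 19, 0)
)
print([r["id"] for r in first_round], [r["id"] for r in second_round])  # prints [1] [2]
```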

    • How and in which direction do I propagate changes (e.g. a user creates a „Foo“ entry on his computer and doesn’t sync; then he’s on the go and creates another „Foo“ entry; what happens when he tries to sync both devices)? Will the user have two „Foo“ entries with different unique IDs? Will the user have only one entry, but which one?

    It's up to you to decide how to handle this particular Foo. It depends on what the primary key of Foo is and how you determine whether one Foo is the same as another.
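
    To make the two outcomes concrete, here is a small sketch. The `merge_by_key` helper and the "first entry wins" rule are arbitrary choices for illustration, not a recommendation: keying on a device-generated UUID keeps both Foos, while keying on the name collapses them into one.

```python
import uuid


def merge_by_key(device_a, device_b, key):
    """Combine two devices' entries, treating equal `key` values as the same entry."""
    merged = {}
    for entry in device_a + device_b:
        merged.setdefault(entry[key], entry)  # arbitrary rule: first entry wins
    return list(merged.values())


# Each device creates its own "Foo" with a locally generated id.
laptop = [{"id": str(uuid.uuid4()), "name": "Foo"}]
phone = [{"id": str(uuid.uuid4()), "name": "Foo"}]

by_id = merge_by_key(laptop, phone, "id")      # identity = surrogate id
by_name = merge_by_key(laptop, phone, "name")  # identity = natural key

print(len(by_id), len(by_name))  # prints 2 1
```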

    • How should I handle sync when I have hierarchical data? Top-down? Bottom-up? Do I treat every entry atomically or do I only look at a supernode?

    The synchronisation is atomic, so if one record fails then the whole process is marked as incomplete, similar to a Subversion commit transaction.
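
    The all-or-nothing behaviour can be sketched with a database transaction. The example below uses Python's built-in `sqlite3` module and a made-up `item` table purely for illustration; our real setup differs, but the principle is the same: one bad record rolls back the whole batch.

```python
import sqlite3


def apply_batch(conn, rows):
    """Apply all rows in one transaction; roll back everything if any row fails."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany("INSERT INTO item (id, name) VALUES (?, ?)", rows)
    except sqlite3.IntegrityError:
        return False  # the sync is marked as incomplete
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The second row violates NOT NULL, so none of the three rows is kept.
failed = apply_batch(conn, [(1, "parent"), (2, None), (3, "child")])
count_after_failure = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]

# A clean batch is committed as a whole.
succeeded = apply_batch(conn, [(1, "parent"), (2, "child")])
count_after_success = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]

print(failed, count_after_failure, succeeded, count_after_success)
```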

    • How big is the trade-off between oversimplifying things and investing too much time into the implementation?

    I'm not sure exactly what you mean, but I'd say it all depends on your situation and the type / quantity of data you want to sync. It might take a long time to design and implement the process, but it's possible.

    Hope that helps you or at least gives you a few ideas! :)
