How to Sync iPhone Core Data with web server, and then push to other devices?


Solution 1

I suggest carefully reading and implementing the sync strategy discussed by Dan Grover at the iPhone 2009 conference, available here as a PDF document.

This is a viable solution and is not that difficult to implement (Dan implemented it in several of his applications), and it overlaps with the solution described by Chris. For an in-depth, theoretical discussion of syncing, see the paper by Russ Cox (MIT) and William Josephson (Princeton):

File Synchronization with Vector Time Pairs

which applies equally well to Core Data with some obvious modifications. It provides an overall much more robust and reliable sync strategy, but requires more effort to implement correctly.
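To give a feel for the kind of bookkeeping involved, here is a minimal sketch of a plain version-vector comparison, a simpler relative of the technique in the paper. It is not the vector time pair algorithm itself, and the function name, device IDs, and return values are all assumptions made for illustration.

```python
def compare_versions(a, b):
    """Compare two version vectors (dict: device_id -> edit counter).

    Returns 'equal', 'a_newer', 'b_newer', or 'conflict'.
    Illustrative only: a plain version vector, not a vector time pair.
    """
    devices = set(a) | set(b)
    # A missing entry means that device has never edited this copy.
    a_ahead = any(a.get(d, 0) > b.get(d, 0) for d in devices)
    b_ahead = any(b.get(d, 0) > a.get(d, 0) for d in devices)
    if a_ahead and b_ahead:
        return "conflict"   # both copies were edited independently
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"
```

Unlike a single timestamp, this comparison can tell "one copy is strictly newer" apart from "both copies changed independently", which is what makes conflict detection possible.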

EDIT:

It seems that Grover's PDF file is no longer available (broken link, March 2015). UPDATE: the link is available through the Wayback Machine here

The Objective-C framework called ZSync, developed by Marcus Zarra, has been deprecated, given that iCloud finally seems to support correct Core Data synchronization.

Solution 2

I've done something similar to what you're trying to do. Let me tell you what I've learned and how I did it.

I assume you have a one-to-one relationship between your Core Data object and the model (or db schema) on the server. You simply want to keep the server contents in sync with the clients, but clients can also modify and add data. If I got that right, then keep reading.

I added four fields to assist with synchronization:

  1. sync_status - Add this field to your Core Data model only. The app uses it to determine whether an item has a pending change. I use the following codes: 0 means no changes, 1 means it's queued to be synchronized to the server, and 2 means it's a temporary object and can be purged.
  2. is_deleted - Add this to the server and Core Data model. A delete event shouldn't actually delete a row from the database or from your client model, because that leaves you with nothing to synchronize back. With this simple boolean flag, you can set is_deleted to 1, synchronize it, and everyone will be happy. You must also modify the code on the server and client to query only non-deleted items with "is_deleted=0".
  3. last_modified - Add this to the server and Core Data model. The server should automatically update this field with the current date and time whenever anything changes on that record. It should never be modified by the client.
  4. guid - Add a globally unique id (see http://en.wikipedia.org/wiki/Globally_unique_identifier) field to the server and Core Data model. This field becomes the primary key and matters when creating new records on the client. Normally your primary key is an incrementing integer on the server, but keep in mind that content could be created offline and synchronized later. The GUID allows us to create a key while offline.

On the client, add code to set sync_status to 1 on your model object whenever something changes and needs to be synchronized to the server. New model objects must generate a GUID.
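As a rough, language-neutral illustration of the four fields and the client-side rules above (this is plain Python, not Core Data code), a record could look like the sketch below. The `Record` class, its `title` payload field, and the method names are assumptions for the example; only the four sync fields and the status codes come from the description.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# sync_status codes from the description above
SYNCED, PENDING, TEMPORARY = 0, 1, 2


@dataclass
class Record:
    title: str                             # hypothetical payload field
    # The GUID is generated on the client so records can be created offline.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    sync_status: int = PENDING             # client-only; new records start queued
    is_deleted: bool = False               # soft-delete flag, synced like any field
    last_modified: Optional[int] = None    # assigned by the server, never by the client

    def update(self, title: str) -> None:
        self.title = title
        self.sync_status = PENDING         # any local change queues the record

    def soft_delete(self) -> None:
        self.is_deleted = True             # keep the row so the delete can be synced
        self.sync_status = PENDING
```

Note that neither `update` nor `soft_delete` touches `last_modified`; that value only ever arrives from the server.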

Synchronization is a single request. The request contains:

  • The MAX last_modified time stamp of your model objects. This tells the server you only want changes after this time stamp.
  • A JSON array containing all items with sync_status=1.
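A minimal sketch of how such a request could be assembled, assuming local rows are plain dicts and `last_modified` is an integer timestamp handed out by the server. The function name and the JSON keys `since` and `changes` are made up for this example.

```python
import json


def build_sync_request(records):
    """Build the single sync request from a list of local rows (dicts)."""
    # Newest server timestamp we have seen; None if we have never synced.
    stamps = [r["last_modified"] for r in records if r["last_modified"] is not None]
    # Every row queued for upload; sync_status is client-only, so it is dropped.
    pending = [
        {key: value for key, value in r.items() if key != "sync_status"}
        for r in records
        if r["sync_status"] == 1
    ]
    return json.dumps({"since": max(stamps) if stamps else None, "changes": pending})
```

Rows created offline have no `last_modified` yet, which is why they are skipped when computing the maximum.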

The server gets the request and does this:

  • It takes the contents from the JSON array and modifies or adds the records it contains. The last_modified field is automatically updated.
  • The server returns a JSON array containing all objects with a last_modified time stamp greater than the time stamp sent in the request. This will include the objects it just received, which serves as an acknowledgment that the record was successfully synchronized to the server.
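The server side of these two steps might be sketched as follows. This is an in-memory stand-in, not a real backend: the `SyncServer` class is hypothetical, and a simple counter replaces the server's date and time so the example stays deterministic.

```python
import itertools


class SyncServer:
    """In-memory stand-in for the server-side table (illustrative only)."""

    def __init__(self):
        self.rows = {}                     # guid -> row dict
        self._clock = itertools.count(1)   # stand-in for the server's clock

    def handle_sync(self, since, changes):
        # Step 1: insert or overwrite each uploaded row. last_modified is
        # always assigned here, whatever value the client may have sent.
        for change in changes:
            row = dict(change)
            row["last_modified"] = next(self._clock)
            self.rows[row["guid"]] = row
        # Step 2: return everything newer than the client's timestamp. This
        # includes the rows just written, which acts as the acknowledgment.
        return [
            dict(row)
            for row in self.rows.values()
            if since is None or row["last_modified"] > since
        ]
```

Overwriting by `guid` means the last writer wins; there is no conflict resolution in this scheme.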

The app receives the response and does this:

  • It takes the contents from the JSON array and modifies or adds the records it contains. Each record is given a sync_status of 0.
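On the client, applying the response could look like this sketch, again with rows as plain dicts keyed by `guid`. The function name is an assumption, and the sketch ignores the case where a record is edited locally while the request is in flight.

```python
def apply_sync_response(local, response):
    """Merge the server's rows into the local store.

    local:    dict mapping guid -> row dict (the client's records)
    response: list of row dicts returned by the server
    """
    for row in response:
        merged = dict(row)           # copy so the response itself is left untouched
        merged["sync_status"] = 0    # server copy is authoritative; nothing pending
        local[merged["guid"]] = merged
    return local
```

Rows with `is_deleted` set are stored like any other row, so queries on the client still need to filter them out.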

I hope that helps. I used the word record and model interchangeably, but I think you get the idea. Good luck.

Solution 3

If you are still looking for a way to go, look into Couchbase Mobile. It basically does all you want. (http://www.couchbase.com/nosql-databases/couchbase-mobile)

Solution 4

Similar to @chris, I've implemented a class for synchronization between client and server and solved all known problems so far (sending/receiving data to/from the server, merging conflicts based on timestamps, removing duplicate entries in unreliable network conditions, synchronizing nested data and files, etc.).

You just tell the class which entity and which columns it should sync and where your server is.

M3Synchronization * syncEntity = [[M3Synchronization alloc] initForClass: @"Car"
                                                              andContext: context
                                                            andServerUrl: kWebsiteUrl
                                             andServerReceiverScriptName: kServerReceiverScript
                                              andServerFetcherScriptName: kServerFetcherScript
                                                    ansSyncedTableFields:@[@"licenceNumber", @"manufacturer", @"model"]
                                                    andUniqueTableFields:@[@"licenceNumber"]];


syncEntity.delegate = self; // delegate should implement onComplete and onError methods
syncEntity.additionalPostParamsDictionary = ... // add some POST params to authenticate current user

[syncEntity sync];

You can find source, working example and more instructions here: github.com/knagode/M3Synchronization.

Solution 5

I think a good solution to the GUID issue is a "distributed ID system". I'm not sure what the correct term is, but I think that's what the MS SQL Server docs used to call it (SQL Server uses/used this method for distributed/synced databases). It's pretty simple:

The server assigns all IDs. Each time a sync is done, the first thing that is checked is "How many IDs do I have left on this client?" If the client is running low, it asks the server for a new block of IDs. The client then uses IDs in that range for new records. This works great for most needs, if you can assign a block large enough that it should "never" run out before the next sync, but not so large that the server runs out over time. If the client ever does run out, the handling can be pretty simple: just tell the user "sorry, you cannot add more items until you sync". If they are adding that many items, shouldn't they sync to avoid stale data issues anyway?

I think this is superior to using random GUIDs because random GUIDs are not 100% safe, and usually need to be much longer than a standard ID (128 bits vs 32 bits). You usually have indexes by ID and often keep ID numbers in memory, so it is important to keep them small.
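A minimal sketch of the block-allocation idea described above. The class names, the block size, and the low-water threshold are arbitrary assumptions for illustration, not taken from any particular database product.

```python
from collections import deque


class IdServer:
    """Hands out non-overlapping blocks of integer IDs."""

    def __init__(self, block_size=1000):
        self.block_size = block_size
        self.next_free = 1

    def allocate_block(self):
        start = self.next_free
        self.next_free += self.block_size
        return range(start, start + self.block_size)


class IdClient:
    """Consumes IDs locally and tops up from the server during a sync."""

    def __init__(self, low_water=100):
        self.low_water = low_water
        self.available = deque()

    def sync(self, server):
        # First step of every sync: ask for a new block if running low.
        if len(self.available) < self.low_water:
            self.available.extend(server.allocate_block())

    def new_id(self):
        if not self.available:
            raise RuntimeError("out of IDs: sync before adding more items")
        return self.available.popleft()
```

Because each block is handed out exactly once, two offline clients can never produce the same ID, and the IDs stay as small as ordinary integer keys.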

Didn't really want to post as answer, but I don't know that anyone would see as a comment, and I think it's important to this topic and not included in other answers.


Author: Jason

Updated on December 15, 2020

Comments

  • Jason
    Jason over 3 years

    I have been working on a method to sync core data stored in an iPhone application between multiple devices, such as an iPad or a Mac. There are not many (if any at all) sync frameworks for use with Core Data on iOS. However, I have been thinking about the following concept:

    1. A change is made to the local core data store, and the change is saved. (a) If the device is online, it tries to send the changeset to the server, including the device ID of the device which sent the changeset. (b) If the changeset does not reach the server, or if the device is not online, the app will add the change set to a queue to send when it does come online.
    2. The server, sitting in the cloud, merges the specific change sets it receives with its master database.
    3. After a change set (or a queue of change sets) is merged on the cloud server, the server pushes all of those change sets to the other devices registered with the server using some sort of polling system. (I thought to use Apple's Push services, but apparently according to the comments this is not a workable system.)

    Is there anything fancy that I need to be thinking about? I have looked at REST frameworks such as ObjectiveResource, Core Resource, and RestfulCoreData. Of course, these are all working with Ruby on Rails, which I am not tied to, but it's a place to start. The main requirements I have for my solution are:

    1. Any changes should be sent in the background without pausing the main thread.
    2. It should use as little bandwidth as possible.

    I have thought about a number of the challenges:

    1. Making sure that the object IDs for the different data stores on different devices are attached on the server. That is to say, I will have a table of object IDs and device IDs, which are tied via a reference to the object stored in the database. I will have a record (DatabaseId [unique to this table], ObjectId [unique to the item in the whole database], Datafield1, Datafield2), the ObjectId field will reference another table, AllObjects: (ObjectId, DeviceId, DeviceObjectId). Then, when the device pushes up a change set, it will pass along the device Id and the objectId from the core data object in the local data store. Then my cloud server will check against the objectId and device Id in the AllObjects table, and find the record to change in the initial table.
    2. All changes should be timestamped, so that they can be merged.
    3. The device will have to poll the server, without using up too much battery.
    4. The local devices will also need to update anything held in memory if/when changes are received from the server.

    Is there anything else I am missing here? What kinds of frameworks should I look at to make this possible?

    • Ole Begemann
      Ole Begemann about 13 years
      You cannot rely on Push Notifications being received. The user can simply tap them away and when a second notification arrives, the OS throws the first one away. IMO push notifications are a bad way to receive sync updates, anyway, because they interrupt the user. The app should initiate the sync whenever it is launched.
    • Jason
      Jason about 13 years
      OK. Thanks for the information - outside of constantly polling the server and checking for updates on launch, is there a way for the device to get updates? I am interested in making it work if the app is open on multiple devices simultaneously.
    • Dan2552
      Dan2552 over 11 years
      (I know this is a bit late, but in case anybody comes across this and also wonders) to keep multiple devices in sync simultaneously, you could keep an open connection with either the other device or a server, and send messages to tell the other device(s) when an update occurs (e.g. the way IRC / instant messaging works).
    • johndodo
      johndodo almost 11 years
      @Dan2552: what you describe is known as long polling (en.wikipedia.org/wiki/…) and is a great idea; however, open connections consume quite a lot of battery and bandwidth on a mobile device.
    • Frederic Yesid Peña Sánchez
      Frederic Yesid Peña Sánchez over 10 years
      Does the iOS push api allow to trigger app actions without being necessarily a notification for the user???
    • JRG-Developer
      JRG-Developer over 10 years
      Here's a good tutorial from Ray Wenderlich on how to sync data between your app and web service: raywenderlich.com/15916/…
  • chris
    chris about 13 years
    The last_modified field also exist in the local database, but it's not updated by the iPhone clock. It is set by the server, and synchronized back. The MAX(last_modified) date is what the app sends to the server to tell it to send back everything modified after that date.
  • Cragly
    Cragly almost 13 years
    When using GUID's as your PK did you not run into performance problems associated with using GUID's and PK's? I am looking to do the same thing as you have outlined but the GUID performance issue is putting me off slightly.
  • chris
    chris almost 13 years
    I haven't run into problems with performance (yet). The GUID fields are unique and indexed, so it should be fast.
  • Loyal Tingley
    Loyal Tingley over 12 years
    Is there a reason to do MAX(last_modified) instead of keeping a global value on the client-side? It seems like it is made redundant by sync_status.
  • chris
    chris over 12 years
    A global value on the client could replace MAX(last_modified), but that would be redundant since MAX(last_modified) suffices. The sync_status has another role. As I wrote earlier, MAX(last_modified) determines what needs to be sync'd from the server, while sync_status determines what needs to be sync'd to the server.
  • Jeremie Weldin
    Jeremie Weldin over 12 years
    Anyone have an updated link for the ZSync video? Also, is ZSync still maintained? I see it was last updated in 2010.
  • Jeremie Weldin
    Jeremie Weldin over 12 years
    This only does what you want if you can express your data as documents rather than relational data. There are work arounds, but they are not always pretty or worth it.
  • Perishable Dave
    Perishable Dave about 12 years
    ZSync's last commit on github was on September 2010 which leads me to believe Marcus stopped supporting it.
  • Greg Smalter
    Greg Smalter almost 12 years
    @chris I've posted a follow-up question based on your advice here: stackoverflow.com/q/10415289/34290. In essence, when you say "On the client, add code to set sync_status to 1 on your model object whenever something changes," how do you easily do that?
  • Lorenzo B
    Lorenzo B over 11 years
    @chris I really like your approach. I've already upvoted your question and I would put another +1 on it. A simple question. Are your additional fields (sync_status, last_updated, is_deleted, guid) replicated for each entity in the model? Obviously the only ones that need to sync with a service. Thanks.
  • chris
    chris over 11 years
    @Flex_Addicted Thanks. Yes, you would need to replicate the fields for each entity that you wish to synchronize. However, you need to take greater care when synchronizing a model with a relationship (e.g., 1-to-many).
  • Andrew Barber
    Andrew Barber over 11 years
    Welcome to Stack Overflow! Thanks for posting your answer! Please be sure to read the FAQ on Self-Promotion carefully.
  • Panagiotis Panagi
    Panagiotis Panagi over 11 years
    @chris BTW I have read a bunch of articles on the data sync topic (blog and journals) and your post is the only one that is easy to grasp and makes sense. Still having the above difficulty though.
  • Panagiotis Panagi
    Panagiotis Panagi over 11 years
    @chris it does make sense. Simple and elegant, thanks!
  • chris
    chris about 11 years
    @BenPackard - You are correct. The approach doesn't do any conflict resolution so the last client will win. I haven't had to deal with this in my apps since records are edited by a single user. I'd be curious to know how you resolve this.
  • user798719
    user798719 over 10 years
    I don't understand why the GUID has to be generated on the client. Could you explain what the problem is with waiting and letting the server generate the GUID?
  • omni
    omni about 10 years
    The algorithm described by Dan Grover is quite good. However, it will not work with multi-threaded server code (thus it won't scale at all), since there is no way to make sure a client won't miss an update when the time is used to check for new updates. Please correct me if I'm wrong; I would kill to see a working implementation of this.
  • Massimo Cafaro
    Massimo Cafaro about 10 years
    @masi, it was not meant to work that way indeed. Making the implementation thread-safe would require mutex locks that would prevent scalability.
  • chris
    chris about 10 years
    Hi @Colin You could return just the guid, but why not return the entire object along with all the other items? It's less code that way and doesn't require any special handling. Your second question is valid, to which I don't have an answer. See here for a similar problem: stackoverflow.com/questions/21590273/…
  • MANN
    MANN almost 10 years
    4. If the server has a higher C-Seq than the client, it sends its copy of the DS. 4.a. The client shows the user which fields have changed. 4.b. The user accepts or updates their version. If updated, increase C-Seq and increase sync-status.
  • David Poxon
    David Poxon over 9 years
    NSManagedObjectContext contains accessor methods for all objects that have been inserted, updated or deleted since you last saved the context. Why not use this instead of sync_status?
  • chris
    chris over 9 years
    Hi @noilly, consider the following case: You make changes to a local object and need to synchronize it back to the server. The sync may only happen hours or days later (say if you've been offline for a while), and in that time the app may have been shutdown and restarted a few times. In this case the methods on NSManagedObjectContext wouldn't help much.
  • David Poxon
    David Poxon over 9 years
    @chris That makes a lot of sense - I hadn't considered that the user may not be online -_-. Thanks for the quick response! :)
  • chris
    chris over 9 years
    @noilly No problem. I wrote this originally for a travel blogging app, and it was quite common for users to be offline for multiple days in remote locations. I wonder if sync_status is the right choice of variable name, but it works. :)
  • Hai Feng Kao
    Hai Feng Kao over 9 years
    documents are enough for small applications
  • Charles0429
    Charles0429 about 9 years
    Hi, I cannot access the PDF file that Dan Grover presented at the iPhone 2009 conference. Could you please send me a copy? Thanks in advance
  • Massimo Cafaro
    Massimo Cafaro about 9 years
    Hi, I have just sent you the pdf file, as requested. Cheers, Massimo Cafaro
  • Charles0429
    Charles0429 about 9 years
    Thanks! The algorithm is quite good. But I think it requires an accurate timestamp (the same as the server's) on each client, since each client needs to store the modification time. Am I right?
  • Mick
    Mick about 9 years
    @cidered The video link in your comment above (Vimeo Video) is broken
  • Mick
    Mick about 9 years
    @MassimoCafaro The link is broken in your answer (link for Dan Grover at iPhone 2009 conference)
  • Mick
    Mick about 9 years
    @radiospiel Your link is broken
  • Charles0429
    Charles0429 about 9 years
    @Massimo Cafaro, in the INTERSECTION REVISITED step (page 57 in the PDF file), the timestamps of the server and the client are both used to check whether there is a conflict. If the timestamps of the server and the client are not consistent, I think there may be an error in the code. Please correct me where I am wrong. Thanks!
  • Massimo Cafaro
    Massimo Cafaro about 9 years
    @Patt, I have just sent you the pdf file, as requested. Cheers, Massimo Cafaro.
  • Charles0429
    Charles0429 about 9 years
    Thanks for your reply! If the timestamp difference between the client and the server is huge (for example, a month), what will happen? Will this algorithm still work correctly?
  • Matthew Kairys
    Matthew Kairys about 9 years
    The missing Cross-Platform Data Synchronization PDF slides by Dan Grover are accessible through the Wayback Machine.
  • chris
    chris almost 9 years
    @hardik.shah I would consider using an GUID in the event you have more than one client in the future (even multiple devices with the same user). It'll certainly be more future-proof that way.
  • thesummersign
    thesummersign almost 9 years
    This will also add a dependency that the backend need to be written in Couchbase DB. Even I started with the idea of NOSQL for synching but I cannot restrict my backend to be NOSQL as we have MS SQL running in backend.
  • radiospiel
    radiospiel almost 9 years
    @Mick: it seems to work again (or someone fixed the link? Thank you)
  • radiospiel
    radiospiel almost 9 years
    @thesummersign that might be the case. A better architecture would probably not use CouchDB as the principal storage, but, instead, as a communication vehicle - pushing documents there as the need arises.
  • Parth Pandya
    Parth Pandya over 8 years
    How can this approach work if I have to upload product records (each having 4 images) to the server?
  • Golden
    Golden about 8 years
    Will it be ok if we change the device time to an abnormal value?
  • Harry Wang
    Harry Wang almost 8 years
    Do you have any advice on how to sync objects with relationships? Specifically, do you have the server/API send objects along with their relationships/associations as one nested json or two separate ones?
  • fujianjin6471
    fujianjin6471 over 7 years
    @chris It seems not convenient to get Max(last_modified) from all kinds of entities in Core Data, or maybe I misunderstand the concept of last_modified (every entity has a last_modified)?
  • leshow
    leshow almost 6 years
    @chris How do you prioritize the data sync? Always sync down from server first, then sync up to server after? or vice-versa? or async?
  • Fattie
    Fattie about 4 years
    this is like saying "just use Firebase or Realm"