How to combine websockets and http to create a REST API that keeps data up to date?

Solution 1

Idea B is for me the best, because the client specifically subscribes for changes in a resource, and gets the incremental updates from that moment.

Do we even need to use REST or is ws enough for all data?

Please check: WebSocket/REST: Client connections?

Solution 2

I don't know Java, but I worked with both Ruby and C on these designs...

Funny enough, I think the easiest solution is to use JSON, where the REST API simply adds the method data (i.e. method: "POST") to the JSON and forwards the request to the same handler the Websocket uses.

The underlying API's response (the response from the API handling JSON requests) can be translated to any format you need, such as HTML rendering... though I would consider simply returning JSON for most use cases.

This helps encapsulate the code and keep it DRY while accessing the same API using both REST and Websockets.

As you might infer, this design makes testing easier, since the underlying API that handles the JSON can be tested locally without the need to emulate a server.
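As a rough illustration of the idea above, here is a minimal sketch in JavaScript (not the Ruby or C I actually used). The names handleRequest, fromHttp and fromWebSocket, the in-memory users map and the status codes are all assumptions made up for this example; a real service would plug in its own storage and server framework.

```javascript
// Hypothetical in-memory store; a real service would call a database here.
const users = new Map();
let nextId = 1;

// Transport-agnostic core: takes a plain request object, returns a plain response object.
function handleRequest({ method, resource, data }) {
  if (resource !== 'users') {
    return { status: 404, body: { error: 'unknown resource' } };
  }
  switch (method) {
    case 'GET':
      return { status: 200, body: [...users.values()] };
    case 'POST': {
      // The server-assigned id always wins over any id sent by the client.
      const user = { ...data, id: nextId++ };
      users.set(user.id, user);
      return { status: 201, body: user };
    }
    default:
      return { status: 405, body: { error: 'method not allowed' } };
  }
}

// REST adapter: adds the HTTP method to the JSON payload and forwards it.
function fromHttp(httpMethod, path, jsonBody) {
  const resource = path.replace(/^\//, '');
  return handleRequest({ method: httpMethod, resource, data: jsonBody });
}

// WebSocket adapter: the message already carries the method inside its JSON.
function fromWebSocket(rawMessage) {
  return handleRequest(JSON.parse(rawMessage));
}
```

Because handleRequest only deals with plain objects, it can be called directly from a test without starting an HTTP or WebSocket server.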

Good Luck!

P.S. (Pub/Sub)

As for the Pub/Sub, I find it best to have a "hook" for any update API calls (a callback) and a separate Pub/Sub module that handles these things.

I also find it more resource friendly to write the whole data to the Pub/Sub service (option B) instead of just a reference number (option C) or an "update available" message (options A and D).
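A minimal sketch of that hook plus a separate Pub/Sub module might look like the following. The PubSub class, the 'users' channel name and saveUser are hypothetical names for this example, and the in-memory implementation only stands in for whatever Pub/Sub service you actually use.

```javascript
// Minimal in-memory Pub/Sub module (hypothetical; stands in for a real broker).
class PubSub {
  constructor() {
    this.channels = new Map(); // channel name -> Set of callbacks
  }

  subscribe(channel, callback) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set());
    this.channels.get(channel).add(callback);
    // Return an unsubscribe function so callers can clean up on disconnect.
    return () => this.channels.get(channel).delete(callback);
  }

  publish(channel, message) {
    for (const callback of this.channels.get(channel) ?? []) callback(message);
  }
}

const pubsub = new PubSub();
const userStore = new Map();

// Update API with a hook: the callback runs after every successful write.
function saveUser(user, onUpdated = (saved) => pubsub.publish('users', saved)) {
  userStore.set(user.id, user);
  onUpdated(user); // publishes the whole record (option B), not just an id
  return user;
}
```

A WebSocket connection handler would then call pubsub.subscribe('users', ...) when a client subscribes and call the returned function when the socket closes.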

In general, I also believe that sending the whole user list isn't effective for larger systems. Unless you have 10-15 users, the database call might be a bust. Consider the Amazon admin calling for a list of all users... Brrr....

Instead, I would consider dividing this into pages, say 10-50 users a page. These tables can be filled using multiple requests (Websocket / REST, doesn't matter) and easily updated using live Pub/Sub messages or reloaded if a connection was lost and reestablished.
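To make the paging idea concrete, here is a small sketch. getPage, UserTable and the default page size of 25 are assumptions for illustration only; the point is that live updates only need to touch pages the client has already loaded, and that a reconnect simply clears the cache.

```javascript
// Server side: return one page of an already sorted list.
function getPage(allUsers, page, pageSize = 25) {
  const start = (page - 1) * pageSize;
  return {
    page,
    totalPages: Math.ceil(allUsers.length / pageSize),
    users: allUsers.slice(start, start + pageSize),
  };
}

// Client side: cache of loaded pages, keyed by page number.
class UserTable {
  constructor() {
    this.pages = new Map();
  }

  loadPage(response) {
    this.pages.set(response.page, response.users);
  }

  // Apply a live Pub/Sub message carrying a changed user.
  // Returns false when the user is on a page that has not been loaded yet.
  applyUpdate(user) {
    for (const users of this.pages.values()) {
      const index = users.findIndex((u) => u.id === user.id);
      if (index !== -1) {
        users[index] = user;
        return true;
      }
    }
    return false;
  }

  // After a lost connection, drop the cache so pages are fetched again.
  reset() {
    this.pages.clear();
  }
}
```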

EDIT (REST vs. Websockets)

As for REST vs. Websockets... I find the question of need is mostly a subset of the question "who's the client?"...

However, once the logic is separated from the transport layer, then supporting both is very easy and often it makes more sense to support both.

I should note that Websockets often have a slight edge when it comes to authentication (credentials are exchanged once per connection instead of once per request). I don't know if this is a concern.

For the same reason (as well as others), Websockets usually have an edge with regards to performance... how big an edge over REST depends on the REST transport layer (HTTP/1.1, HTTP/2, etc.).

Usually these things are negligible when it comes time to offer a public API access point and I believe implementing both is probably the way to go for now.

Solution 3

To summarize your ideas:

A: Send a message to all clients when a user edits data on the server. All users then request an update of all data.
-This system may make a lot of unnecessary server calls on behalf of clients who are not using the data. I don't recommend producing all of that extra traffic as processing and sending those updates could become costly.

B: After a user pulls data from the server, they then subscribe to updates from the server which sends them information about what has changed.
-This saves a lot of server traffic, but if you ever get out of sync, you're going to be posting incorrect data to your users.

C: Users who subscribe to data updates are sent information about which data has been updated, then fetch it again themselves.
-This is the worst of A and B in that you'll have extra round trips between your users and servers just to notify them that they need to make a request for information which may be out of sync.

D: Users who subscribe to updates are notified when any changes are made and then request the last change made to the server.
-This presents all of the problems with C, but includes the possibility that, once out of sync, you may send data that will be nonsense to your users which might just crash the client side app for all we know.

I think that this option E would be best:
Every time data changes on the server, send the contents of all the data to the clients who have subscribed to it. This limits the traffic between your users and the server while also giving them the least chance of having out of sync data. They might get stale data if their connection drops, but at least you wouldn't be sending them something like Delete entry 4 when you aren't sure whether or not they got the message that entry 5 just moved into slot 4.
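Option E could be sketched roughly like this. SnapshotChannel and the version counter are my own hypothetical additions for the example; each subscriber is just a function, which in a real server would wrap something like a WebSocket send call.

```javascript
// Server-side resource that pushes a full snapshot to every subscriber on each change.
class SnapshotChannel {
  constructor(initial = []) {
    this.items = initial;
    this.version = 0;             // hypothetical counter, useful for ordering and debugging
    this.subscribers = new Set(); // each subscriber is a function(snapshot)
  }

  subscribe(send) {
    this.subscribers.add(send);
    send(this.snapshot());        // new subscribers start from the current state
    return () => this.subscribers.delete(send);
  }

  snapshot() {
    // Copy the items so later server-side changes do not alter what a client already holds.
    return { version: this.version, items: this.items.map((item) => ({ ...item })) };
  }

  change(mutate) {
    mutate(this.items);
    this.version += 1;
    const snapshot = this.snapshot();
    for (const send of this.subscribers) send(snapshot);
  }
}
```

The client never has to interpret something like "Delete entry 4": it simply replaces its local state with the latest snapshot it receives.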

Some Considerations:

  • How often does the data get updated?
  • How many users need to be updated each time an update occurs?
  • What are your transmission costs? If you have users on mobile devices with slow connections, that will affect how often and how much you can afford to send to them.
  • How much data gets updated in a given update?
  • What happens if a user sees stale data?
  • What happens if a user gets data out of sync?

Your worst case scenario would be something like this: Lots of users, with slow connections who are frequently updating large amounts of data that should never be stale and, if it gets out of sync, becomes misleading.

Solution 4

The answer depends on your use case. For the most part though, I've found that you can implement everything you need with sockets, as long as you only need to reach your server from clients that support sockets. Scale can also be an issue when you're using only sockets. Here are some examples of how you could use just sockets.

Server side:

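// Assumes `io` is a socket.io Server instance; handlers are registered per connection.
io.on('connection', (socket) => {
    socket.on('getUsers', () => {
        // Get users from db or data model (save as user_list).
        socket.emit('users', user_list);
    });
    socket.on('createUser', (user_info) => {
        // Create user in db or data model (save created user as user_data).
        // Then notify every connected client, including the sender.
        io.sockets.emit('newUser', user_data);
    });
});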

Client side:

socket.on('newUser', () => {
    // A user was created on the server; request the full list again.
    socket.emit('getUsers');
});
socket.on('users', (users) => {
    // Do something with users
});

This uses socket.io for node. I'm not sure what your exact scenario is but this would work for that case. If you need to include REST endpoints that would be fine too.

Solution 5

I personally have used Idea B in production and am very satisfied with the results. We use http://www.axonframework.org/, so every change or creation of an entity is published as an event throughout the application. These events are then used to update several read models, which are basically simple MySQL tables backing one or more queries. I added some interceptors to the event processors that update these read models so that they publish the events they just processed after the data is committed to the DB.

Publishing of events is done through STOMP over web sockets. It is made very simple if you use Spring's Web Socket support (https://docs.spring.io/spring/docs/current/spring-framework-reference/html/websocket.html). This is how I wrote it:

@Override
protected void dispatch(Object serializedEvent, String topic, Class eventClass) {
    Map<String, Object> headers = new HashMap<>();
    headers.put("eventType", eventClass.getName());
    messagingTemplate.convertAndSend("/topic" + topic, serializedEvent, headers);
}

I wrote a little configurer that uses Spring's bean factory API so that I can annotate my Axon event handlers like this:

@PublishToTopics({
    @PublishToTopic(value = "/salary-table/{agreementId}/{salaryTableId}", eventClass = SalaryTableChanged.class),
    @PublishToTopic(
            value = "/salary-table-replacement/{agreementId}/{activatedTable}/{deactivatedTable}",
            eventClass = ActiveSalaryTableReplaced.class
    )
})

Of course, that is just one way to do it. Connecting on the client side may look something like this:

var connectedClient = $.Deferred();

function initialize() {
    var basePath = ApplicationContext.cataDirectBaseUrl().replace(/^https/, 'wss');
    var accessToken = ApplicationContext.accessToken();
    var socket = new WebSocket(basePath + '/wss/query-events?access_token=' + accessToken);
    var stompClient = Stomp.over(socket);

    stompClient.connect({}, function () {
        connectedClient.resolve(stompClient);
    });
}


this.subscribe = function (topic, callBack) {
    connectedClient.then(function (stompClient) {
        stompClient.subscribe('/topic' + topic, function (frame) {
            callBack(frame.headers.eventType, JSON.parse(frame.body));
        });
    });
};

initialize();
Author by

David Berg

Updated on June 09, 2022

Comments

  • David Berg
    David Berg almost 2 years

    I am thinking about building a REST API with both websockets and http where I use websockets to tell the client that new data is available or provide the new data to the client directly.

    Here are some different ideas of how it could work:
    ws = websocket

    Idea A:

    1. David gets all users with GET /users
    2. Jacob adds a user with POST /users
    3. A ws message is sent to all clients with info that a new user exists
    4. David receives a message by ws and calls GET /users

    Idea B:

    1. David gets all users with GET /users
    2. David registers to get ws updates when a change is made to /users
    3. Jacob adds a user with POST /users
    4. The new user is sent to David by ws

    Idea C:

    1. David gets all users with GET /users
    2. David registers to get ws updates when a change is made to /users
    3. Jacob adds a user with POST /users and it gets the id 4
    4. David receives the id 4 of the new user by ws
    5. David gets the new user with GET /users/4

    Idea D:

    1. David gets all users with GET /users
    2. David registers to get ws updates when changes are made to /users.
    3. Jacob adds a user with POST /users
    4. David receives a ws message that changes were made to /users
    5. David gets only the delta by calling GET /users?lastcall='time of step one'

    Which alternative is the best and what are the pros and cons?
    Is there another, better 'Idea E'?
    Do we even need to use REST or is ws enough for all data?

    Edit
    To solve problems with data getting out of sync we could provide the header
    "If-Unmodified-Since"
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Unmodified-Since
    or "ETag"
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag
    or both with PUT requests.

  • David Berg
    David Berg over 8 years
    Thanks for your answer! I also think I prefer Idea B, but got a new thought about the post. Would it be a good idea to also use ws for saving entities?
  • vtortola
    vtortola over 8 years
    That is what I try to answer in that link I included. WS are bidirectional so technically, it is possible. However, when you need to scale your solution you will want to separate writes from reads, and if you are doing everything through the same websocket, that will not be possible.
  • David Berg
    David Berg almost 7 years
    Thanks for your response. I think your suggestion for option E is what I meant by option B. I am sorry if the example was not clear enough. As for your concern that the data will get out of sync, this can be solved by providing an "If-Unmodified-Since" header with the POST call. developer.mozilla.org/en-US/docs/Web/HTTP/Headers/…
  • Glen Pierce
    Glen Pierce almost 7 years
    How will you be certain that your server's time and your user's time are the same? If you ever start using microservices, how will you be sure your servers' times are in sync? Troubleshooting error conditions on distributed systems with logging timestamps out-of-sync is no fun.
  • David Berg
    David Berg almost 7 years
    You can store a last edited date on the resources on the server which you provide to the client on get requests, then you use this date when sending the "If-Unmodified-Since" header.
  • David Berg
    David Berg almost 7 years
    This, however, will not ensure that the servers' times are in sync, but I think you should have tools in a distributed system to make sure of that.
  • David Berg
    David Berg almost 7 years
    I read your edit. If the worst-case scenario is a lot of users with slow connections, wouldn't it be best to send as little as possible: only the delta and maybe a checksum for the entire object? That way the client can see if the data is out of sync and request a complete set only if that happens. The more I think about this, the more I tend to prefer some version of the "D" alternative, or even a full ws implementation.
  • David Berg
    David Berg almost 7 years
    I think Firebase uses websockets for its communication, at least for webpages. Another similar product I have tested is Deepstream, which is a good alternative if you would like to go for a full websocket solution. (deepstream.io)
  • David Berg
    David Berg almost 7 years
    What makes the access control of alternative A easier than C or D where a GET also is used to fetch data?
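
The delta-plus-checksum variant raised in the comments above could be sketched along these lines. The function names, the message shape ({ change, checksum }) and the use of SHA-256 are assumptions for this example only. It also assumes both sides serialize user fields in the same order, which holds when the client's copy comes from the server's JSON.

```javascript
const { createHash } = require('node:crypto');

// Checksum over a canonical form of the list (sorted by id) so both sides hash the same bytes.
function checksum(users) {
  const canonical = JSON.stringify([...users].sort((a, b) => a.id - b.id));
  return createHash('sha256').update(canonical).digest('hex');
}

// Server: describe one change and attach the checksum of the full list after the change.
function makeDelta(change, usersAfterChange) {
  return { change, checksum: checksum(usersAfterChange) };
}

// Client: apply the delta, then compare checksums to detect an out-of-sync copy.
function applyDelta(users, delta) {
  const { type, user } = delta.change;
  let next = users;
  if (type === 'add') next = [...users, user];
  else if (type === 'update') next = users.map((u) => (u.id === user.id ? user : u));
  else if (type === 'delete') next = users.filter((u) => u.id !== user.id);
  return { users: next, inSync: checksum(next) === delta.checksum };
}
```

When inSync comes back false, the client would discard its copy and fetch the complete set again, for example with GET /users.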