Proxying WebSockets with TCP load balancer without sticky sessions


Solution 1

To answer this question, we need to understand how exactly the underlying TCP connection evolves while a WebSocket connection is created. You will see that the sticky part of a WebSocket connection is the underlying TCP connection itself. I am not sure what you mean by "session" in the context of WebSockets.

At a high level, initiating a "WebSocket connection" requires the client to send an HTTP GET request that includes the Upgrade header field to an HTTP server. For this request to happen, the client must first have established a TCP connection to the HTTP server (that might be obvious, but it is important to point it out explicitly here). The subsequent HTTP server response is then sent through the same TCP connection.

Note that after the server response has been sent, the TCP connection remains open unless it is actively closed by either the client or the server.

Now, according to RFC 6455, the WebSocket standard, at the end of section 4.1:

If the server's response is validated as provided for above, it is
said that The WebSocket Connection is Established and that the
WebSocket Connection is in the OPEN state

I take this to mean that the same TCP connection that the client opened before sending the initial HTTP GET (Upgrade) request is simply left open and from then on serves as the transport layer for the full-duplex WebSocket connection. And this makes sense!
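The handshake described above can be sketched in Python. This is only an illustration: the function names `build_upgrade_request` and `accept_value` are made up for this example, and the host and path in the usage note are hypothetical. The GUID and the way the accept value is derived come from RFC 6455.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for deriving Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host, path, key):
    """Return the HTTP GET (Upgrade) request the client sends over its TCP connection."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def accept_value(key):
    """Return the Sec-WebSocket-Accept value the server must answer with for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

With the sample key from the RFC, `accept_value("dGhlIHNhbXBsZSBub25jZQ==")` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. Both the request and the `101 Switching Protocols` response travel over the one TCP connection the client opened, and every WebSocket frame afterwards uses that same connection.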

With respect to your question, this means that a load balancer only plays a role before the initial HTTP GET (Upgrade) request is made, i.e. while the one and only TCP connection involved in the WebSocket connection is being established between the two communication endpoints. Thereafter, the TCP connection stays established and cannot be "redirected" by a network device in between.

We can conclude that -- in your session terminology -- the TCP connection defines the session. As long as a WebSocket connection is alive (i.e. is not terminated), it by definition provides and lives in its own session. Nothing can change this session. In this picture, however, two independent WebSocket connections cannot share the same session.

If you referred to something else with "session", then it probably is a session that is introduced by the application layer and we cannot comment on that one.

Edit with respect to your comments:

so you're saying that the load balancer is not involved in the TCP connection

No, that is not true, at least not in general. The load balancer can definitely influence TCP connection establishment, in the sense that it decides what to do with the client's connection attempt. The specifics depend on the exact type of load balancer (see the note marked * below). Important: once the connection is established between the two endpoints -- by which I mean the WebSocket client and the WebSocket server, not the load balancer -- those two endpoints do not change for the lifetime of the WebSocket connection. The load balancer might still be in the network path (*), but it can be assumed not to interfere anymore.

Therefore the full-duplex connection is between the client and the end server?

Yes!

* There are different types of load balancing. Depending on the type, the role of the load balancer after connection establishment between the two endpoints differs. Examples:

  • If the load balancing happens on a DNS basis, then the load balancer is not involved in the final TCP connection at all. It just tells the client which host it has to connect to directly.
  • If the load balancer works like the Layer 4 ELB from AWS (docs here), then it proxies the TCP connection, so to speak. The client actually sees the ELB itself as the server. What happens, however, is that the ELB just forwards the packets in both directions, unchanged. Hence, it is still heavily involved in the TCP connection, just transparently. In this case there are actually two permanent TCP connections involved: one from you to the ELB, and one from the ELB to the server. Both persist for the lifetime of your WebSocket connection.
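
The Layer 4 case in the last bullet can be illustrated with a small TCP proxy in Python. This is a rough sketch under stated assumptions, not how the AWS ELB is actually implemented: the names `serve`, `handle_client` and `pipe` are hypothetical, and round-robin is just one possible way to pick a backend. The point it shows is that the backend is chosen once, when the client's TCP connection is accepted, and that all later bytes follow the same pair of TCP connections.

```python
import itertools
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src reaches EOF or fails."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream on
        except OSError:
            pass


def handle_client(client, backend_addr):
    """Open the second TCP connection and relay bytes in both directions."""
    try:
        backend = socket.create_connection(backend_addr)
    except OSError:
        client.close()
        return
    upstream = threading.Thread(target=pipe, args=(client, backend), daemon=True)
    upstream.start()          # client -> backend
    pipe(backend, client)     # backend -> client
    upstream.join()
    client.close()
    backend.close()


def serve(listener, backends):
    """Accept clients forever; the backend is chosen once, at accept time."""
    choices = itertools.cycle(backends)  # assumption: plain round-robin
    while True:
        try:
            client, _ = listener.accept()
        except OSError:
            return  # listener was closed
        threading.Thread(
            target=handle_client, args=(client, next(choices)), daemon=True
        ).start()
```

It could be started with something like `serve(socket.create_server(("0.0.0.0", 8080)), [("10.0.0.1", 9000), ("10.0.0.2", 9000)])`, where the addresses are placeholders. The proxy never inspects the HTTP Upgrade request or the WebSocket frames; it only relays bytes between the two sockets it paired up at accept time.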

Solution 2

WebSocket uses a persistent TCP connection, and hence requires all IP packets for that TCP connection to be forwarded to the same backend server (for the lifetime of the TCP connection).

It needs to be sticky. This is different from L7 HTTP LBs which are able to dispatch on a per HTTP-request basis.

An LB can achieve this stickiness in different ways, e.g.

  • hash the source IP/port to the set of alive backend servers
  • upon TCP connection establishment, choose a backend server and remember that
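
Both approaches can be sketched in a few lines of Python. The names `pick_by_hash` and `ConnectionTable` are made up for this illustration, and real load balancers may use different hash functions and data structures.

```python
import hashlib


def pick_by_hash(src_ip, src_port, backends):
    """Stateless approach: map the source IP/port onto the list of alive backends."""
    key = f"{src_ip}:{src_port}".encode("ascii")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


class ConnectionTable:
    """Stateful approach: choose a backend at connection setup and remember it."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._table = {}  # (src_ip, src_port) -> backend
        self._next = 0    # assumption: round-robin for new connections

    def backend_for(self, src_ip, src_port):
        conn = (src_ip, src_port)
        if conn not in self._table:
            self._table[conn] = self._backends[self._next % len(self._backends)]
            self._next += 1
        return self._table[conn]

    def forget(self, src_ip, src_port):
        """Drop the entry once the TCP connection has been closed."""
        self._table.pop((src_ip, src_port), None)
```

One trade-off worth noting: with a plain modulo hash like the one above, the mapping can change when the set of alive backends changes, whereas the table keeps an existing connection on its backend until the entry is removed.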
Author: Justin Meltzer

Updated on June 19, 2022

Comments

  • Justin Meltzer
    Justin Meltzer almost 2 years

    I want to proxy WebSocket connections to multiple node.js servers using Amazon Elastic Load Balancer. Since Amazon ELB does not provide actual WebSocket support, I would need to use its vanilla TCP messaging. However, I'm trying to understand how this would work without some sort of sticky session functionality.

    I understand that WebSockets work by first sending an HTTP Upgrade request from the client, which is handled by the server by sending a response which correctly handles key authentication. After the server sends that response and it is approved by the client, there is a bidirectional connection between that client and server.

    However let's say the client, after approving the server response, sends data to the server. If it sends the data to the load balancer, and the load balancer then relays that data to a different server that did not handle the original WebSocket Upgrade request, then how will this new server be aware of the WebSocket connection? Or will the client automatically bypass the load balancer and send data directly to the server that handled the initial upgrade?

  • Justin Meltzer
    Justin Meltzer about 11 years
    so you're saying that the load balancer is not involved in the TCP connection, and that only the end server handling the UPGRADE response is involved in the TCP connection. Therefore the full-duplex connection is between the client and the end server?
  • Justin Meltzer
    Justin Meltzer about 11 years
    And by session I simply mean maintaining state of the load balancer with respect to the identity of the client and which server it initially sent the HTTP UPGRADE request to.
  • Justin Meltzer
    Justin Meltzer about 11 years
    But in the case of ELB, where the load balancer sits between the client and the server endpoints, how does the load balancer know to send subsequent data from the same client to the same server that initially received the HTTP Upgrade request? This is my primary question, which does not seem to have been answered yet.
  • Dr. Jan-Philip Gehrcke
    Dr. Jan-Philip Gehrcke about 11 years
    Justin, it is. Once the TCP connection is established in case of the AWS Layer 4 ELB, the ELB is a transparent part of the connection. There is a static TCP connection between client and ELB, and a static connection between ELB and server. See the last bullet point in my answer.
  • Justin Meltzer
    Justin Meltzer about 11 years
    Ok, so you're saying that if the ELB receives packets from client A over TCP, it will automatically forward it to server B over TCP, provided that server B was the server that received the original HTTP Upgrade? How does the ELB know to do this if these are two separate TCP connections?
  • Dr. Jan-Philip Gehrcke
    Dr. Jan-Philip Gehrcke about 11 years
    Okay, I'll try it in other words :-) 1) The AWS Layer 4 ELB proxies incoming new TCP connections to a backend server of its choice. 2) From this moment on and for the lifetime of this TCP connection, the ELB maintains a quasi-direct connection between client and backend server. 3) "quasi-direct" means that the connection is proxied through the ELB. So while the TCP connection is alive, the ELB is fully aware of the two end points taking part (client and backend server).
  • Dr. Jan-Philip Gehrcke
    Dr. Jan-Philip Gehrcke about 11 years
    @JustinMeltzer: you might want to look into the difference between stateful and stateless protocols. Since HTTP is a stateless protocol by itself, your question makes a lot of sense -- load balancing for generally independent HTTP requests is a challenge. Keeping the correspondence between one client and one backend server in the case of HTTP requires tricks. A TCP connection, however, is stateful and permanent. Keeping the correspondence between client and backend server in the case of a TCP connection already happens by definition. If this principle is violated, it's not a TCP connection anymore.
  • Justin Meltzer
    Justin Meltzer about 11 years
    Ok got it. So how exactly is the TCP connection stateful? And how exactly does ELB know which TCP connection to forward the packets to after receiving it from the first TCP connection? Just curious as to the mechanics behind this haha.
  • Dr. Jan-Philip Gehrcke
    Dr. Jan-Philip Gehrcke about 11 years
    ad 1) "stateful" and "stateless" are concepts applicable in many situations. You can read a lot about it. Regarding protocols, you could start here: en.wikipedia.org/wiki/Stateless_protocol -- ad 2) The ELB must implement a mechanism that directly connects client<-->ELB and ELB<-->backend. For simplicity, you could think of it like a lookup table.
  • smwikipedia
    smwikipedia about 9 years
    @Jan-PhilipGehrcke Nice answer! Learned a lot! BTW I came here from: stackoverflow.com/questions/28516962/…
  • Jake Hoffner
    Jake Hoffner over 8 years
    Just to throw in a point here in case it's being lost. Once the handshake is made, the TCP connection is kept alive and the AWS ELB is not an issue. However, during the negotiation process two requests are made, and if the 2nd request is not routed to the same endpoint then the negotiation fails.
  • Dr. Jan-Philip Gehrcke
    Dr. Jan-Philip Gehrcke over 8 years
    Hey Jake. Can you add a reference for what exactly you mean by "during the negotiation process two requests are made"?
  • Vinay
    Vinay over 5 years
    @Jan-PhilipGehrcke thanks for your detailed answer, it helped a lot. One question remains for me: as you said, the ELB maintains a quasi-direct connection between the client and the server endpoint, so does this connection take up ELB file descriptors and ephemeral ports?