(How) Can I reduce socket latency?

19,614

Solution 1

You can't really do meaningful performance measurements on a proxy with the client, proxy and origin server on the same host.

Place them all on different hosts on a network. Use real hardware machines for them all, or specialised hardware test systems (e.g. Spirent).

Your methodology makes no sense. Nobody has 600us of latency to their origin server in practice anyway. Running all the tasks on the same host creates contention and a wholly unrealistic network environment.

Solution 2

40ms is the TCP ACK delay on Linux, which indicates that you are likely encountering a bad interaction between delayed acks and the Nagle algorithm. The best way to address this is to send all of your data using a single call to send() or sendmsg(), before waiting for a response. If that is not possible then certain TCP socket options including TCP_QUICKACK (on the receiving side), TCP_CORK (sending side), and TCP_NODELAY (sending side) can help, but can also hurt if used improperly. TCP_NODELAY simply disables the Nagle algorithm and is a one-time setting on the socket, whereas the other two must be set at the appropriate times during the life of the connection and can therefore be trickier to use.
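As a minimal sketch of the "single call" advice, assuming a connected stream socket and a response that lives in two separate buffers (a header and a body), sendmsg() with a two-element iovec hands both buffers to the kernel at once, so no small second write is left waiting behind the first. The helper name send_header_and_body is hypothetical and not from the original post.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: queue header and body with ONE sendmsg() call
 * instead of two separate send() calls.
 * Returns the number of bytes accepted by the kernel, or -1 on error
 * (errno is set by sendmsg). A short count is possible on a real
 * socket; production code should loop over the remainder. */
ssize_t send_header_and_body(int fd,
                             const void *hdr, size_t hdr_len,
                             const void *body, size_t body_len)
{
    assert(fd >= 0);

    /* sendmsg() only reads from these buffers; the casts just drop const. */
    struct iovec iov[2] = {
        { .iov_base = (void *)hdr,  .iov_len = hdr_len  },
        { .iov_base = (void *)body, .iov_len = body_len },
    };

    struct msghdr msg = {0};
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;

    ssize_t n;
    do {
        n = sendmsg(fd, &msg, 0);
    } while (n < 0 && errno == EINTR);  /* retry if interrupted by a signal */

    return n;
}
```

Copying both pieces into one buffer and calling send() once has the same effect; the iovec form just avoids the extra copy.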

Solution 3

INTRODUCTION:

I already praised mark4o for the truly correct answer to the general question of lowering latency. I would like to translate the answer in terms of how it helped solve my latency issue because I think it's going to be the answer most people come here looking for.

ANSWER:

In a real-time network app (such as a multiplayer game) where getting short messages between nodes as quickly as possible is critical, TURN NAGLE OFF. In most cases this means setting the "no-delay" flag to true.

DISCLAIMER:

While this may not solve the OP's specific problem, most people who come here will probably be looking for this answer to the general question of their latency issues.

ANECDOTAL BACK-STORY:

My game was doing fine until I added code to send two messages separately, very close to each other in execution time. Suddenly, I was getting 250ms of extra latency. As this was part of a larger code change, I spent two days trying to figure out what my problem was. When I combined the two messages into one, the problem went away. Logic led me to mark4o's post, so I set the .NET socket member "NoDelay" to true, and now I can send as many messages in a row as I want.

Solution 4

From, e.g., the Red Hat documentation:

Applications that require lower latency on every packet sent should be run on sockets with TCP_NODELAY enabled. It can be enabled through the setsockopt command with the sockets API:

int one = 1;
setsockopt(descriptor, SOL_TCP, TCP_NODELAY, &one, sizeof(one));

For this to be used effectively, applications must avoid doing small, logically related buffer writes. Because TCP_NODELAY is enabled, these small writes will make TCP send these multiple buffers as individual packets, which can result in poor overall performance.
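A slightly fuller sketch of the snippet above, with the headers it needs and a way to read the option back. The helper names set_tcp_nodelay and get_tcp_nodelay are hypothetical. IPPROTO_TCP is used as the level here because it is the portable spelling; on Linux, SOL_TCP has the same value.

```c
#include <assert.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_NODELAY */
#include <sys/socket.h>

/* Hypothetical helper: on = 1 disables Nagle's algorithm,
 * on = 0 re-enables it. Returns 0 on success, -1 on error. */
int set_tcp_nodelay(int fd, int on)
{
    assert(fd >= 0);
    int flag = on ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}

/* Hypothetical helper: returns 1 if TCP_NODELAY is set,
 * 0 if it is not, and -1 if getsockopt() fails. */
int get_tcp_nodelay(int fd)
{
    int flag = 0;
    socklen_t len = sizeof(flag);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) < 0)
        return -1;
    return flag != 0;
}
```

Checking the return value matters in practice: if setsockopt() fails silently, the socket keeps Nagle enabled and the latency problem appears unchanged.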

Solution 5

In your case, that 40ms is probably just a scheduler time quantum. In other words, that's how long it takes your system to get back round to the other tasks. Try it on a real network and you'll get a completely different picture. If you have a multi-core machine, using virtual OS instances in VirtualBox or some other VM would give you a much better idea of what is really going to happen.

Author by

Pepper

Updated on August 17, 2022

Comments

  • Pepper
    Pepper over 1 year

    I have written an HTTP proxy that does some stuff that's not relevant here, but it is increasing the client's time-to-serve by a huge amount (600us without proxy vs 60000us with it). I think I have found where the bulk of that time is coming from - between my proxy finishing sending back to the client and the client finishing receiving it. For now, server, proxy and client are running on the same host, using localhost as the addresses.

    Once the proxy has finished sending (once it has returned from send() at least), I print the result of gettimeofday which gives an absolute time. When my client has received, it prints the result of gettimeofday. Since they're both on the same host, this should be accurate. All send() calls are with no flags, so they are blocking. The difference between the two is about 40000us.

    The proxy's socket on which it listens for client connections is set up with the hints AF_UNSPEC, SOCK_STREAM and AI_PASSIVE. Presumably a socket from accept()ing on that will have the same parameters?

    If I'm understanding all this correctly, Apache manages to do everything in 600us (including the equivalent of whatever is causing this 40000us delay). Can anybody suggest what might be causing this? I have tried setting the TCP_NODELAY option (I know I shouldn't, it's just to see if it made a difference) and the delay between finishing sending and finishing receiving went right down, I forget the number but <1000us.

    This is all on Ubuntu Linux 2.6.31-19. Thanks for any help

  • Pepper
    Pepper about 14 years
    Many thanks for your answer. I didn't think it would make so much difference. I've done what you suggested and run each on a different host, and the time-to-serves are now negligibly different with vs. without proxy.
  • Pepper
    Pepper about 14 years
    Thanks, I've done what you suggested (real network) and am now getting expected results.
  • Jeremy Friesner
    Jeremy Friesner about 14 years
    As an aside, it is possible to have your cake and eat it too in this case: you can leave Nagle's enabled on your socket, and then when you've written all the data to the socket that you're likely to write for a while, disable Nagle's, send() 0 bytes on the socket, then re-enable Nagle's again. The send() will force any pending data to be sent immediately.
  • Jin
    Jin over 12 years
    THIS answer is the best one to the question of "How do I reduce socket latency?" -- which IS the question. Don't uprate the answer that solves OP's specific problem, that's not what makes this site so powerful. This site is great because we can find solutions to OUR problems quickly. So if you're a real StackOverflow'er, this kind of answer is the one you rate up (at least IMHO lol).
  • nh2
    nh2 about 9 years
    "Nobody has 600us of latency to their origin server" - is this really true? Normal Ethernet has around 0.5ms latency (tried with ping).
  • Paul Draper
    Paul Draper almost 9 years
    @nh2, right, so start there and then add the latency of most servers, and I think you've got 1ms+.
  • dsign
    dsign almost 9 years
    -1: You can certainly learn a lot about latency with just localhost, like for example that there is a Nagle algorithm and a 40 ms delay with very short packets.
  • Adam
    Adam about 6 years
    Thanks! Setting TCP_NODELAY with setsockopt() fixed my problem.