How can I monitor the length of the accept queue?

Solution 1

To check whether your queue is overflowing, use either netstat or nstat:

[centos ~]$ nstat -az | grep -i listen
TcpExtListenOverflows           3518352            0.0
TcpExtListenDrops               3518388            0.0
TcpExtTCPFastOpenListenOverflow 0  0.0

[centos ~]$ netstat -s | grep -i LISTEN
    3518352 times the listen queue of a socket overflowed
    3518388 SYNs to LISTEN sockets dropped

Reference: https://perfchron.com/2015/12/26/investigating-linux-network-issues-with-netstat-and-nstat/
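
The same counter can be read from a script, which makes it easier to sample periodically. This is only a sketch: it assumes the Linux /proc/net/netstat layout (a "TcpExt:" line of counter names followed by a "TcpExt:" line of values), and the function name tcpext_counter is made up for illustration.

```shell
# Print one TcpExt counter from input in /proc/net/netstat format.
# Usage: tcpext_counter ListenOverflows < /proc/net/netstat
tcpext_counter() {
    awk -v name="$1" '
        $1 != "TcpExt:" { next }                      # skip IpExt and other groups
        !seen {                                       # first TcpExt line: counter names
            for (i = 2; i <= NF; i++) col[$i] = i
            seen = 1
            next
        }
        seen == 1 {                                   # second TcpExt line: values
            if (name in col) print $(col[name])
            seen = 2
        }
    '
}
```

Running `tcpext_counter ListenOverflows < /proc/net/netstat` twice, a few seconds apart, and comparing the two values shows whether overflows are happening right now rather than at some point since boot.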

To monitor your queue sizes, use the ss command. Counting the sockets in the SYN-RECV state gives the number of connections that are still waiting to complete the handshake:

$ ss -n state syn-recv sport = :80 | wc -l
119

Reference: https://blog.cloudflare.com/syn-packet-handling-in-the-wild/
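
The SYN-RECV count only covers connections that have not finished the handshake. For the accept queue itself, my understanding (the Cloudflare article above describes this too) is that on Linux `ss -lnt` reports, for each listening socket, the current accept queue length in the Recv-Q column and the configured backlog in Send-Q. Below is a minimal sketch that reformats that output. The function name accept_queues is hypothetical, and the column order State, Recv-Q, Send-Q, Local Address:Port is an assumption about your ss version.

```shell
# Print "address queued=N backlog=M" for every listening TCP socket.
# Expects the output of `ss -lnt` on stdin; the header line is ignored
# because its first field is not "LISTEN".
accept_queues() {
    awk '$1 == "LISTEN" { printf "%s queued=%d backlog=%d\n", $4, $2, $3 }'
}
```

Typical use would be `ss -lnt | accept_queues`, run from whatever collects your metrics.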

Solution 2

Sysdig provides some of this information at the end of each accept syscall, as the queuelen argument. It also shows the maximum length of the queue as queuemax.

7598971 21:05:30.322229280 1 gunicorn (6451) < accept fd=13(<4t>127.0.0.1:45882->127.0.0.1:8003) tuple=127.0.0.1:45882->127.0.0.1:8003 queuepct=0 queuelen=0 queuemax=10

As far as I'm aware, it provides no mechanism to know exactly when or how many times the queue has overflowed. It would also be cumbersome to integrate this with periodic monitoring by collectd or similar.
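
If you do want to pull a number out of those event lines, something like the following could serve as a starting point. It is a sketch that assumes each line carries a whitespace-separated queuelen=N field, as in the sample above; the function name max_queuelen is hypothetical.

```shell
# Read Sysdig accept event lines on stdin and print the largest queuelen seen.
# Prints nothing if no line contains a queuelen field.
max_queuelen() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^queuelen=[0-9]+$/) {
                    n = substr($i, 10) + 0     # text after "queuelen="
                    if (n > max) max = n
                    seen = 1
                }
            }
        }
        END { if (seen) print max + 0 }
    '
}
```

This only reports the peak over whatever lines it is given; it still cannot tell you when the queue actually overflowed.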

Solution 3

What you are looking for is this entry in the output of the sysctl -a command:

net.ipv4.tcp_max_syn_backlog = 4096

In the example above, the backlog of connections in the SYN state is at most 4096. You can increase it depending on how much RAM your server has. I consider a backlog of 32K a good starting point when tuning heavily loaded web servers.

Also make sure the following is NOT set to 1:

net.ipv4.tcp_abort_on_overflow = 0

When it is set to 1, the kernel resets connections that arrive while the backlog is full, instead of ignoring the final ACK and letting the client retry.

You can easily check values with sysctl -a | grep backlog or sysctl -a | grep overflow.
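
Both settings can be checked in one pass. The sketch below assumes the "key = value" line format that sysctl prints; the function name backlog_settings is made up for illustration.

```shell
# Summarise the two settings discussed above from sysctl-style "key = value" lines.
backlog_settings() {
    awk -F' *= *' '
        $1 == "net.ipv4.tcp_max_syn_backlog"   { print "SYN backlog limit: " $2 }
        $1 == "net.ipv4.tcp_abort_on_overflow" {
            print "abort on overflow: " ($2 == 1 ? "ON" : "off")
        }
    '
}
```

For example: `sysctl -a 2>/dev/null | backlog_settings`.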

Additionally, you can find the "dropped" field in the output of the

ifconfig -a

command. It shows how many packets were dropped on each interface, along with errors and other counters.

For logging dropped packets there is a [paywalled] article for RHEL 7: https://access.redhat.com/solutions/1191593

For further research you may read http://veithen.io/2014/01/01/how-tcp-backlog-works-in-linux.html

It quotes W. Richard Stevens's book TCP/IP Illustrated:

The queue limit applies to the sum of […] the number of entries on the incomplete connection queue […] and […] the number of entries on the completed connection queue […].

It also states that:

The completed connection queue is almost always empty because when an entry is placed on this queue, the server’s call to accept returns, and the server takes the completed connection off the queue.

The accept queue may therefore appear completely empty, and you will have to tune your web server so that it accepts connections from the "total aggregate" queue faster.

Author: Phil Frost

Updated on September 18, 2022

Comments

  • Phil Frost almost 2 years

    I have a hypothesis: sometimes TCP connections arrive faster than my server can accept() them. They queue up until the queue overflows and then there are problems.

    How can I confirm this is happening?

    Can I monitor the length of the accept queue or the number of overflows? Is there a counter exposed somewhere?

    • Satō Katsura over 7 years
      You're looking for netstat.
    • Phil Frost over 7 years
      As far as I can tell, netstat only shows the send and receive queue lengths, which is not the same as the accept queue.
    • Satō Katsura over 7 years
      Right, looking at the sources it seems those flags are for UNIX sockets. For TCP you could just count SYN_RECV though. There is no other queue beyond that. I suppose the kernel can be told somehow to log packets dropped because of too many half-open connections, but it has been some 10+ years since I looked at networking with Linux, so I have no idea how to do that. On a side note: you aren't waiting for accept() to do its job, you're waiting for ACKs to arrive from the connecting hosts to complete the connections.
  • Scott - Слава Україні about 5 years
    While there seems to be some useful information here, I’m not sure it answers the question.  If I ask, “What’s the most number of people that have ever been in this auditorium at one time?”, and you point to a sign on the wall that gives the maximum capacity, you haven’t answered the question.
  • Phil Frost about 5 years
    Indeed I'm looking for the current length of the queue, not the maximum length of the queue.
  • DevilaN over 4 years
    It should be tcp_max_syn_backlog, not tcp_max_SYNC_backlog as in your answer
  • Aaron C. de Bruyn about 4 years
    Yeah...and StackOverflow gives you a retarded error message when you try to change it: "Edits must be at least 6 characters; is there something else to improve in this post?"