finding out what causes high sendmail load average


Start by running mailq to see how many messages are in the queue. I suspect there are a lot, especially if this is a web server running a PHP application that sends email (and is being abused by spammers).

First, try reversing the default sendmail load-management thresholds:

define(`confREFUSE_LA', `8')dnl
define(`confQUEUE_LA', `12')dnl

Put these in your sendmail.mc, regenerate sendmail.cf, and restart sendmail according to your operating system's directions.

Using the output of mailq, locate the queue files and inspect the contents of the queued messages. This will give you an idea of who is sending these emails and why, assuming that too much email is in fact the cause of your problems.

Other causes might be an abusive (misconfigured) mail client (POP3 or IMAP), or something else entirely that causes high load.
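As a complement to reading the queue files, a rough tally of senders from the mail log can show who is submitting mail. The following is a minimal sketch, assuming syslog-style sendmail lines that carry a `from=` field like the log excerpts in the question; the function name `count_senders` is hypothetical, not part of sendmail.

```python
import re
from collections import Counter

# Matches the from= field of a sendmail log line, with or without
# angle brackets, e.g. "from=www-data," or "from=<user@example.com>,".
FROM_RE = re.compile(r"\bfrom=<?([^>,\s]*)>?")


def count_senders(lines):
    """Count how many logged messages each envelope sender submitted."""
    senders = Counter()
    for line in lines:
        match = FROM_RE.search(line)
        if match:
            # An empty sender (bounce messages) is reported as "<>".
            senders[match.group(1) or "<>"] += 1
    return senders
```

To try it on a real log, pass an open file, for example `count_senders(open("/var/log/mail.log"))`, and print `most_common(10)`. The log path is an assumption and differs between distributions.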

Depending on your progress you may need monitoring in front of the machine to see (and analyse) what comes in and what goes out.

Again, depending on your findings, you may need to ask this question again over at security.stackexchange.com.

EDIT: Keep in mind that sendmail starts refusing to handle email once the machine's load average rises above a certain threshold. Sendmail itself need not be responsible for the load reaching that threshold. Other processes may be to blame: the web server, a cron job, or a query on a misconfigured MySQL that causes the machine to swap.
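To make the threshold behaviour concrete, here is a simplified sketch that compares the machine's 1-minute load average against the two limits. It assumes the values 8 (refuse) and 12 (queue) from the define lines above; the function name `load_state` is hypothetical, and the plain comparison is an approximation, since sendmail's real decision logic has more detail.

```python
import os


def load_state(load, queue_la=12.0, refuse_la=8.0):
    """Return which sendmail load limits a given load average has reached.

    queue_la and refuse_la mirror confQUEUE_LA and confREFUSE_LA; the
    defaults here are the values from the answer, not sendmail's own.
    """
    state = []
    if load >= refuse_la:
        state.append("refusing connections")
    if load >= queue_la:
        state.append("queue-only")
    return state or ["normal"]


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    one_minute, _, _ = os.getloadavg()
    print(f"load {one_minute:.2f}: {', '.join(load_state(one_minute))}")
```

With a load average of 48, as in the log line from the question, both limits are exceeded no matter which process is generating the load.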

Author: Robin Manoli

Web developer and strategist

Updated on September 18, 2022

Comments

  • Robin Manoli
    Robin Manoli almost 2 years

    I'm investigating a server which I have not set up, for which there is no technician to answer questions.

    The problem is that the load average is high, which makes the server unable to send emails! It happens periodically, sometimes several times per minute, and the load average can be as high as 80!

    I noticed that sometimes it might take an hour before an email can actually be sent, and I would like to understand better what is going on on the server.

    Periodically (sometimes several times per minute) I get messages like this in the mail log:

    Feb  9 01:37:54 mydomain sm-mta[999]: rejecting connections on daemon MTA-v4: load average: 48
    

    I don't know the cause of this, but it seems like emails are not actually being sent, so I wonder what could be possibly going on.

    Occasionally, emails actually seem to be sent. The only thing I know is sending emails is the web server, so the e-mails sent from www-data make sense. I don't know what could be sending those.

    Feb  9 01:54:22 mydomain sendmail[6704]: r1...: from=www-data, size=1380, class=0, nrcpts=1, msgid=<[email protected]>, relay=www-data@localhost
    Feb  9 01:54:23 mydomain sm-mta[6706]: r1... from=<[email protected]>, size=1482, class=0, nrcpts=1, msgid=<[email protected]>, proto=ESMTP, daemon=MTA-v4, relay=localhost [127.0.0.1]
    Feb  9 02:01:02 mydomain sendmail[6751]: r1...: from=root, size=323, class=0, nrcpts=1, msgid=<[email protected]>, relay=root@localhost
    Feb  9 02:01:02 mydomain sm-mta[6752]: r1...: from=<[email protected]>, size=597, class=0, nrcpts=1, msgid=<[email protected]>, proto=ESMTP, daemon=MTA-v4, relay=localhost [127.0.0.1]
    

    netstat -ntop shows only apache2 processes.

    What could be some ways to tackle this problem?

    • jojoo
      jojoo over 11 years
      is the spike in the load caused by sendmail or by another process or daemon running on the server? with a load of 80 you should not debug sendmail, you should find the cause of the high load. when you have a normal load you can start debugging sendmail. use top as a starting point to debug the high load
    • Robin Manoli
      Robin Manoli over 11 years
      using top there is basically no activity at all... mysql is using less than 5% cpu... yet, at the same time this process is going on: sendmail: MTA: rejecting connections on daemon MSP-v4: load average: 49
    • Jure1873
      Jure1873 over 11 years
      You could try iotop to check if it's related to I/O.
    • Brotsky Engineer
      Brotsky Engineer over 11 years
      Sendmail rejecting emails because the load is too high does not mean that sendmail is causing the load. It is far more likely that the load is being caused by something else. Sendmail is merely responding to this high load and refusing to contribute further to it. Why don't you post the output of top sorted by CPU usage?
    • Robin Manoli
      Robin Manoli over 11 years
      @drone.ah now when load average is about 80, mysqld has 5-15% cpu usage... other than that there is only apache, ssh and top going on
    • Brotsky Engineer
      Brotsky Engineer over 11 years
      There may be only apache but how many apache processes are there? If there are a lot of apache processes, you will find that is more likely the culprit. Also, how much time is spent in iowait? A top output would answer some of these questions..
    • Robin Manoli
      Robin Manoli over 11 years
      well, the situation seems to be that mysql is the culprit... but i don't know if it's because someone is at the website, or if there could be some backup process in the background somewhere... anyway, it seems pointless to refuse mail handling when the cpu is only using around 10% cpu power, so i decided to change the sendmail config files instead
  • Robin Manoli
    Robin Manoli over 11 years
    there are only 15 mails (/var/spool/mqueue-client (10 requests)+/var/spool/mqueue (5 requests)), at the same time as i got this process running: sendmail: MTA: rejecting connections on daemon MSP-v4: load average: 50
  • Robin Manoli
    Robin Manoli over 11 years
    the really strange thing is that the load average doesn't seem to be mails that are actually being sent... is it possible that spammers try to use the smtp server from the outside?
  • adamo
    adamo over 11 years
    You should monitor your mail.log and see whether you really get that many incoming connections. But since the machine is running a web server, maybe you should have a look at the web server's access_log too.
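
Following the suggestion to monitor the mail log, here is a rough sketch that counts the "rejecting connections" entries per minute and records the highest load average they report. It assumes log lines shaped like the one quoted in the question; the function name `summarize_rejections` is hypothetical.

```python
import re
from collections import Counter

# Captures the timestamp up to the minute and the reported load average
# from lines such as:
#   Feb  9 01:37:54 host sm-mta[999]: rejecting connections on daemon MTA-v4: load average: 48
REJECT_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s.*"
    r"rejecting connections on daemon \S+ load average: (?P<load>\d+)"
)


def summarize_rejections(lines):
    """Return (rejections per minute, peak load average seen)."""
    per_minute = Counter()
    peak = 0
    for line in lines:
        match = REJECT_RE.search(line)
        if match:
            per_minute[match.group("minute")] += 1
            peak = max(peak, int(match.group("load")))
    return per_minute, peak
```

Comparing the busiest minutes with the web server's access_log for the same times may show whether the load spikes line up with web traffic.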