How to fix very high w_await on a Linux desktop?

You say that replacing the hard drive fixed it in your case, which is good. In my experience, though, these symptoms are most often caused by a motherboard-level SATA hardware problem. I have seen it only on laptops recently, and on laptops I have never been able to repair it: swapping the drive had no effect. So my suggestion is to try replacing the drive first, and if that doesn't help, the motherboard is the likely culprit.

Incidentally, I reinstalled the OS several times and thought I had worked around the glitches (which occurred both in Windows and in every Linux version I used), but they came back after periods of heavy use, which leads me to think there is a thermal component to the hardware/chipset glitch.

(All of this assumes the problem didn't start right after a kernel switch, which would point to a glitch in the kernel drivers. But since you tried a range of kernel versions, your symptoms correspond quite clearly to my recent problems.)


Author: Shai Berger

Updated on September 18, 2022

Comments

  • Shai Berger
    Shai Berger over 1 year

    My Linux (Debian sid) desktop started to become sluggish in the last few weeks. When I investigated, I found that:

    1. There is no ram shortage -- the system regularly uses only half of its 4G, there is more than 1G free even when counting caches and buffers;
    2. The sluggishness is associated with file access; for example, opening a folder in KMail induces a mini-freeze;
    3. When it gets sluggish, the CPU is spending a lot of time in iowait.
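
The iowait observation in point 3 can be quantified from `/proc/stat`. The following is a minimal sketch, not the method used in the question: it assumes the standard field order of the aggregate `cpu` line (user, nice, system, idle, iowait, ...), and the helper names `cpu_times`, `iowait_percent` and `sample` are made up for illustration.

```python
import time

def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")

def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two snapshots."""
    deltas = [b - a for a, b in zip(before, after)]
    # Assumed field order: user nice system idle iowait irq softirq steal ...
    # Guest time is already counted in user/nice, so sum only the first eight.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0

def sample(interval=5.0, path="/proc/stat"):
    """Take two snapshots 'interval' seconds apart and return the iowait share."""
    with open(path) as f:
        before = cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = cpu_times(f.read())
    return iowait_percent(before, after)
```

Calling `sample()` during one of the mini-freezes and again while the machine is idle should show whether the iowait share really spikes with file access.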

    When I dug further, I found things like this:

    $ iostat -x -d /dev/sda
    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sda               0.05     7.90    3.14    2.41    23.27    40.94    23.11    12.02 2163.14   57.59 4906.16  31.58  17.55
    

    If I understand correctly, the value of w_await (almost 5000) is crazy high, the value of await (average of r_await and w_await?) is very high as a result, and otherwise things are normal.
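
The guess about `await` can be checked with a little arithmetic. This sketch assumes that `await` is the mean wait over all requests, so that each direction is weighted by its request rate (`r/s` and `w/s`) rather than averaged one-to-one; the function name `weighted_await` is made up for illustration, and the figures are copied from the iostat line above.

```python
# Figures copied from the iostat output in the question.
r_per_s, w_per_s = 3.14, 2.41
r_await_ms, w_await_ms = 57.59, 4906.16

def weighted_await(r_rate, w_rate, r_await, w_await):
    """Mean wait per request, weighting each direction by its request rate."""
    total = r_rate + w_rate
    if total == 0:
        return 0.0
    return (r_rate * r_await + w_rate * w_await) / total

simple_mean = (r_await_ms + w_await_ms) / 2
weighted = weighted_await(r_per_s, w_per_s, r_await_ms, w_await_ms)

print(f"simple mean:   {simple_mean:.2f} ms")
print(f"weighted mean: {weighted:.2f} ms (iostat reported 2163.14 ms)")
```

The weighted figure lands within rounding error of the reported `await`, while the plain average does not, which suggests `await` is the request-weighted mean and is being dragged up almost entirely by writes.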

    When I look at iotop at times of excessive sluggishness, I usually see all zeroes, with blinks of 99.9% iowait for kjournald, flush and sometimes the processes I expect (e.g. KMail).

    The system has been used as a "rolling distro" for several years, all filesystems are ext3.

    Oh, and of course: While swap is defined (on this disk, which is the only one constantly mounted in the system), it is hardly ever used (as I said, 4G are nowhere near exhausted).

    The only errors I've seen in dmesg are the cries of processes that have been blocked for over 120 seconds (at the peak of trouble -- in the first few minutes after reboot), mainly syslog. There seems to be no other indication of a disk fault (smartctl says everything has always been OK, except for one time long ago when the disk airflow heated up).
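
To see which processes get blocked most often, the kernel log can be tallied per task. This is a rough sketch that assumes the usual hung-task warning wording (`INFO: task NAME:PID blocked for more than N seconds`); the exact message may differ between kernel versions, and `blocked_tasks` is a made-up helper name.

```python
import re
from collections import Counter

# Assumed wording of the kernel's hung-task warning; adjust if your log differs.
HUNG_TASK = re.compile(r"INFO: task (.+?):(\d+) blocked for more than (\d+) seconds")

def blocked_tasks(log_text):
    """Count hung-task warnings per process name in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the saved output of `dmesg` and looking at `blocked_tasks(text).most_common()` shows whether the stalls are spread across everything that writes to the disk or concentrated in a few processes.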

    I'm using linux 3.2; I've tried reverting all the way back to 2.6.38, to no avail.

    Is it the disk? Have the file-systems gone crazy? What more can I check?

    • David Schwartz
      David Schwartz about 12 years
      You spend a lot of time doing things other than explaining the problem, which only gets one sentence explaining that it "started to become sluggish in the last few weeks". Is it sluggish only with disk access? Did it gradually get more sluggish? Did you rule out things like a CPU fan failure or CPU overheating? Did you rule out RAM shortage? (What's the output of free?) It seems like you jumped to the conclusion that the disk is responsible with no justification mentioned in your question.
    • Jjames
      Jjames about 12 years
      Is it the disk? That's a question only you can answer. What does fsck say? What's the SMART state of the disk? Does the hard disk make any kind of noise?