/proc/sys/kernel/hung_task_timeout_secs error

Your kernel has the hung-task detection feature enabled. In most cases this is harmless, since the task is not really stuck. You can use the command suggested in the message itself to disable the warning.

I have run into a similar problem before. In the kernel, if you use down(semaphore) instead of down_interruptible(semaphore) to wait on a semaphore, and the semaphore is not released (up) within 120 seconds, that INFO message is printed. It can actually be fine for a semaphore not to be released within 120 seconds.

If everything works fine apart from this INFO message, just disable it by echoing 0 to /proc/sys/kernel/hung_task_timeout_secs.
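
A minimal sketch of doing that from a shell, assuming root privileges. The function name `set_hung_task_timeout` and the overridable `HUNG_TASK_FILE` variable are hypothetical, introduced only for illustration; the path is the one named in the kernel message. Writing 0 disables the check.

```shell
#!/bin/sh
# Path from the kernel message; overridable so the function can be tried on a scratch file.
HUNG_TASK_FILE="${HUNG_TASK_FILE:-/proc/sys/kernel/hung_task_timeout_secs}"

# set_hung_task_timeout SECONDS
# Prints the current value, then writes the new one (0 disables the warning).
set_hung_task_timeout() {
    new_value="$1"
    # Accept only a non-empty string of digits.
    case "$new_value" in
        ''|*[!0-9]*) echo "usage: set_hung_task_timeout SECONDS" >&2; return 1 ;;
    esac
    if [ ! -w "$HUNG_TASK_FILE" ]; then
        echo "cannot write $HUNG_TASK_FILE (not root, or feature not available)" >&2
        return 1
    fi
    echo "current timeout: $(cat "$HUNG_TASK_FILE")"
    echo "$new_value" > "$HUNG_TASK_FILE"
}
```

Calling `set_hung_task_timeout 0` as root silences the warning until the next reboot. To keep the setting across reboots, a line such as `kernel.hung_task_timeout_secs = 0` in /etc/sysctl.conf should have the same effect. Note that this only hides the message; it does not fix whatever is blocking the task.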

Balasarius
Author by

Balasarius

Updated on September 18, 2022

Comments

  • Balasarius
    Balasarius over 1 year

    I am building a CentOS 6.2 webserver on a Hyper-V Cluster and have hit a major issue. The current setup is this:

    Operating System: CentOS 6.2, Minimal Server install. Installed software includes:

    • Hyper-V Linux Integration Drivers
    • Webmin
    • Apache 2
    • MySQL 5.*
    • PHP 5 with the APC PHP accelerator
    • The webserver is running Moodle (in case that makes any difference)

    The last thing I installed was the "Hyper-V Linux Integration Drivers", after which, a few days later, the entire OS locked up with "task blocked for more than 120 seconds" errors scrolling down the console. I now find that this error rears its ugly head after a prolonged running time, and that it can be triggered by a yum update command, showing up right after the second download.

    Picture of terminal

    I've tried fsck as suggested on other forums, but to no avail. Most other forums point to hardware issues as the cause, which does not fit here: the Hyper-V cluster is currently running a dozen high-demand services with no issues.

    EDIT: (I should have mentioned this the first time around) this error is the last thing I see before the whole kernel locks up. After this, the only way to recover is to hard reboot the virtual machine.

    Any help solving or diagnosing the problem is much appreciated.

    Thank You (in advance)

    UPDATE 1:

    OK, serious escalation here. I've now got 2 other Linux machines on the same cluster with related issues, and all of them fall over at the same time. I have an Ubuntu server complaining of a status: {DRDY} error, a second CentOS 6.2 server complaining of the same error as the first, and the first server has crashed in a new way...

    Ubuntu DRDY Error

    The First CentOS Kernel Crash

    UPDATE 2: OK, we got the cluster fixed, but the problem still persisted with the two CentOS VMs. I have fixed the first problem CentOS VM by moving it to an un-clustered Hyper-V host and blacklisting all the Linux Integration Drivers, followed by completely uninstalling them. It appears that the Linux Integration Drivers in the guest machine, along with the issues on the clustered system, cause this problem. The second CentOS machine was not so lucky: even though I moved it to the un-clustered Hyper-V host and blacklisted the drivers, it still crashed, unfortunately right in the middle of uninstalling the Linux Integration Drivers. Now I have the very big issue of the driver package showing in the rpm DB while only being half there.

    Question: is there a way of completely removing these drivers without using rpm or yum? Both now complain that the package is there and not there, and fail to remove it, and doing anything too disk heavy, like using rpm or yum, causes the whole system to crash again. At this rate I may end up completely rebuilding it, but I need it to stabilise soon as it hosts critical services.

    THE ANSWER

    OK, my final troublesome CentOS VM has now been fixed. To remove the botched Linux Integration Drivers and stabilise the system, I did the following:

    1. I downloaded a CentOS Live CD and booted it up in the faulty VM.
    2. I then created a new folder with

      $ mkdir /mnt/OS

    3. I then mounted the root filesystem of the faulty OS (which in my case is sda2)

      $ mount /dev/sda2 /mnt/OS

    4. I then mounted the boot partition (which in my case is sda1)

      $ mount /dev/sda1 /mnt/OS/boot

    5. I then chrooted into the faulty OS

      $ chroot /mnt/OS

    6. and then ran the rpm remove-package command (ensuring all duplicates were removed in the process)

      $ rpm -e --allmatches kmod-microsoft-hyper-v

    7. This process can take a little time, but once it was done I could reboot the VM. My system was up and running without the integration drivers, and it now allows package updates without crashing.
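
The steps above can be sketched as a single script to run as root from the Live CD. This is only an illustration under stated assumptions: the device names and package name are the ones from this post, the `run` helper and the `DRY_RUN` switch are hypothetical additions that print each command instead of executing it, and `mkdir -p` and the one-line `chroot ... rpm` form are small variations on what I typed by hand.

```shell
#!/bin/sh
# Sketch of the recovery steps above. Device names and the package name are
# the ones from this post; adjust them for your own layout.
ROOT_DEV="${ROOT_DEV:-/dev/sda2}"   # root filesystem of the faulty OS
BOOT_DEV="${BOOT_DEV:-/dev/sda1}"   # boot partition of the faulty OS
MNT="${MNT:-/mnt/OS}"               # mount point created on the Live CD
PKG="${PKG:-kmod-microsoft-hyper-v}"
DRY_RUN="${DRY_RUN:-1}"             # hypothetical switch: 1 = only print commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remove_integration_drivers() {
    run mkdir -p "$MNT" &&
    run mount "$ROOT_DEV" "$MNT" &&
    run mount "$BOOT_DEV" "$MNT/boot" &&
    # chroot makes rpm work against the faulty system's own RPM database
    run chroot "$MNT" rpm -e --allmatches "$PKG"
}
```

With the default `DRY_RUN=1`, calling `remove_integration_drivers` only lists the four commands so they can be checked first; setting `DRY_RUN=0` executes them, stopping at the first one that fails.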

    After this I think I'll leave the Hyper-V drivers out of the mix until I am comfortable that they work and are stable.

    • Admin
      Admin over 11 years
      Those messages mean that your kernel is having problems writing data to disk.
    • Admin
      Admin over 11 years
      All of those are indicating serious disk problems. Assuming you're sure your host machine is actually OK, as well as its attached storage system (or SAN, etc.), CentOS 6.2 is on the list of supported OSs. You should probably contact Microsoft.
    • Admin
      Admin over 11 years
      Also, I guess, you may want to try Serverfault. They probably have more experience with Hyper-V.
    • Admin
      Admin over 11 years
      The EVA is now fixed, but the botched Hyper-V driver had me running around in circles. I've updated the main post with my resolution.
  • Balasarius
    Balasarius over 11 years
    Sorry, this INFO message is the last thing I get before the whole kernel locks up. I forgot to mention that bit; updating the question now.