How to kill a hung service on Windows 2008R2

Even though it seems you've figured this out already, the problem is that the process is waiting on the kernel for something. (This is usually a driver-level problem, but not always.) A process in that state can't be terminated from user mode, which is why every kill tool reports success and nothing changes. The only way to kill such a process is to unload the kernel, which, of course, you can't do without rebooting.

It might be worth trying some kernel debugging (does this tool work on 2008 R2?) in the hope of narrowing down the specific cause or conflict, but your options for handling the problem are either living with it or rebooting the server to eliminate it.

Is there a reason you haven't considered living with it? If it's just a zombie process and it's not impacting anything, I'd think you could put off a reboot until a maintenance window or a more opportune time. That's typically my approach when a zombie or hung process isn't interfering with anything: take care of it during the next patch cycle or scheduled maintenance window.


Author by

Kev


Updated on September 18, 2022

Comments

  • Kev
    Kev almost 2 years

    I have a Windows 2008R2 server running NSClient++. For some reason the service has got its knickers in a twist and stopped responding to Nagios polling.

    When I tried to restart the service, the service manager took a long time trying to stop it, then eventually gave up with a message along the lines of "the service took too long to respond". But...it also started a new instance of the service.

    If I look in Task Manager or tasklist I can now see two instances of nsclient++.exe running.
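To tell which of the two instances the service manager still considers "the service", the output of `sc queryex` can be compared against the PIDs shown by tasklist. The following is a minimal sketch, not a definitive tool: the service name `nscp` is an assumption (use whatever name the NSClient++ service is registered under on your box), and `parse_sc_queryex` is a hypothetical helper name.

```python
import re
import subprocess


def parse_sc_queryex(text):
    """Extract (state_name, pid) from `sc queryex <service>` output.

    Either element is None if the corresponding line is missing.
    """
    # Matches lines such as "STATE              : 3  STOP_PENDING"
    state = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", text, re.MULTILINE)
    # Matches lines such as "PID                : 6672"
    pid = re.search(r"^\s*PID\s*:\s*(\d+)", text, re.MULTILINE)
    return (
        state.group(1) if state else None,
        int(pid.group(1)) if pid else None,
    )


def query_service(service_name="nscp"):
    """Run `sc queryex` for the (assumed) service name and parse the result."""
    result = subprocess.run(
        ["sc", "queryex", service_name], capture_output=True, text=True
    )
    return parse_sc_queryex(result.stdout)
```

Any nsclient++.exe PID in tasklist that differs from the PID reported here is presumably the orphaned instance left over from the failed stop.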

    I tried to kill both of these using:

    • right click and "End Process" in task manager - pretends to kill the process and reports no errors (for example Access Denied) but the process is still there.

    • taskkill /PID <proc id> /F - reports SUCCESS: The process with PID 6672 has been terminated. but the process is still running.

    • downloaded SysInternals PsTools and ran pskill <PID> - reports Process <PID> killed - yet the process is still there.

    • scheduled the kill with at hh:mm pskill <PID> so that pskill ran under the SYSTEM account ... and you guessed it, the process is still running.

    All of the above were run in an Administrator command prompt.
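Since every tool above reports success while the process survives, it can help to script the "kill, then check again" step rather than trusting the tool's message. Below is a rough sketch under stated assumptions: it shells out to the same `taskkill /PID <pid> /F` used above and to `tasklist /FO CSV /NH`, and the function names and the two-second pause are illustrative choices, not anything from NSClient++ or Sysinternals.

```python
import csv
import io
import subprocess
import time


def pids_in_tasklist(csv_text, image_name):
    """Return PIDs whose image name matches, given `tasklist /FO CSV /NH` output."""
    pids = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Rows look like: "nsclient++.exe","6672","Services","0","12,345 K"
        if len(row) >= 2 and row[0].lower() == image_name.lower() and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def force_kill_and_verify(pid, image_name="nsclient++.exe", wait_seconds=2):
    """Run taskkill /F, wait briefly, then re-list processes.

    Returns True only if the PID no longer appears under that image name.
    """
    subprocess.run(["taskkill", "/PID", str(pid), "/F"], capture_output=True, text=True)
    time.sleep(wait_seconds)  # arbitrary pause to let a normal exit complete
    listing = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True
    ).stdout
    return pid not in pids_in_tasklist(listing, image_name)
```

A `False` result after taskkill has printed SUCCESS is the symptom described here: the termination request was accepted, but the process has not actually gone away.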

    Other than a reboot which is not really ideal (the box is a fairly mission critical production server), what else can I try?

    The server isn't under any resource pressure (memory, CPU, disk etc) and everything running on it is chugging along just fine.

    A quick look at the Threads tab in SysInternals Process Explorer shows that all of these nsclient++.exe instances are stuck unloading.

    [Screenshot: Process Explorer Threads tab for nsclient++.exe]

    As an aside, I also tried killing all of the TCP connections for these zombie(?) processes (with TCPView) in the hope that I could start a new instance and it would be able to grab port 5666. Then we could reboot the server when things are quieter, but alas that didn't work.
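To confirm from the command line which process is still holding port 5666 (the same information TCPView shows), the output of `netstat -ano` can be filtered by local port. This is only a sketch: the parser assumes the usual five-column TCP rows that netstat prints on Windows, and `pids_holding_port` is a hypothetical helper name.

```python
import subprocess


def pids_holding_port(netstat_text, port):
    """Return sorted PIDs that own a TCP socket on the given local port.

    Expects text in the format produced by `netstat -ano` on Windows.
    """
    owners = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows have five fields: Proto, Local Address, Foreign Address, State, PID
        if len(fields) == 5 and fields[0] == "TCP":
            local_port = fields[1].rsplit(":", 1)[-1]
            if local_port == str(port) and fields[4].isdigit():
                owners.add(int(fields[4]))
    return sorted(owners)


def current_port_owners(port=5666):
    """Run `netstat -ano` and report which PIDs hold the port right now."""
    output = subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout
    return pids_holding_port(output, port)
```

If the only PIDs returned belong to the unkillable instances, a fresh instance will not be able to bind the port until they are gone.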

  • Kev
    Kev almost 12 years
    Sadly too late to examine these processes in WinDbg, the infrastructure guys have rebooted the server. But handy to know for next time.
  • Kev
    Kev almost 12 years
    The other problem was that we couldn't live with it like this. The service is NSClient++, which we use in conjunction with Nagios. I couldn't even get a fresh service exe to run and respond to polling requests, I think because these zombied processes were still hanging onto port 5666, which it listens on. (I could certainly see one of them still holding the port in TCPView, and I couldn't close it.)
  • HopelessN00b
    HopelessN00b almost 12 years
    Well, that's certainly a very good reason not to live with it.
  • Simon Catlin
    Simon Catlin almost 12 years
    If it happens again, don't forget another one of Mark Russinovich's babies - Process Monitor. Point procmon at the process to see what it's doing. Wonderful tool.
  • Kev
    Kev almost 12 years
    @SimonCatlin - aye, I did that too but nothing really jumped out at me.