Windows server losing connectivity


Solution 1

The problem went away as suddenly as it came. The switch has been replaced in the meantime, but the system had already been running without issues for a couple of days before the replacement.

Edit: while I have no solid proof for this, it looks very much like the responsible piece of software was the WinGate proxy installed on the same server. Users' reports indicate that the issues appeared after the product was updated and disappeared after it was uninstalled. A similar scenario involving TDI filter drivers has been documented here, although the resolution only covers Windows Server 2008. And WinGate does indeed seem to use TDI.

Solution 2

I would do two things to try to fix the problem:

First, remove the AV. Completely. Don't just disable one part or another, uninstall it. Second, assuming it still fails from time to time, change the NIC and cables.


Author: the-wabbit

Updated on September 18, 2022

Comments

  • the-wabbit
    the-wabbit almost 2 years

    A Windows Server 2008 R2 (SBS 2011 really) with Service Pack 1 suddenly started to exhibit a network connectivity problem that is rather hard to debug:

    Occasionally (always during business hours, approximately 1-2 times a day) network connectivity is lost. The host itself keeps running, and I can use the console interactively. The Ethernet link stays up, according to both the switch management interface and the NIC's LEDs. The IP configuration is still attached to the interface and looks valid (ipconfig produces sane output). However, not even ARP lookups complete successfully.

    A list of things which did not help matters:

    • resetting the switch port or the switch
    • disabling / re-enabling the server's interface (either in ncpa.cpl or in device manager)
    • unplugging / re-plugging the network cable

    Shutting down and restarting the server always helps - it is usable as ever after startup.

    Things checked:

    • the event logs do not list any suspicious events
    • the switch's network port counters do not show errors
    • the network connection does not show any anomalies (packet loss, latency, poor performance) as long as data is flowing
    • the cable and the NIC have been replaced to rule out an obvious hardware failure
    • power management for the NIC has been disabled in the device properties (device manager)
    • the NIC used is an Intel PRO/1000 CT with the Intel 82574L network processor (the same type is used in the on-board NIC ports)
    • the antivirus suite from AVG (Anti-Virus Business Edition) is installed on the server, but the Firewall component was excluded during installation, so it should not interfere here

    As a hardware failure seems unlikely, I am trying to determine which software component could be responsible for messing up the network stack this badly. Is there any sane way to find out which drivers are in the network stack and thus possibly interfering? Has anybody seen anything similar before? Any ideas that might lead to a resolution are welcome.
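One rough way to get a first list of candidate drivers is to filter the output of the built-in `driverquery /v /fo csv` command for running drivers whose name or description hints at networking. The following is only a sketch under stated assumptions: the column names (`Module Name`, `Display Name`, `Description`, `State`) are assumed to match an English-language Windows installation, and the keyword list is a heuristic of my own, not an authoritative way to enumerate the network stack.

```python
import csv
import io
import subprocess

# Assumption: keyword heuristic only; a driver can sit in the network
# stack without any of these words appearing in its name or description.
NETWORK_HINTS = ("ndis", "tdi", "filter", "firewall", "network", "packet")


def parse_driverquery_csv(text):
    """Parse CSV text as produced by `driverquery /v /fo csv` into dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def running_network_drivers(rows, hints=NETWORK_HINTS):
    """Return module names of running drivers that match any keyword hint."""
    matches = []
    for row in rows:
        # Assumption: the verbose output has a "State" column.
        if row.get("State", "").strip().lower() != "running":
            continue
        haystack = " ".join(
            row.get(col, "")
            for col in ("Module Name", "Display Name", "Description")
        ).lower()
        if any(hint in haystack for hint in hints):
            matches.append(row.get("Module Name", ""))
    return matches


def query_drivers():
    """Run driverquery on the local Windows host and parse its output."""
    result = subprocess.run(
        ["driverquery", "/v", "/fo", "csv"],
        capture_output=True, text=True, check=True,
    )
    return parse_driverquery_csv(result.stdout)


if __name__ == "__main__":
    for name in running_network_drivers(query_drivers()):
        print(name)
```

The resulting names are only a starting point for comparison against a known-good machine or against the list of recently installed or updated software; a match does not mean the driver is at fault.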

    • Rex
      Rex about 11 years
      Have you tried a different physical switch (or at the very least, a different switch port)?
  • the-wabbit
    the-wabbit about 11 years
    Thank you for your suggestions. We already tried replacing the NIC (actually adding a different one) and the cable - I mentioned it in one of the bullet points of the question. And as this is a customer's install, I would refrain from uninstalling stuff she has installed unless there is any indication that it might be the culprit. Do you have any idea on how to check what kernel drivers would be able to interfere here?
  • Stephane
    Stephane about 11 years
    Sorry, I missed that specific bullet item. Short of uninstalling (or disabling in the registry, at least) each driver and testing, I don't see any obvious way to find which one is problematic. But honestly, I'm willing to bet that your problem is with the AV software.
  • Adrien
    Adrien almost 8 years
    Actually, WinGate doesn't hook at the TDI level. It has an NDIS LWF (Lightweight Filter), which is like an intermediate packet-level driver. Otherwise it's a socket application. All Windows sockets connections go down the stack via TDI on Windows 2008. With an incorrect configuration, it's possible to create connectivity issues with any proxy. I hope you were able to resolve your issue with our tech support. Disclaimer: I work for Qbik, the authors of WinGate.