IIS: How to tell if a slow time-taken is due to a slow network connection


I have a situation where a client will say, "I sent a request to your web server at 10:03:24 and it took 20 seconds, why?". I can see this in the IIS logs as well, but the server's ASP.NET module logged it as taking 100ms, and CPU and Disk counters were low.

I suspect that it's due to a slow network connection. How can I prove this?

It starts with looking for packet drops between your client's browser and all the sources of images / scripts / HTML for the aforementioned web page. If you find consistent packet drops, then you know for sure there is something in the network that needs to be fixed, even if it is just an overloaded link. Packet drops are not the only reason for a slow network, but they are the most common cause in my experience. Other causes could be a misconfigured proxy or cache engine. Sadly, I can't list all possible network culprits here.

However, people often blame the network when, in fact, the speed issues are well within their own control. Possible explanations:

  • The HTML for the page is poorly written and loads required scripts in the wrong order, so the whole page renders slowly even though almost all resources are in place.
  • The page is waiting for a resource that simply doesn't exist and times out while waiting.
  • A script is stuck in a slow loop that blocks for a while.
  • A cache engine takes a long time to deliver an image.
  • Your CGI is looking up something in a database, and the lookup itself is slow.
  • You're using Google Analytics, which slows things down due to the way the page is written.

I could go on, but the point is that you have to nail down the exact reason the page is slow yourself. A flawed network is possible; it is also possible that other factors are contributing to the slow performance.

To diagnose further:

  • If the page loads well in Firefox, then the Network tab in Firebug is your friend (hit F12, then go to the Network tab and reload the page). Firebug gives you a nice waterfall diagram showing how the page loads and where the delays are. [Image: Firebug waterfall]
  • If the page loads well in Chrome, you can do something similar (hit Ctrl+Shift+I, click on the Network tab, and reload the page). [Image: Chrome]
  • If the page is only supported in IE (btw, shame on your HTML developers), your best bet is to start loading each of these ASP page elements individually with curl until you find something that looks way too slow, then find out why that particular element is slow.
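The "load each element individually and look for the slow one" step can be sketched in a few lines. This is only an illustration that swaps Python's standard `urllib` in for curl; the helper names (`fetch`, `time_fetches`, `slowest_first`) are made up for this sketch, and you would pass in the real URLs of the page elements yourself.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable, List, Tuple


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download one resource and return its body (stand-in for a curl call)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def time_fetches(urls: Iterable[str],
                 fetcher: Callable[[str], object] = fetch) -> Dict[str, float]:
    """Fetch each URL once and record the wall-clock seconds each one took."""
    timings: Dict[str, float] = {}
    for url in urls:
        start = time.perf_counter()
        try:
            fetcher(url)
        except Exception as exc:
            # A failure or timeout is still worth timing: a resource that
            # does not exist can be exactly what the page is waiting on.
            print(f"{url}: failed ({exc})")
        timings[url] = time.perf_counter() - start
    return timings


def slowest_first(timings: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort the (url, seconds) pairs so the slowest element comes first."""
    return sorted(timings.items(), key=lambda item: item[1], reverse=True)
```

Calling `slowest_first(time_fetches(list_of_urls))` gives a crude, text-only version of the waterfall view. Unlike the browser tools, it fetches the elements one at a time, so it says nothing about load order or blocking.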

BTW, the Chrome and Firefox examples used a CGI query from Debian.org; this is a good example of a delay that comes from a CGI lookup.

When all else fails, you can get a .pcap from Wireshark and run it through tcptrace; however, while tcptrace is very good at analyzing packet dumps, there are no guarantees that you can isolate the issue with tcptrace alone. See this answer for information on using tcptrace diagnostics.
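For the server-only situation described in the question, one lightweight starting point is to pull the slow entries out of the IIS log so they can be set beside the application's own timings. The sketch below assumes W3C extended log format with a `#Fields:` header and the `time-taken` field (milliseconds) enabled; `sc-bytes` is an optional field that may not be logged on your server. The sample lines, the threshold, and the function names are invented for illustration. A large `time-taken` paired with a fast application timing is consistent with a slow client connection, but it does not prove one.

```python
from typing import Dict, Iterable, Iterator, List, Optional

# Hypothetical sample data; a real log would be read from the IIS log file.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services
#Fields: date time cs-method cs-uri-stem sc-status sc-bytes time-taken
2012-09-18 10:03:24 POST /service.asmx 200 400000 20000
2012-09-18 10:03:30 POST /service.asmx 200 400000 110
"""


def parse_w3c_log(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per request, keyed by the names in the #Fields line."""
    fields: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives such as #Software or #Date
        values = line.split()
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


def slow_requests(lines: Iterable[str],
                  threshold_ms: int = 5000) -> List[Dict[str, str]]:
    """Return the entries whose time-taken is at or above threshold_ms."""
    return [entry for entry in parse_w3c_log(lines)
            if entry.get("time-taken", "").isdigit()
            and int(entry["time-taken"]) >= threshold_ms]


def effective_bytes_per_second(entry: Dict[str, str]) -> Optional[float]:
    """Response bytes divided by time-taken, or None if either is missing."""
    sent = entry.get("sc-bytes", "-")
    taken = entry.get("time-taken", "-")
    if not (sent.isdigit() and taken.isdigit()) or int(taken) == 0:
        return None
    return int(sent) / (int(taken) / 1000.0)


for entry in slow_requests(SAMPLE_LOG.splitlines()):
    print(entry["date"], entry["time"], entry["cs-uri-stem"],
          entry["time-taken"], effective_bytes_per_second(entry))
```

The date, time, and URL of each flagged entry are what you would look up in the ASP.NET logging. The bytes-per-second figure is only a rough number to hand to the client, since `time-taken` can be inflated by causes other than link speed.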

Author: Jon

Updated on September 18, 2022

Comments

  • Jon
    Jon over 1 year

    According to http://support.microsoft.com/kb/944884, "when a large response or large responses are sent to a client over a slow network connection, the value of the time-taken field may be more than expected".

    I have a situation where a client will say, "I sent a request to your web server at 10:03:24 and it took 20 seconds, why?". I can see this in the IIS logs as well, but the server's ASP.NET module logged it as taking 100ms, and CPU and Disk counters were low.

    I suspect that it's due to a slow network connection. How can I prove this?

    Update:

    1) These are SOAP Web Service requests, therefore no embedded graphics, just an HTTP POST with a single XML page of results.

    2) Also, I've reproduced this by throttling network speed on the client side and the symptoms are exactly the same.

    3) The problem is intermittent, meaning the same request is normally fast for the client but occasionally slow. I can't reproduce this myself other than by throttling the network. The server's ASP.NET logging shows it always fast, but IIS logging shows it slow when the client says it's slow.

    4) I only have access to the server, and need to provide as much information as possible to the client so they accept that the issue was not on the server and know what logging/tools to run on the client to find root cause.

    • David Schwartz
      David Schwartz almost 12 years
      Are these requests normal page views that require fetching embedding graphics and so on? Or are they automated queries that return only a single page? Are we actually measuring the time to load a page or the time to respond to a single HTTP request?
  • Jon
    Jon almost 12 years
    Thanks, but it's not reproducible other than by throttling network speed, and a packet capture is too heavy-weight to use in production.
  • Jon
    Jon almost 12 years
    See my updates above. While your info is very useful in the general case, I don't think it applies here. The page is only intermittently slow, and the symptoms are only reproducible when I throttle the network at the client side.
  • Mike Pennington
    Mike Pennington almost 12 years
    The waterfall charts in Firefox / Chrome support HTTP POST operations, as does curl... I am not sure how you concluded that the info doesn't apply, but it would seem that it doesn't involve a full application of the tools against the problem domain.
  • Jon
    Jon almost 12 years
    Firefox/chrome are client-side tools. I only have access to the server, and I can't repro using my own client. I need to tell, from the server only, if a particular request was slow due to network issues. That leaves packet capturing, but that is too heavy to leave on in production (consider 1 in 10,000 requests might be slow).
  • Mike Pennington
    Mike Pennington almost 12 years
    As a network engineer with over 15 years under my belt, may I respectfully suggest that you cannot diagnose a client-side HTTP services problem from the server alone; you simply don't have enough information (which is apparently your conclusion too... however, you don't seem to be open to living with this reality :-).
  • Jon
    Jon almost 12 years
    If packet capturing at the server can diagnose network issues (e.g. via seeing a slow TCP ACK), is it not reasonable to expect that a lighter-weight tool/logger could show the same?
  • Mike Pennington
    Mike Pennington almost 12 years
    How is this imaginary tool supposed to form conclusions about why that ACK is slow? There are multiple possible causes, for instance: 1) network throttling 2) Host CPU overloaded 3) Host swapping to disk 4) Foobar'd TCP implementation on host 5) Foobar'd HTTP proxy in between
  • Jon
    Jon almost 12 years
    Just the information that ack was slow would be useful to give to the client. At the moment I can't even provide that.
  • Mike Pennington
    Mike Pennington almost 12 years
    Are you assuming the ACK is slow, or investigating whether an ACK is slow? If it is the former, please save yourself heartache and do not assume you know the root cause before you find it. As for diagnosing pcap files from Wireshark... the best "tool" out there is tcptrace. However, it's not guaranteed to show you the root cause for any random pcap file.
  • Jon
    Jon almost 12 years
    I'm trying to determine how to prove that occasional slowness is not due to issues at the server, and to provide as much information as possible to the client so they accept this and know what logging/tools to run on their end to find root cause.
  • Jon
    Jon almost 9 years
    You could improve this answer by answering "how to tell". w3wp.exe going to sleep is not relevant in my case as I've disabled that behavior, but this could help others.