IIS: How to tell if a slow time-taken is due to a slow network connection
I have a situation where a client will say, "I sent a request to your web server at 10:03:24 and it took 20 seconds, why?". I can see this in the IIS logs as well, but the server's ASP.NET module logged it as taking 100ms, and CPU and Disk counters were low.
I suspect that it's due to a slow network connection. How can I prove this?
Start by looking for packet drops between your client's browser and every source of images, scripts, and HTML for the page in question. If you find consistent packet drops, then you know for sure there is something in the network that needs to be fixed, even if it is just an overloaded link. Packet drops are not the only cause of a slow network, but in my experience they are the most common one. Other causes include a misconfigured proxy or cache engine. Sadly, I can't list every possible network culprit here.
However, people often blame the network when in fact the speed issues are well within their own control. Possible explanations:
- The HTML for the page is poorly written and loads required scripts in the wrong order, so the whole page renders slowly even though almost all resources are in place.
- The page is waiting for a resource that simply doesn't exist, and times out while waiting.
- A script is stuck in a slow loop that blocks for a while.
- A cache engine takes a long time to deliver an image.
- Your CGI is looking up something in a database, and the lookup itself is slow.
- You're using Google Analytics, which slows things down due to the way the page is written.
I could go on, but the point is you have to nail down the exact reason for why the page is slow yourself. A flawed network is possible; it is also possible that other factors are contributing to the slow performance.
To diagnose further:
- If the page loads well in Firefox, then the Network tab in Firebug is your friend (hit F12, then go to the Network tab and reload the page). Firebug gives you a nice waterfall diagram of how the page loads and where the delays are.
- If the page loads well in Chrome, you can do something similar (hit Ctrl+Shift+I, click on the Network tab, and reload the page).
- If the page is only supported in IE (btw, shame on your HTML developers), your best bet is to load each of the ASP page's elements individually with curl until you find something that looks way too slow, then find out why that particular element is slow.
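If you'd rather script that per-element timing than eyeball individual curl runs, here is a minimal sketch in Python using only the standard library. The function name `time_request` and the keys of the dictionary it returns are my own choices for illustration, not part of any existing tool. It splits one request into connect, time-to-first-byte, and download phases, which is roughly what the browser waterfall shows. It also accepts a method and body, so it can replay an HTTP POST.

```python
import http.client
import time
from urllib.parse import urlsplit


def time_request(url, method="GET", body=None, headers=None, timeout=30):
    """Return per-phase timings (in seconds) for a single HTTP request."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    t_start = time.perf_counter()
    conn.connect()                    # name lookup plus TCP (and TLS) handshake
    t_connected = time.perf_counter()
    conn.request(method, path, body=body, headers=headers or {})
    resp = conn.getresponse()         # returns once the status line and headers arrive
    t_first_byte = time.perf_counter()
    data = resp.read()                # pull down the whole response body
    t_done = time.perf_counter()
    conn.close()

    return {
        "status": resp.status,
        "bytes": len(data),
        "connect": t_connected - t_start,
        "first_byte": t_first_byte - t_connected,
        "download": t_done - t_first_byte,
        "total": t_done - t_start,
    }
```

Calling `time_request(url)` for each element and comparing the phases gives a rough hint about where the time goes: a large `first_byte` usually points at the server or something in front of it, while a large `download` on a modest response points more toward the path between the two hosts. Treat that as a hint, not proof.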
BTW, the Chrome and Firefox examples used a CGI query from Debian.org; this is a good example of a delay that comes from a CGI lookup.
When all else fails, you can get a .pcap from Wireshark and run it through tcptrace; however, while tcptrace is very good at analyzing packet dumps, there are no guarantees that you can isolate the issue with tcptrace alone. See this answer for information on using tcptrace diagnostics.
Jon
Updated on September 18, 2022
Comments
-
Jon over 1 year
According to http://support.microsoft.com/kb/944884, "when a large response or large responses are sent to a client over a slow network connection, the value of the time-taken field may be more than expected".
I have a situation where a client will say, "I sent a request to your web server at 10:03:24 and it took 20 seconds, why?". I can see this in the IIS logs as well, but the server's ASP.NET module logged it as taking 100ms, and CPU and Disk counters were low.
I suspect that it's due to a slow network connection. How can I prove this?
Update:
1) These are SOAP Web Service requests, therefore no embedded graphics, just an HTTP POST with a single XML page of results.
2) Also, I've reproduced this by throttling network speed on the client side and the symptoms are exactly the same.
3) The problem is intermittent, meaning the same request is normally fast for the client but occasionally slow. I can't reproduce this myself other than by throttling the network. The server's ASP.NET logging shows it always fast, but IIS logging shows it slow when the client says it's slow.
4) I only have access to the server, and need to provide as much information as possible to the client so they accept that the issue was not on the server and know what logging/tools to run on the client to find root cause.
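One server-only starting point, given the KB note above that time-taken can include network time, is to pull the slow entries out of the IIS log and work out the effective send rate for each. The sketch below is a rough illustration, not a diagnostic tool: it assumes W3C extended logging with the optional sc-bytes field enabled alongside time-taken (which IIS records in milliseconds), and the function names `parse_w3c_log` and `slow_requests` are made up for this example.

```python
def parse_w3c_log(lines):
    """Yield one dict per request, keyed by the names in the #Fields directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split()
            if len(values) == len(fields):   # skip truncated or malformed rows
                yield dict(zip(fields, values))


def slow_requests(lines, min_ms=5000):
    """Return requests whose time-taken is at least min_ms, with their send rate."""
    slow = []
    for rec in parse_w3c_log(lines):
        try:
            taken_ms = int(rec["time-taken"])
            sent = int(rec["sc-bytes"])
        except (KeyError, ValueError):
            continue                          # field not logged or not numeric
        if taken_ms >= min_ms:
            rate = sent / (taken_ms / 1000.0) if taken_ms else None
            slow.append({
                "when": (rec.get("date", "") + " " + rec.get("time", "")).strip(),
                "client": rec.get("c-ip", ""),
                "uri": rec.get("cs-uri-stem", ""),
                "time_taken_ms": taken_ms,
                "sc_bytes": sent,
                "bytes_per_sec": rate,
            })
    return slow
```

Passing an open log file to `slow_requests` gives a list that can be lined up against the ASP.NET timings for the same requests. A very low `bytes_per_sec` on a request that the application finished in 100ms is consistent with a slow path to that client, but on its own it does not say why the path was slow.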
-
David Schwartz almost 12 years
Are these requests normal page views that require fetching embedded graphics and so on? Or are they automated queries that return only a single page? Are we actually measuring the time to load a page or the time to respond to a single HTTP request?
-
Jon almost 12 years
Thanks, but it's not reproducible other than by throttling network speed, and a packet capture is too heavyweight to use in production.
-
Jon almost 12 years
See my updates above. While your info is very useful in the general case, I don't think it applies here. The page is only intermittently slow, and the symptoms are only reproducible when I throttle the network at the client side.
-
Mike Pennington almost 12 years
The waterfall charts in Firefox / Chrome support HTTP POST operations, as does curl... I am not sure how you concluded that the info doesn't apply, but it seems you haven't fully applied the tools to the problem domain.
-
Jon almost 12 years
Firefox/Chrome are client-side tools. I only have access to the server, and I can't repro using my own client. I need to tell, from the server only, if a particular request was slow due to network issues. That leaves packet capturing, but that is too heavy to leave on in production (consider that 1 in 10,000 requests might be slow).
-
Mike Pennington almost 12 years
As a network engineer with over 15 years under my belt, may I respectfully suggest that you cannot diagnose a client-side HTTP services problem from the server alone; you simply don't have enough information (which is apparently your conclusion too... however, you don't seem to be open to living with this reality :-).
-
Jon almost 12 years
If packet capturing at the server can diagnose network issues (e.g. via seeing a slow TCP ACK), is it not reasonable to expect that a lighter-weight tool/logger could show the same?
-
Mike Pennington almost 12 years
How is this imaginary tool supposed to form conclusions about why that ACK is slow? There are multiple possible causes, for instance: 1) network throttling 2) host CPU overloaded 3) host swapping to disk 4) foobar'd TCP implementation on host 5) foobar'd HTTP proxy in between
-
Jon almost 12 years
Just the information that the ACK was slow would be useful to give to the client. At the moment I can't even provide that.
-
Mike Pennington almost 12 years
Are you assuming the ACK is slow, or investigating whether an ACK is slow? If it is the former, please save yourself heartache and do not assume you know the root cause before you find it. As for diagnosing pcap files from Wireshark... the best "tool" out there is tcptrace. However, it's not guaranteed to show you the root cause for any random pcap file.
-
Jon almost 12 years
I'm trying to determine how to prove that occasional slowness is not due to issues at the server, and to provide as much information as possible to the client so they accept this and know what logging/tools to run on their end to find root cause.
-
Jon almost 9 years
You could improve this answer by answering "how to tell". w3wp.exe going to sleep is not relevant in my case as I've disabled that behavior, but this could help others.