Slow transfers in Jetty with chunked transfer encoding at certain buffer size

Solution 1

I believe I have found the answer myself, by looking through the Jetty source code. It turns out to be a complex interplay between the response buffer size, the size of the buffer passed to outStream.write, and whether or not outStream.flush is called (in some situations). The issue lies in how Jetty uses its internal response buffer: how the data you write is copied into that buffer, and when and how that buffer is flushed.

If the buffer passed to outStream.write is the same size as the response buffer (I think a multiple also works), or smaller with outStream.flush being used, then performance is fine: each write call is flushed straight to the output. However, when the write buffer is larger than the response buffer and not a multiple of it, this seems to confuse how the flushes are handled, causing extra flushes and therefore bad performance.

In the case of chunked transfer encoding, there's an extra wrinkle. For all but the first chunk, Jetty reserves 12 bytes of the response buffer for the chunk size. This means that in my original example with a 64KB write buffer and response buffer, the amount of data that actually fit in the response buffer was only 65524 bytes, so again parts of the write buffer were spilling over into extra flushes. A captured network trace of this scenario shows that the first chunk is 64KB, but all subsequent chunks are 65524 bytes. In this case, outStream.flush makes no difference.

When using a 4KB write buffer I was seeing fast speeds only when outStream.flush was called. It turns out that resp.setBufferSize will only increase the buffer size, and since the default size is 24KB, resp.setBufferSize(4096) is effectively a no-op. However, I was now writing 4KB pieces of data, which fit in the 24KB buffer even with the reserved 12 bytes, and each piece was then flushed out as a 4KB chunk by the outStream.flush call. When the call to flush is removed, the buffer is left to fill up, and because 24KB is an exact multiple of 4KB the reserved 12 bytes again cause data to spill over into the next chunk.
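
A quick way to see the setBufferSize behaviour is to read the size back with resp.getBufferSize() (a small sketch, to be placed inside doGet before any content is written; it assumes getBufferSize reflects the buffer Jetty actually uses):

    resp.setBufferSize(4096);
    // On Jetty 6.1.26 this still reports the 24KB default (24576), not 4096:
    // asking for a smaller buffer than the current one appears to be a no-op.
    System.out.println("effective response buffer: " + resp.getBufferSize());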

In conclusion

It seems that to get good performance with Jetty, you must either:

  • When calling setContentLength (no chunked transfer encoding), use a write buffer that is the same size as the response buffer.
  • When using chunked transfer encoding, use a write buffer that is at least 12 bytes smaller than the response buffer, and call flush after each write (see the sketch below).

Note that the performance of the "slow" scenario is still such that you'll likely only see the difference on localhost or over a very fast (1 Gbps or more) network connection.
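
A minimal sketch of the second option, using the same servlet skeleton and imports as the test servlet in the question (the 12-byte reservation is what I observed in Jetty 6.1.26, not a documented constant):

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        OutputStream outStream = resp.getOutputStream();
        // Keep each write at least 12 bytes smaller than the response buffer, so a
        // full write plus the reserved chunk-size header fits into one buffer.
        final int writeSize = resp.getBufferSize() - 12;
        byte[] buffer = new byte[writeSize];
        FileInputStream stream = null;
        try {
            stream = new FileInputStream(new File("test.data"));
            int bytesRead;
            while ((bytesRead = stream.read(buffer, 0, writeSize)) > 0) {
                outStream.write(buffer, 0, bytesRead);
                outStream.flush(); // send this piece out as a single chunk
            }
        } finally {
            if (stream != null)
                stream.close();
            outStream.close();
        }
    }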

I guess I should file issue reports against Hadoop and/or Jetty for this.

Solution 2

Yes, Jetty will default to Transfer-Encoding: chunked if the size of the response cannot be determined.

If you know what the size of the response is going to be, you need to call resp.setContentLength() with that size; for the roughly 128MB test file in the question that would be something like resp.setContentLength(135*1000*1000); instead of

resp.setBufferSize();

Actually, setting resp.setBufferSize is immaterial here.

You need to call setContentLength before opening the OutputStream, that is, before this line: OutputStream outStream = resp.getOutputStream();

Give it a spin and see if that works; these are my guesses from theory.
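
In other words, something along these lines (a sketch; using file.length() instead of a hard-coded number is my own suggestion, and note that setContentLength takes an int):

    File file = new File("test.data");
    // Declare the length before getOutputStream() so Jetty can send a
    // Content-Length header instead of falling back to chunked encoding.
    resp.setContentLength((int) file.length());
    OutputStream outStream = resp.getOutputStream();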

Comments

  • Sven, almost 2 years

    I'm investigating a performance problem with Jetty 6.1.26. Jetty appears to use Transfer-Encoding: chunked, and depending on the buffer size used, this can be very slow when transferring locally.

    I've created a small Jetty test application with a single servlet that demonstrates the issue.

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    
    import org.mortbay.jetty.Server;
    import org.mortbay.jetty.nio.SelectChannelConnector;
    import org.mortbay.jetty.servlet.Context;
    
    public class TestServlet extends HttpServlet {
    
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            // 64KB, used both as the response buffer size and as the write buffer size below
            final int bufferSize = 65536;
            resp.setBufferSize(bufferSize);
            OutputStream outStream = resp.getOutputStream();
    
            FileInputStream stream = null;
            try {
                stream = new FileInputStream(new File("test.data"));
                int bytesRead;
                byte[] buffer = new byte[bufferSize];
                while( (bytesRead = stream.read(buffer, 0, bufferSize)) > 0 ) {
                    outStream.write(buffer, 0, bytesRead);
                    outStream.flush();
                }
            } finally   {
                if( stream != null )
                    stream.close();
                outStream.close();
            }
        }
    
        // Minimal Jetty 6 setup: servlet mapped at /test, listening on port 8080.
        public static void main(String[] args) throws Exception {
            Server server = new Server();
            SelectChannelConnector ret = new SelectChannelConnector();
            ret.setLowResourceMaxIdleTime(10000);
            ret.setAcceptQueueSize(128);
            ret.setResolveNames(false);
            ret.setUseDirectBuffers(false);
            ret.setHost("0.0.0.0");
            ret.setPort(8080);
            server.addConnector(ret);
            Context context = new Context();
            context.setDisplayName("WebAppsContext");
            context.setContextPath("/");
            server.addHandler(context);
            context.addServlet(TestServlet.class, "/test");
            server.start();
        }
    
    }
    

    In my experiment, I'm using a 128MB test file that the servlet returns to the client, which connects using localhost. Downloading this data using a simple test client written in Java (using URLConnection) takes 3.8 seconds, which is very slow (yes, it's 33MB/s, which doesn't sound slow, except that this is purely local and the input file was cached; it should be much faster).

    Now here's where it gets strange. If I download the data with wget, which is an HTTP/1.0 client and therefore doesn't support chunked transfer encoding, it only takes 0.1 seconds. That's a much better figure.

    Now when I change bufferSize to 4096, the Java client takes 0.3 seconds.

    If I remove the call to resp.setBufferSize entirely (Jetty then appears to use a 24KB chunk size), the Java client now takes 7.1 seconds, and wget is suddenly equally slow!

    Please note I'm not in any way an expert with Jetty. I stumbled across this problem while diagnosing a performance problem in Hadoop 0.20.203.0 with reduce task shuffling, which transfers files using Jetty in a manner much like the reduced sample code, with a 64KB buffer size.

    The problem reproduces both on our Linux (Debian) servers and on my Windows machine, and with both Java 1.6 and 1.7, so it appears to depend solely on Jetty.

    Does anyone have any idea what could be causing this, and if there's something I can do about it?
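
    For reference, the test client is essentially just a loop reading from a URLConnection, along these lines (a minimal sketch):

    import java.io.InputStream;
    import java.net.URL;
    import java.net.URLConnection;

    public class TestClient {
        public static void main(String[] args) throws Exception {
            // Fetch the servlet response and time how long the download takes.
            URLConnection conn = new URL("http://localhost:8080/test").openConnection();
            InputStream in = conn.getInputStream();
            byte[] buffer = new byte[65536];
            long total = 0;
            long start = System.currentTimeMillis();
            int read;
            while ((read = in.read(buffer)) > 0) {
                total += read;
            }
            in.close();
            System.out.println(total + " bytes in "
                    + (System.currentTimeMillis() - start) + " ms");
        }
    }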

    • Thomas Jungblut, over 12 years
      +1. I have also observed this, but didn't really find a good solution.
  • Bob Kuhar, over 12 years
    On second thought...128m seems to be the default. So my suggestion isn't going to help. I'll be downvoted into oblivion. Oh the humanity.
  • Bob Kuhar, over 12 years
    These guys say that managing buffer size on your own may not be such a great idea: stackoverflow.com/questions/4638974/…
  • Sven, over 12 years
    Yet using the default buffer size on the HttpServletResponse actually gave me the worst performance.
  • Sven, over 12 years
    Thanks for your reply. resp.setContentLength without resp.setBufferSize is still slow. However, with resp.setContentLength both buffer sizes I tried (64KB and 4KB) are now fast. Also see the update to the question.
  • Sven, over 12 years
    Actually, it appears that the default buffer size is fast if and only if the size of the buffer passed to outStream.write is less than 24KB (the default response buffer size).
  • manocha_ak, over 12 years
    Forgot that I commented out the flush above in the case of no chunking (i.e. setting Content-Length).
  • Tim, over 12 years
    I doubt you'll find anyone on the Jetty team who is particularly responsive to a bug report against Jetty 6. But if the same issue exists in Jetty 7 or 8, a bug report would be highly appreciated.
  • 4ntoine, almost 9 years
    How can I make Jetty refuse to use Transfer-Encoding: chunked by default? Some clients are ancient and require a Content-Length header in the response.