At what point does wrapping a FileOutputStream with a BufferedOutputStream make sense, in terms of performance?

BufferedOutputStream helps when the writes are smaller than the buffer size, e.g. 8 KB. For larger writes it doesn't help, nor does it make things much worse. If ALL your writes are larger than the buffer size, or you always flush() after every write, I would not use a buffer. However, if a good portion of your writes are smaller than the buffer size and you don't flush() every time, it's worth having.
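
One way to see this rule of thumb in action is to count how many writes actually reach the underlying stream. The sketch below is only an illustration: CountingOutputStream and countUnderlyingWrites are hypothetical helpers written for this example, and an in-memory sink stands in for the FileOutputStream so the numbers depend only on the buffering logic, not on the disk.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferingDemo {

    /** Hypothetical helper: counts the write calls that reach the wrapped stream. */
    static final class CountingOutputStream extends OutputStream {
        private final OutputStream delegate;
        int writeCalls = 0;

        CountingOutputStream(OutputStream delegate) {
            this.delegate = delegate;
        }

        @Override
        public void write(int b) throws IOException {
            writeCalls++;
            delegate.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            writeCalls++;
            delegate.write(b, off, len);
        }
    }

    /**
     * Writes `chunks` arrays of `chunkSize` bytes through a BufferedOutputStream
     * of `bufferSize` bytes and returns how many writes reached the sink.
     */
    static int countUnderlyingWrites(int bufferSize, int chunkSize, int chunks,
                                     boolean flushEachWrite) throws IOException {
        CountingOutputStream sink = new CountingOutputStream(new ByteArrayOutputStream());
        byte[] chunk = new byte[chunkSize];
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < chunks; i++) {
                out.write(chunk);
                if (flushEachWrite) {
                    out.flush(); // forces the buffered bytes down immediately
                }
            }
        } // close() flushes whatever is still buffered
        return sink.writeCalls;
    }

    public static void main(String[] args) throws IOException {
        // 1000 small writes, no explicit flush: coalesced into a few large writes
        System.out.println("small, buffered:      " + countUnderlyingWrites(8192, 100, 1000, false));
        // Same writes, but flushing every time: the buffer buys nothing
        System.out.println("small, flush each:    " + countUnderlyingWrites(8192, 100, 1000, true));
        // Writes larger than the buffer: each one is passed straight through
        System.out.println("large (16 KB) writes: " + countUnderlyingWrites(8192, 16 * 1024, 10, false));
    }
}
```

In the first case far fewer writes reach the sink than were issued. In the other two the count matches the number of calls, which is why the buffer is not worth having there.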

You may find that increasing the buffer size to 32 KB or larger gives you a marginal improvement, or makes things worse. YMMV.
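
Since the best size depends on your data and hardware, the practical approach is to measure. Below is a rough timing sketch, not a proper benchmark: there is no JVM warm-up and the OS cache will influence the numbers. The class name, the timeWrites helper, the 300-byte chunk, and the candidate sizes are all assumptions chosen for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferSizeTiming {

    /**
     * Writes `count` copies of `chunk` to `file` through a BufferedOutputStream
     * of `bufferSize` bytes and returns the elapsed time in nanoseconds.
     */
    static long timeWrites(Path file, int bufferSize, byte[] chunk, int count) throws IOException {
        long start = System.nanoTime();
        try (OutputStream out =
                 new BufferedOutputStream(new FileOutputStream(file.toFile()), bufferSize)) {
            for (int i = 0; i < count; i++) {
                out.write(chunk);
            }
        } // close() flushes the remaining bytes and closes the file
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        byte[] chunk = new byte[300]; // assumed "typical" small payload
        int[] candidateSizes = {512, 8 * 1024, 32 * 1024, 128 * 1024};
        for (int size : candidateSizes) {
            Path file = Files.createTempFile("bufsize-", ".bin");
            try {
                long nanos = timeWrites(file, size, chunk, 100_000);
                System.out.printf("buffer %7d bytes: %d ms%n", size, nanos / 1_000_000);
            } finally {
                Files.deleteIfExists(file);
            }
        }
    }
}
```

Run it several times with data that resembles your real workload before drawing conclusions; a single run can easily be misleading.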


You might find the code for BufferedOutputStream.write useful:

/**
 * Writes <code>len</code> bytes from the specified byte array
 * starting at offset <code>off</code> to this buffered output stream.
 *
 * <p> Ordinarily this method stores bytes from the given array into this
 * stream's buffer, flushing the buffer to the underlying output stream as
 * needed.  If the requested length is at least as large as this stream's
 * buffer, however, then this method will flush the buffer and write the
 * bytes directly to the underlying output stream.  Thus redundant
 * <code>BufferedOutputStream</code>s will not copy data unnecessarily.
 *
 * @param      b     the data.
 * @param      off   the start offset in the data.
 * @param      len   the number of bytes to write.
 * @exception  IOException  if an I/O error occurs.
 */
public synchronized void write(byte b[], int off, int len) throws IOException {
    if (len >= buf.length) {
        /* If the request length exceeds the size of the output buffer,
           flush the output buffer and then write the data directly.
           In this way buffered streams will cascade harmlessly. */
        flushBuffer();
        out.write(b, off, len);
        return;
    }
    if (len > buf.length - count) {
        flushBuffer();
    }
    System.arraycopy(b, off, buf, count, len);
    count += len;
}
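
To make the bypass behaviour of the method above concrete, here is a small sketch that records the size of every write that reaches the underlying stream. RecordingOutputStream and observedWriteSizes are hypothetical names invented for this example; the sink simply discards the data.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

public class BypassDemo {

    /** Hypothetical sink that records the length of each write it receives. */
    static final class RecordingOutputStream extends OutputStream {
        final List<Integer> writeSizes = new ArrayList<>();

        @Override
        public void write(int b) {
            writeSizes.add(1);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeSizes.add(len);
        }
    }

    /** Issues one write per requested size and returns the sizes seen by the sink. */
    static List<Integer> observedWriteSizes(int bufferSize, int... requestSizes) throws IOException {
        RecordingOutputStream sink = new RecordingOutputStream();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int size : requestSizes) {
                out.write(new byte[size]);
            }
        } // close() flushes anything still sitting in the buffer
        return sink.writeSizes;
    }

    public static void main(String[] args) throws IOException {
        // 100 bytes are buffered; the 10000-byte write first flushes them and then
        // goes straight through; 200 + 300 bytes are buffered and flushed on close.
        System.out.println(observedWriteSizes(8192, 100, 10000, 200, 300));
        // prints [100, 10000, 500]
    }
}
```

The pending 100 bytes are flushed before the large array is written directly, so ordering is preserved and the large array is never copied into the buffer.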
Author by

Thomas Owens

Updated on January 31, 2020

Comments

  • Thomas Owens
    Thomas Owens over 4 years

    I have a module that is responsible for reading, processing, and writing bytes to disk. The bytes come in over UDP and, after the individual datagrams are assembled, the final byte array that gets processed and written to disk is typically between 200 bytes and 500,000 bytes. Occasionally, there will be byte arrays that, after assembly, are over 500,000 bytes, but these are relatively rare.

    I'm currently using the FileOutputStream's write(byte[]) method. I'm also experimenting with wrapping the FileOutputStream in a BufferedOutputStream, including using the constructor that accepts a buffer size as a parameter.

    It appears that using the BufferedOutputStream is tending toward slightly better performance, but I've only just begun to experiment with different buffer sizes. I only have a limited set of sample data to work with (two data sets from sample runs that I can pipe through my application). Is there a general rule-of-thumb that I might be able to apply to try to calculate the optimal buffer sizes to reduce disk writes and maximize the performance of the disk writing given the information that I know about the data I'm writing?

  • Thomas Owens
    Thomas Owens over 12 years
    Something I haven't found yet - what is the default buffer size of the BufferedOutputStream in Java 6? You mention 8KB - is that the default in Java? The Javadocs for 1.4.2 say the buffer is 512 bytes, meaning most of what I write tends to fall between 200 and 400 bytes per array. However, this information is removed from the Java 6 documentation.
  • gustafc
    gustafc over 12 years
    @Thomas - looking at the source code, the default size is 8192. I'd assume they removed the default size specification to be able to change it when a new "most sensible default" appears. If having a specific buffer size is important, you'll probably want to specify it explicitly.
  • Thomas Owens
    Thomas Owens over 12 years
    @gustafc Thanks. I always forget that I can look at the Java source code.
  • Thomas Owens
    Thomas Owens over 12 years
    My other question is whether a write that is larger than the buffer size performs worse than a non-buffered write. I can't think of a reason why it would be significantly worse, although the further it exceeds the buffer size, the more times the buffer gets filled, written, and filled again. So I might need to experiment with that as well.
  • GOTO 0
    GOTO 0 almost 7 years
    I ran similar tests and I can confirm that using a BufferedOutputStream makes writing files not faster but slower, most likely because the data being written is already cached at multiple levels on its way from the JVM through the OS to the physical medium.
  • Dev Amitabh
    Dev Amitabh almost 7 years
    @GOTO Thanks for confirming. Are there any resources you're aware of that could help me dig deeper into how I/O and internal caches work?
  • GOTO 0
    GOTO 0 almost 7 years
    Not really. If it helps googling, the file caching components are called Cache Manager in Windows and Page Cache in Linux. Hard disks and other storage devices also come with different sorts of I/O caches (though the basics are probably the same).