Can multiple threads write data into a file at the same time?

Solution 1

You can have multiple threads writing to one file, e.g. a log file, but you have to coordinate your threads, as @Thilo points out. Either you synchronize file access and only write whole records/lines, or you need a strategy for allocating regions of the file to different threads, e.g. rebuilding a file with known offsets and sizes.
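A minimal sketch of the first strategy (synchronized access, whole lines only). The class name `LineLog` and its methods are made up for illustration; the point is only that every thread goes through one synchronized method that writes a complete record.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: all threads share one instance and one open writer.
class LineLog implements AutoCloseable {
    private final BufferedWriter writer;

    LineLog(Path file) throws IOException {
        this.writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
    }

    // synchronized: a whole record is written before another thread may start,
    // so lines from different threads never interleave.
    synchronized void writeLine(String record) throws IOException {
        writer.write(record);
        writer.newLine();
    }

    @Override
    public synchronized void close() throws IOException {
        writer.close();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("linelog", ".txt");
        try (LineLog log = new LineLog(file)) {
            Thread[] workers = new Thread[4];
            for (int t = 0; t < workers.length; t++) {
                int id = t;
                workers[t] = new Thread(() -> {
                    try {
                        for (int i = 0; i < 100; i++) {
                            log.writeLine("thread-" + id + " record-" + i);
                        }
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
                workers[t].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
        }
        System.out.println(Files.readAllLines(file).size() + " complete lines written");
    }
}
```

The order of the lines depends on thread scheduling; only the integrity of each line is guaranteed.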

This is rarely done for performance reasons, as most disk subsystems perform best when written to sequentially, and disk IO is usually the bottleneck. If the CPU time needed to create the record or line of text (or network IO) is the bottleneck, it can help.

Imagine that you want to dump a big database table to a file, and how to make this job faster?

Writing it sequentially is likely to be the fastest.

Solution 2

Java's NIO package was designed to allow this. Take a look, for example, at http://docs.oracle.com/javase/1.5.0/docs/api/java/nio/channels/FileChannel.html .

You can map several regions of one file to different buffers; each buffer can then be filled by a separate thread.
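A rough sketch of that idea, using `FileChannel.map` to give each thread its own `MappedByteBuffer` over a disjoint region. The class `MappedRegions`, the region sizes, and the fill pattern are made up for illustration, and the sketch assumes both `regionCount` and `regionSize` are positive.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: region i of the file is filled with the byte value i.
class MappedRegions {
    static void fill(Path file, int regionCount, int regionSize)
            throws IOException, InterruptedException {
        long total = (long) regionCount * regionSize;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Grow the file to its final size by writing one byte at the last position.
            channel.write(ByteBuffer.wrap(new byte[1]), total - 1);
            Thread[] workers = new Thread[regionCount];
            for (int i = 0; i < regionCount; i++) {
                // Each thread gets its own buffer over a disjoint region, so no locking is needed.
                MappedByteBuffer region = channel.map(
                        FileChannel.MapMode.READ_WRITE, (long) i * regionSize, regionSize);
                byte value = (byte) i;
                workers[i] = new Thread(() -> {
                    while (region.hasRemaining()) {
                        region.put(value);
                    }
                    region.force(); // ask the OS to write this region back to the file
                });
                workers[i].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("mapped", ".bin");
        fill(file, 4, 1024);
        System.out.println("file size: " + Files.size(file));
    }
}
```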

Solution 3

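Declaring the writing method synchronized makes this possible: only one thread at a time can execute it (this coordinates threads within one JVM only, not separate processes). Try the code below, which I use in a similar context.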

package hrblib;

import java.io.*;

public class FileOp {

    static int nStatsCount = 0;

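    static public String getContents(String sFileName) {

        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader oReader = new BufferedReader(new FileReader(sFileName))) {
            StringBuilder sContent = new StringBuilder();
            String sLine;
            boolean bFirst = true;
            while ((sLine = oReader.readLine()) != null) {
                // Join lines with CRLF; the original compared strings with ==, which is unreliable
                if (!bFirst) {
                    sContent.append("\r\n");
                }
                sContent.append(sLine);
                bFirst = false;
            }
            return sContent.toString();
        }
        catch (IOException oException) {
            throw new IllegalArgumentException("Invalid file path/File cannot be read: \n" + sFileName, oException);
        }
    }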
    // Note: not synchronized, so concurrent callers can overwrite each other's output
    static public void setContents(String sFileName, String sContent) {
        try {
            File oFile = new File(sFileName);
            if (!oFile.exists()) {
                oFile.createNewFile();
            }
            if (oFile.canWrite()) {
                // try-with-resources flushes and closes the writer even if write() throws
                try (BufferedWriter oWriter = new BufferedWriter(new FileWriter(sFileName))) {
                    oWriter.write(sContent);
                }
            }
        }
        catch (IOException oException) {
            throw new IllegalArgumentException("Invalid folder path/File cannot be written: \n" + sFileName, oException);
        }
    }
    // synchronized on the class: only one thread at a time can append
    public static synchronized void appendContents(String sFileName, String sContent) {
        try {
            File oFile = new File(sFileName);
            if (!oFile.exists()) {
                oFile.createNewFile();
            }
            if (oFile.canWrite()) {
                // second FileWriter argument 'true' opens the file in append mode
                try (BufferedWriter oWriter = new BufferedWriter(new FileWriter(sFileName, true))) {
                    oWriter.write(sContent);
                }
            }
        }
        catch (IOException oException) {
            throw new IllegalArgumentException("Error appending/File cannot be written: \n" + sFileName, oException);
        }
    }
}

Solution 4

You can have multiple threads write to the same file - but one at a time. All threads will need to enter a synchronized block before writing to the file.

In the P2P example, one way to implement it is to find the size of the file and create an empty file of that size. Each thread downloads a different section of the file; when it needs to write, it enters a synchronized block, moves the file pointer using seek, and writes the contents of its buffer.
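The seek-then-write approach could look roughly like this. `SegmentedFile` and `writeAt` are hypothetical names, and the three short strings stand in for downloaded sections; this is a sketch under those assumptions, not how any particular P2P client does it.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper: one shared file handle used by several downloader threads.
class SegmentedFile implements AutoCloseable {
    private final RandomAccessFile file;

    SegmentedFile(File target, long totalSize) throws IOException {
        file = new RandomAccessFile(target, "rw");
        file.setLength(totalSize); // create the file at its final size up front
    }

    // The file pointer is shared, so seek() and write() must run as one unit;
    // synchronized keeps another thread from moving the pointer in between.
    synchronized void writeAt(long offset, byte[] data) throws IOException {
        file.seek(offset);
        file.write(data);
    }

    @Override
    public synchronized void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws Exception {
        File target = File.createTempFile("segments", ".bin");
        String[] parts = {"AAAA", "BBBB", "CCCC"}; // stand-ins for downloaded sections
        try (SegmentedFile out = new SegmentedFile(target, 12)) {
            Thread[] workers = new Thread[parts.length];
            for (int i = 0; i < parts.length; i++) {
                long offset = i * 4L;
                byte[] data = parts[i].getBytes(StandardCharsets.US_ASCII);
                workers[i] = new Thread(() -> {
                    try {
                        out.writeAt(offset, data);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
                workers[i].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
        }
        System.out.println(new String(Files.readAllBytes(target.toPath()), StandardCharsets.US_ASCII));
    }
}
```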

Solution 5

What kind of file is this? Why do you need to feed it from several threads? It depends on the characteristics (I don't know a better word for it) of the file usage.

Transferring a file from several places over network (short: Torrent-like)

If you are transferring an existing file:

  • as soon as the program learns the size of the file, it should create the file with empty content: this prevents a later out-of-disk error (if there is not enough space, it will turn out at creation, before anything is downloaded), and it also helps performance;
  • if you organize the transfer well (and why not), each thread will be responsible for a distinct portion of the file, so the file writes will not overlap;
  • even if two threads somehow pick the same portion of the file, it will cause no error, because they write the same data to the same file positions.
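One possible way to sketch the points above without explicit locking is `FileChannel.write(ByteBuffer, long)`, which writes at a given position and leaves the channel's own position untouched, so threads with distinct offsets do not interfere. `ChunkWriter` and its parameters are hypothetical. Also note, as a caveat, that on some filesystems `setLength` may produce a sparse file, so the early out-of-space check is not guaranteed everywhere.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper: chunk i is written at offset i * chunkSize by its own thread.
class ChunkWriter {
    static void writeChunks(File target, byte[][] chunks, int chunkSize)
            throws IOException, InterruptedException {
        try (RandomAccessFile raf = new RandomAccessFile(target, "rw");
             FileChannel channel = raf.getChannel()) {
            raf.setLength((long) chunks.length * chunkSize); // set the final size first
            Thread[] workers = new Thread[chunks.length];
            for (int i = 0; i < chunks.length; i++) {
                long offset = (long) i * chunkSize;
                ByteBuffer data = ByteBuffer.wrap(chunks[i]);
                workers[i] = new Thread(() -> {
                    try {
                        long pos = offset;
                        // Positional write: does not use or change the channel's position
                        while (data.hasRemaining()) {
                            pos += channel.write(data, pos);
                        }
                    } catch (IOException e) {
                        // A real program would report the failure back to the caller
                        throw new UncheckedIOException(e);
                    }
                });
                workers[i].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        File target = File.createTempFile("chunks", ".bin");
        byte[][] chunks = {
            "AAAA".getBytes(StandardCharsets.US_ASCII),
            "BBBB".getBytes(StandardCharsets.US_ASCII),
            "CCCC".getBytes(StandardCharsets.US_ASCII)
        };
        writeChunks(target, chunks, 4);
        System.out.println(new String(Files.readAllBytes(target.toPath()), StandardCharsets.US_ASCII));
    }
}
```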

Appending data blocks to a file (short: logging)

If the threads just append fixed- or variable-length info to a file, you should use a common writer thread. It should use a relatively large write buffer, so it can serve client threads quickly (just taking the strings) and flush the buffer out with optimal scheduling and block size. It should use a dedicated disk or even a dedicated computer.
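A minimal sketch of such a common writer thread, assuming a hypothetical `QueueLogger` class: client threads only put strings on a queue, and a single thread owns the file and its buffer. The 64 KiB buffer size is an arbitrary choice for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical logger: many producers, exactly one thread touching the file.
class QueueLogger implements AutoCloseable {
    // Unique sentinel object, compared by identity, that tells the writer to stop
    private static final String STOP = new String("STOP");
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread writerThread;

    QueueLogger(Path file) {
        writerThread = new Thread(() -> {
            try (BufferedWriter out = new BufferedWriter(
                    new OutputStreamWriter(Files.newOutputStream(file), StandardCharsets.UTF_8),
                    64 * 1024)) {
                String line;
                while ((line = queue.take()) != STOP) {
                    out.write(line);
                    out.newLine();
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writerThread.start();
    }

    // Called by client threads: only enqueues, never touches the file
    void log(String line) {
        queue.add(line);
    }

    // Lets the writer drain everything queued so far, then waits for it to finish
    @Override
    public void close() {
        queue.add(STOP);
        try {
            writerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("queue-log", ".txt");
        try (QueueLogger logger = new QueueLogger(file)) {
            Thread[] clients = new Thread[3];
            for (int t = 0; t < clients.length; t++) {
                int id = t;
                clients[t] = new Thread(() -> {
                    for (int i = 0; i < 5; i++) {
                        logger.log("client-" + id + " message-" + i);
                    }
                });
                clients[t].start();
            }
            for (Thread client : clients) {
                client.join();
            }
        }
        System.out.println(Files.readAllLines(file).size() + " lines logged");
    }
}
```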

Also, there can be several performance issues; that's why there are logging servers around, even expensive commercial ones.

Reading and writing random time, random position (short: database)

It requires a complex design, with mutexes etc. I have never done this kind of thing, but I can imagine. Ask Oracle for some tricks :)

Author by

CaiNiaoCoder

I am a Java developer. I live in China. I like playing football. My English is not good.

Updated on February 24, 2020

Comments

  • CaiNiaoCoder about 4 years

    If you have ever used P2P download software, you know it can download a file with multiple threads while creating only one file. So I wonder how the threads write data into that file: sequentially or in parallel?

    Imagine that you want to dump a big database table to a file, and how to make this job faster?