Writing a file using multiple threads

19,376

Solution 1

I am trying to write a single huge file in Java using multiple threads.

I would recommend that you have X threads reading from the database and a single thread writing to your output file. This is going to be much easier to implement as opposed to doing file locking and the like.

You could use a shared BlockingQueue (maybe an ArrayBlockingQueue): the database readers would put(...) rows onto the queue, and your writer would sit in a take() loop on it. On a bounded queue, put(...) blocks while the queue is full, whereas add(...) throws IllegalStateException. When the readers finish, each could add a special IM_DONE string constant; as soon as the writing thread has seen X of these constants (i.e. one for each reader), it would close the output file and exit.

That way you can use a single BufferedWriter without any locks. Chances are that you will be blocked by the database calls rather than by the local IO, so the extra thread is unlikely to slow you down.
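The reader/writer split described above might be sketched as follows. This is a minimal illustration, not a drop-in implementation: the class name `SingleWriterSketch`, the queue capacity of 1024, and the generated `reader-N,row-M` strings (standing in for rows read from the database) are all assumptions made for the example.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class SingleWriterSketch {
    // Sentinel each reader enqueues when it is done. A distinct String instance is
    // compared by identity, so a data row that happens to read "IM_DONE" is not mistaken for it.
    static final String IM_DONE = new String("IM_DONE");

    // Starts readerCount producer threads and one writer thread; returns once the file is complete.
    static void export(Path out, int readerCount, int rowsPerReader) throws Exception {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024); // capacity is an arbitrary choice

        Thread writer = new Thread(() -> {
            int finished = 0;
            // Only this thread touches the BufferedWriter, so no locking is needed.
            try (BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
                while (finished < readerCount) {
                    String line = queue.take(); // blocks while the queue is empty
                    if (line == IM_DONE) {
                        finished++;
                        continue;
                    }
                    w.write(line);
                    w.newLine();
                }
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        List<Thread> readers = new ArrayList<>();
        for (int r = 0; r < readerCount; r++) {
            final int id = r;
            Thread t = new Thread(() -> {
                try {
                    for (int i = 0; i < rowsPerReader; i++) {
                        queue.put("reader-" + id + ",row-" + i); // stand-in for a database row
                    }
                    queue.put(IM_DONE); // tell the writer this reader has finished
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            readers.add(t);
            t.start();
        }
        for (Thread t : readers) {
            t.join();
        }
        writer.join();
    }

    public static void main(String[] args) throws Exception {
        Path out = Files.createTempFile("export", ".txt");
        export(out, 4, 1000);
        System.out.println(Files.readAllLines(out).size() + " lines written to " + out);
    }
}
```

Lines from different readers are interleaved in arrival order, but each line is written whole. A production version would also need to handle a reader or the writer failing part-way, which this sketch does not.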

The single to-be-written file is opened by multiple threads in append mode. Each thread thereafter tries writing to the file.

If you are adamant about having your reading threads also do the writing, then you should put a synchronized block around every access to a single shared BufferedWriter. You could synchronize on the BufferedWriter object itself. Knowing when to close the writer is a bit of an issue, since each thread would have to know whether the others have exited. Each thread could increment a shared AtomicInteger when it starts and decrement it when it is done. The thread that decrements the run-count to 0 would then be the one that closes the writer.
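A minimal sketch of the shared-writer variant is shown below. The class name `SharedWriterSketch` and the generated `thread-N,row-M` strings are assumptions for illustration. It also departs slightly from the increment-on-start idea: the counter is initialised to the full thread count before any thread starts, so it cannot drop to 0 while a later thread has not yet begun.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SharedWriterSketch {
    static void export(Path out, int threadCount, int rowsPerThread) throws Exception {
        BufferedWriter writer = Files.newBufferedWriter(out); // one writer shared by all threads
        // Set up front to the number of threads, so it only reaches 0 when all have finished.
        AtomicInteger running = new AtomicInteger(threadCount);

        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            Thread th = new Thread(() -> {
                try {
                    for (int i = 0; i < rowsPerThread; i++) {
                        String line = "thread-" + id + ",row-" + i; // stand-in for a database row
                        // Write the line and its terminator under one lock so no other
                        // thread can slip its output in between.
                        synchronized (writer) {
                            writer.write(line);
                            writer.newLine();
                        }
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                } finally {
                    // The last thread to finish closes (and thereby flushes) the writer.
                    if (running.decrementAndGet() == 0) {
                        try {
                            synchronized (writer) {
                                writer.close();
                            }
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    }
                }
            });
            threads.add(th);
            th.start();
        }
        for (Thread th : threads) {
            th.join();
        }
    }
}
```

The important detail is that a complete line, including its line terminator, is written inside a single synchronized block. Locking around individual write calls would still allow partial lines from different threads to interleave.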

Solution 2

Instead of using synchronized methods, a better solution would be a thread pool with a single thread, backed by a blocking queue. Each message the application wants to write is pushed onto the blocking queue. The log writer thread keeps reading from the queue (blocking while the queue is empty) and writes each message to the single file.
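One way to sketch this is with `Executors.newSingleThreadExecutor()`, which runs submitted tasks one at a time on a single worker thread. The class name `ExecutorWriterSketch` and its `writeLine` method are hypothetical names chosen for the example. Note that this executor's internal queue is unbounded, so producers never block; if back-pressure is needed, a `ThreadPoolExecutor` built on a bounded queue would be an alternative.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ExecutorWriterSketch implements AutoCloseable {
    // A single worker thread; tasks wait in the executor's internal blocking queue.
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private final BufferedWriter writer;

    ExecutorWriterSketch(Path out) throws IOException {
        this.writer = Files.newBufferedWriter(out);
    }

    // Safe to call from any thread; the write itself always runs on the pool's one thread.
    void writeLine(String line) {
        pool.execute(() -> {
            try {
                writer.write(line);
                writer.newLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Stops accepting work, waits for queued lines to be written, then closes the file.
    @Override
    public void close() throws IOException {
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES); // timeout is an arbitrary choice
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        writer.close();
    }
}
```

Application threads would share one `ExecutorWriterSketch` instance, call `writeLine(...)` for each row, and call `close()` once after all of them have finished.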

Author by

jayanth88

Updated on July 22, 2022

Comments

  • jayanth88
    jayanth88 almost 2 years

    I am trying to write a single huge file in Java using multiple threads.

    I have tried both the FileWriter and BufferedWriter classes in Java.

    The content being written is actually an entire table (Postgres), read using CopyManager and then written out. Each line in the file is a single tuple from the table, and I am writing hundreds of lines at a time.

    Approach to write:

    The single to-be-written file is opened by multiple threads in append mode. Each thread thereafter tries writing to the file.

    Following are the issues I face:

    • Once in a while, the contents of the file get overwritten, i.e. one line remains incomplete and the next line starts right where it broke off. My assumption is that the writer's buffer fills up, which forces the writer to flush its data to the file immediately. The flushed data may not be a complete line, and before the writer can write the remainder, the next thread writes its content to the file.
    • While using FileWriter, once in a while I see a single blank line in the file.

    Any suggestions on how to avoid this data integrity issue?