Reading multiple files in a Stream


Solution 1

If all you're doing is reading files and then concatenating them together to a new file on disk, you might not need to write code at all. Use the Windows copy command:

C:\> copy a.txt+b.txt+c.txt+d.txt output.txt

You can call this via Process.Start if you want.

This, of course, assumes that you're not doing any custom logic on the files or their content.
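If you do go the Process.Start route, a minimal sketch might look like the following. The class and method names (CopyCommand, BuildArguments, Run) are made up for illustration. Since copy is a built-in of cmd.exe rather than a standalone executable, the sketch launches cmd.exe with /c. It also assumes you want /b (binary mode, so the files are joined byte for byte) and /y (no overwrite prompt, which would otherwise leave the process waiting for input).

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper, not part of .NET: wraps "copy a+b+c output" in a cmd.exe call.
public static class CopyCommand
{
    // Builds the cmd.exe argument string, quoting every path so spaces survive.
    public static string BuildArguments(string[] inputs, string output)
    {
        string sources = string.Join("+", inputs.Select(p => "\"" + p + "\""));
        // /y suppresses the overwrite prompt, /b copies in binary mode.
        return "/c copy /y /b " + sources + " \"" + output + "\"";
    }

    // Runs the command and returns its exit code (Windows only).
    public static int Run(string[] inputs, string output)
    {
        var startInfo = new ProcessStartInfo("cmd.exe", BuildArguments(inputs, output))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Calling CopyCommand.Run(new[] { "a.txt", "b.txt" }, "output.txt") would then behave like the command line above, with a non-zero exit code signalling that something went wrong.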

Solution 2

For one thing, you need to differentiate between streams (binary data) and StreamReaders or more generally TextReaders (text data).

It sounds like you want to create a subclass of TextReader which will accept (in its constructor) a bunch of TextReader parameters. You don't need to eagerly read anything here... but in the Read methods that you override, you should read from "the current" reader until that's exhausted, then start on the next one. Bear in mind that Read doesn't have to fill the buffer it's been given - so you could do something like:

while (true)
{
    int charsRead = currentReader.Read(buffer, index, size);
    if (charsRead != 0)
    {
        return charsRead;
    }
    // Adjust this based on how you store the readers...
    if (readerQueue.Count == 0)
    {
        return 0;
    }
    currentReader = readerQueue.Dequeue();
}

I strongly suspect there are already third-party libraries that do this sort of reader concatenation, mind you...
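To put the loop above in context, here is one possible sketch of the whole subclass. The name ConcatenatedTextReader and the choice to dispose each inner reader once it is exhausted are assumptions for illustration, not something the framework provides. Peek is deliberately not overridden, so it keeps the base behaviour of returning -1.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical class: presents several TextReaders as one continuous reader.
public sealed class ConcatenatedTextReader : TextReader
{
    private readonly Queue<TextReader> readerQueue;
    private TextReader currentReader;

    public ConcatenatedTextReader(IEnumerable<TextReader> readers)
    {
        readerQueue = new Queue<TextReader>(readers);
        currentReader = readerQueue.Count > 0 ? readerQueue.Dequeue() : null;
    }

    // Block read: may return fewer chars than requested, which Read allows.
    public override int Read(char[] buffer, int index, int count)
    {
        if (count == 0) return 0; // a zero-length request must not skip readers
        while (currentReader != null)
        {
            int charsRead = currentReader.Read(buffer, index, count);
            if (charsRead != 0) return charsRead;
            MoveToNextReader();
        }
        return 0; // every reader is exhausted
    }

    // Single-character read, following the same "advance when empty" rule.
    public override int Read()
    {
        while (currentReader != null)
        {
            int c = currentReader.Read();
            if (c != -1) return c;
            MoveToNextReader();
        }
        return -1;
    }

    // Disposes the exhausted reader and switches to the next one, if any.
    private void MoveToNextReader()
    {
        currentReader.Dispose();
        currentReader = readerQueue.Count > 0 ? readerQueue.Dequeue() : null;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            if (currentReader != null) currentReader.Dispose();
            currentReader = null;
            while (readerQueue.Count > 0) readerQueue.Dequeue().Dispose();
        }
        base.Dispose(disposing);
    }
}
```

You could then wrap, say, several StreamReader instances and call ReadToEnd or ReadLine on the combined reader as if it were a single file. Note that no separator is inserted between readers, so the last line of one file runs into the first line of the next unless the file ends with a newline.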

Solution 3

This should be fast, but it loads each file entirely into memory before writing it, so it might not suit very large files:

string[] files = { @"c:\a.txt", @"c:\b.txt", @"c:\c.txt" };

FileStream outputFile = new FileStream(@"C:\d.txt", FileMode.Create);

using (BinaryWriter ws = new BinaryWriter(outputFile))
{
    foreach (string file in files)
    {
        ws.Write(System.IO.File.ReadAllBytes(file));
    }
}

Author: Corovei Andrei

I am a student at the Technical University of Cluj-Napoca, department of Computer Science. Yahoo mail: [email protected]

Updated on November 21, 2022

Comments

  • Corovei Andrei
    Corovei Andrei over 1 year

    Hi!

    How can I read multiple text files at once? I want to read a series of files and append them all to one big file. Currently I am doing this:

    1. take each file and open it with a StreamReader
    2. read the StreamReader completely into a StringBuilder and append it to the current StringBuilder
    3. check whether the memory limit is exceeded and, if so, write the StringBuilder to the end of the file and empty the StringBuilder

    Unfortunately, I observed that the average reading speed is only 4 MB/sec, whereas when I move files around the disk I get 40 MB/sec. I am thinking of buffering the files in a Stream and reading them all at once, as I do with the writing. Any idea how I can achieve this?

    Update:

    foreach (string file in System.IO.Directory.GetFiles(InputPath))
    {
        using (StreamReader sr = new StreamReader(file))
        {
            try
            {
                txt = txt + (file + "|" + sr.ReadToEnd());
            }
            catch // out of memory exception
            {
                WriteString(outputPath + "\\" + textBox3.Text, ref txt);
                //sb = new StringBuilder(file + "|" + sr.ReadToEnd());
                txt = file + "|" + sr.ReadToEnd();
            }
        }

        Application.DoEvents();
    }

    This is how I'm doing it now.

    • svick
      svick about 12 years
      What version of .NET are you using?
    • Joe
      Joe about 12 years
      Post code. The stream classes in .NET can do much better than this. Also, depending on .NET versions, there are methods on streams to directly copy from one stream to another via .CopyTo that don't require an intermediary.
    • Corovei Andrei
      Corovei Andrei about 12 years
      @Joe I updated my post with some code.
    • Dan Puzey
      Dan Puzey about 12 years
      Off-topic, but Application.DoEvents is the devil (use a background thread instead), and I don't think your catch clause will work, because you'll be out of memory. It will, however, catch any other exception you happen to throw...
    • Corovei Andrei
      Corovei Andrei about 12 years
      @Joe can you please detail your idea with copying between Streams?
    • Joe
      Joe about 12 years
      @CoroveiAndrei: In .NET 4, streams have a .CopyTo method, which takes a destination stream. So you open the read stream, open the write stream, then call .CopyTo to copy the data from source->dest. It's about 10 lines of code, including the using blocks.
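The CopyTo approach described in the last comment could be sketched roughly as follows, assuming .NET 4 or later. The class and method names (FileConcatenator, Concatenate) are invented for the example. Note that it copies raw bytes and does not add the file name and "|" prefix that the code in the question prepends to each file.

```csharp
using System.IO;

// Hypothetical helper: appends each input file to one output file via Stream.CopyTo.
public static class FileConcatenator
{
    // Assumes outputPath is not one of the input paths.
    public static void Concatenate(string[] inputPaths, string outputPath)
    {
        using (FileStream output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
        {
            foreach (string inputPath in inputPaths)
            {
                using (FileStream input = File.OpenRead(inputPath))
                {
                    // Streams the file in chunks, so it is never held in memory whole.
                    input.CopyTo(output);
                }
            }
        }
    }
}
```

Because the data goes straight from one stream to the other, there is no intermediate string or StringBuilder, and memory use stays flat regardless of file size.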