C# StreamReader, "ReadLine" For Custom Delimiters

Solution 1

I figured I would post my own solution. It seems to work pretty well and the code is relatively simple. Feel free to comment.

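// Requires System.IO and System.Text; as an extension method it must live in a static class.
public static String ReadUntil(this StreamReader sr, String delim)
{
    StringBuilder sb = new StringBuilder();
    bool found = false;

    while (!found && !sr.EndOfStream)
    {
        // Try to match the delimiter one character at a time.
        // The EndOfStream check stops Read() from returning -1 in the middle of a match.
        for (int i = 0; i < delim.Length && !sr.EndOfStream; i++)
        {
            Char c = (char)sr.Read();
            sb.Append(c);

            // Mismatch: give up on this attempt and restart from delim[0].
            if (c != delim[i])
                break;

            if (i == delim.Length - 1)
            {
                // Full match: drop the delimiter from the result.
                sb.Remove(sb.Length - delim.Length, delim.Length);
                found = true;
            }
        }
    }

    return sb.ToString();
}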

Solution 2

This code should work for any string separator.

// Requires System.Collections.Generic, System.IO, System.Linq and System.Text.
public static IEnumerable<string> ReadChunks(this TextReader reader, string chunkSep)
{
    var sb = new StringBuilder();

    // Holds the characters that currently match a prefix of the separator.
    var sepbuffer = new Queue<char>(chunkSep.Length);

    while (reader.Peek() >= 0)
    {
        var nextChar = (char)reader.Read();
        if (nextChar == chunkSep[sepbuffer.Count])
        {
            sepbuffer.Enqueue(nextChar);
            if (sepbuffer.Count == chunkSep.Length)
            {
                // Whole separator seen: emit the chunk and reset.
                yield return sb.ToString();
                sb.Length = 0;
                sepbuffer.Clear();
            }
        }
        else
        {
            // Mismatch: move characters to the chunk until the rest is a separator prefix again.
            sepbuffer.Enqueue(nextChar);
            while (sepbuffer.Count > 0)
            {
                sb.Append(sepbuffer.Dequeue());
                if (sepbuffer.SequenceEqual(chunkSep.Take(sepbuffer.Count)))
                    break;
            }
        }
    }
    // Emit whatever is left, including a trailing partial separator match.
    yield return sb.ToString() + new string(sepbuffer.ToArray());
}

Disclaimer:

I did a little testing, and this is actually slower than the ReadLine method. I suspect that is due to the Enqueue/Dequeue/SequenceEqual calls, which ReadLine can avoid because its line terminators are fixed (\r, \n, or \r\n).

Again, I ran only a few tests and it should work, but don't take it as perfect, and feel free to correct it. ;)

Solution 3

Here is a simple parser I have used where needed (usually, if streaming is not paramount, just reading everything and calling .Split does the job). It is not heavily optimized but should work fine.
(It is more of a Split-like method; more notes below.)

    public static IEnumerable<string> Split(this Stream stream, string delimiter, StringSplitOptions options)
    {
        // _bufffer_len is assumed to be a constant defined elsewhere in the class (buffer size in chars).
        var buffer = new char[_bufffer_len];
        StringBuilder output = new StringBuilder();
        int read;
        using (var reader = new StreamReader(stream))
        {
            do
            {
                read = reader.ReadBlock(buffer, 0, buffer.Length);
                output.Append(buffer, 0, read);

                var text = output.ToString();
                int id = 0, total = 0;
                while ((id = text.IndexOf(delimiter, id)) >= 0)
                {
                    var line = text.Substring(total, id - total);
                    id += delimiter.Length;
                    if (options != StringSplitOptions.RemoveEmptyEntries || line != string.Empty)
                        yield return line;
                    total = id;
                }
                output.Remove(0, total);
            }
            while (read == buffer.Length);
        }

        if (options != StringSplitOptions.RemoveEmptyEntries || output.Length > 0)
            yield return output.ToString();
    }

You can also switch to char delimiters if needed. Just replace

while ((id = text.IndexOf(delimiter, id)) >= 0)

...with

while ((id = text.IndexOfAny(delimiters, id)) >= 0)

(and use id++ instead of id += delimiter.Length, with the signature this Stream stream, StringSplitOptions options, params char[] delimiters).

It also removes empty entries when StringSplitOptions.RemoveEmptyEntries is passed.
Hope it helps.
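
For reference, here is roughly what the char-delimiter variant described above might look like once the three changes are applied. This is only a sketch: the class name StreamSplitExtensions and the BufferLength constant (4096) are assumptions made for illustration, since the original snippet does not show its buffer size or containing class.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class StreamSplitExtensions
{
    // Hypothetical buffer size; the original answer does not show its value.
    private const int BufferLength = 4096;

    public static IEnumerable<string> Split(this Stream stream, StringSplitOptions options, params char[] delimiters)
    {
        var buffer = new char[BufferLength];
        var output = new StringBuilder();
        int read;
        using (var reader = new StreamReader(stream))
        {
            do
            {
                // Pull the next block and add it to whatever was left over.
                read = reader.ReadBlock(buffer, 0, buffer.Length);
                output.Append(buffer, 0, read);

                var text = output.ToString();
                int id = 0, total = 0;
                while ((id = text.IndexOfAny(delimiters, id)) >= 0)
                {
                    var line = text.Substring(total, id - total);
                    id++; // a char delimiter is always exactly one char wide
                    if (options != StringSplitOptions.RemoveEmptyEntries || line != string.Empty)
                        yield return line;
                    total = id;
                }
                // Keep only the unfinished tail for the next round.
                output.Remove(0, total);
            }
            while (read == buffer.Length);
        }

        if (options != StringSplitOptions.RemoveEmptyEntries || output.Length > 0)
            yield return output.ToString();
    }
}
```

Called as stream.Split(StringSplitOptions.None, ',', ';') on a stream containing "a,b;c", this should yield "a", "b" and "c" in turn.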

Solution 4

    public static String ReadUntil(this StreamReader streamReader, String delimiter)
    {
        StringBuilder stringBuilder = new StringBuilder();

        while (!streamReader.EndOfStream)
        {
            stringBuilder.Append(value: (Char) streamReader.Read());

            if (stringBuilder.ToString().EndsWith(value: delimiter))
            {
                stringBuilder.Remove(stringBuilder.Length - delimiter.Length, delimiter.Length);
                break;
            }
        }

        return stringBuilder.ToString();
    }
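
One thing to keep in mind with the version above is that stringBuilder.ToString() copies the whole buffer for every character read. A possible variation, sketched below, compares only the last delimiter.Length characters of the StringBuilder instead. The names ReadUntilTail and StreamReaderTailExtensions are made up for this sketch, and returning null once the input is exhausted is an assumption chosen so that it fits a while (... != null) loop.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamReaderTailExtensions
{
    // Hypothetical helper: returns the text before the next delimiter,
    // or null when nothing is left to read.
    public static string ReadUntilTail(this TextReader reader, string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
            throw new ArgumentException("Delimiter must not be empty.", nameof(delimiter));

        var sb = new StringBuilder();
        bool readAny = false;
        int next;
        while ((next = reader.Read()) >= 0)
        {
            readAny = true;
            sb.Append((char)next);

            if (EndsWith(sb, delimiter))
            {
                // Drop the delimiter itself from the returned text.
                sb.Length -= delimiter.Length;
                return sb.ToString();
            }
        }
        return readAny ? sb.ToString() : null;
    }

    // Compares the tail of the builder with the suffix without calling ToString().
    private static bool EndsWith(StringBuilder sb, string suffix)
    {
        if (sb.Length < suffix.Length)
            return false;

        int offset = sb.Length - suffix.Length;
        for (int i = 0; i < suffix.Length; i++)
        {
            if (sb[offset + i] != suffix[i])
                return false;
        }
        return true;
    }
}
```

Because it takes a TextReader, it can be tried with a StringReader as well as a StreamReader, e.g. while ((text = reader.ReadUntilTail("my_delim")) != null) { ... }.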

Author: Eric

Updated on June 05, 2022

Comments

  • Eric
    Eric almost 2 years

    What is the best way to have the functionality of the StreamReader.ReadLine() method, but with custom (String) delimiters?

    I'd like to do something like:

    String text;
    while((text = myStreamReader.ReadUntil("my_delim")) != null)
    {
       Console.WriteLine(text);
    }
    

    I attempted to make my own using Peek() and StringBuilder, but it's too inefficient. I'm looking for suggestions or possibly an open-source solution.

    Thanks.

    Edit

    I should have clarified this earlier: I have seen this answer; however, I'd prefer not to read the entire file into memory.

  • Jon Coombs
    Jon Coombs about 10 years
    It would be slightly clearer (to me) if you put a "break" right after "found = true" as well. Requires a little bit less mental processing.
  • Jirka Hanika
    Jirka Hanika almost 10 years
    This solution only works in some cases. For example, if the delimiter is "xy", then this algorithm will miss the delimiter in "axxyb" and it will read until the end of the stream.
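
One way to handle the "axxyb" case from the last comment is to fall back to the longest partial match after a mismatch instead of restarting from the first delimiter character. The sketch below does that with a Knuth-Morris-Pratt prefix table. It is not taken from any of the answers; the names ReadUntilKmp and KmpReaderExtensions are hypothetical, and returning null at end of input is an assumption.

```csharp
using System;
using System.IO;
using System.Text;

public static class KmpReaderExtensions
{
    // Hypothetical helper: returns the text before the next delimiter,
    // or null when nothing is left to read.
    public static string ReadUntilKmp(this TextReader reader, string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
            throw new ArgumentException("Delimiter must not be empty.", nameof(delimiter));

        // fail[i] = length of the longest proper prefix of delimiter[0..i]
        // that is also a suffix of it.
        var fail = new int[delimiter.Length];
        for (int i = 1, k = 0; i < delimiter.Length; i++)
        {
            while (k > 0 && delimiter[i] != delimiter[k])
                k = fail[k - 1];
            if (delimiter[i] == delimiter[k])
                k++;
            fail[i] = k;
        }

        var sb = new StringBuilder();
        bool readAny = false;
        int matched = 0; // delimiter chars currently matched at the end of sb
        int next;
        while ((next = reader.Read()) >= 0)
        {
            readAny = true;
            char c = (char)next;
            sb.Append(c);

            // On mismatch, shrink to the longest shorter match rather than to zero.
            while (matched > 0 && c != delimiter[matched])
                matched = fail[matched - 1];
            if (c == delimiter[matched])
                matched++;

            if (matched == delimiter.Length)
            {
                sb.Length -= delimiter.Length;
                return sb.ToString();
            }
        }
        return readAny ? sb.ToString() : null;
    }
}
```

With the delimiter "xy" and the input "axxyb", the first call should return "ax" and the second "b".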