Read line from byte array (not convert byte array to string)

18,238

Solution 1

Personally I would

  1. Put an Int16 at the start of the strings, so you know how long they're going to be, and
  2. Use the System.IO.BinaryReader class to do the reading. It reads ints, strings, chars, etc. into variables; for example, BinReader.ReadInt16() reads two bytes, returns the Int16 they represent, and advances the stream by two bytes.
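
As a rough sketch of those two steps, assuming each string is stored as an Int16 byte count followed by that many UTF-8 bytes (the encoding and the helper names `ReadPrefixedString` and `WritePrefixedString` are illustrative assumptions, not part of any packet format discussed here):

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixedStrings
{
    // Reads one string stored as: Int16 byte count, then that many UTF-8 bytes.
    // Note: BinaryReader.ReadInt16 expects little-endian byte order.
    public static string ReadPrefixedString(BinaryReader reader)
    {
        short byteCount = reader.ReadInt16();
        byte[] bytes = reader.ReadBytes(byteCount);
        if (bytes.Length != byteCount)
            throw new EndOfStreamException("String was truncated.");
        return Encoding.UTF8.GetString(bytes);
    }

    // Hypothetical writer counterpart, only here so the reader can be tried
    // against an in-memory buffer.
    public static byte[] WritePrefixedString(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms))
        {
            writer.Write((short)bytes.Length); // 2-byte little-endian length
            writer.Write(bytes);               // raw bytes, no extra prefix
            writer.Flush();
            return ms.ToArray();
        }
    }
}
```

Because BinaryReader reads multi-byte integers as little-endian, a sender that writes big-endian lengths would need the two bytes swapped before they are interpreted.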

Hope this helps.

P.S. Be careful using the ReadString method. It assumes the string is prefixed with its length as a 7-bit encoded integer, i.e. that it was written by the BinaryWriter class (in .NET, BinaryWriter.Write(string) produces this prefix). The following is from this CodeGuru post:

The BinaryWriter class has two methods for writing strings: the overloaded Write() method and the WriteString() method. The former writes the string as a stream of bytes according to the encoding the class is using. The WriteString() method also uses the specified encoding, but it prefixes the string's stream of bytes with the actual length of the string. Such prefixed strings are read back in via BinaryReader.ReadString().

The interesting thing about the length value is that as few bytes as possible are used to hold this size; it is stored as a type called a 7-bit encoded integer. If the length fits in 7 bits a single byte is used, if it is greater than this then the high bit on the first byte is set and a second byte is created by shifting the value by 7 bits. This is repeated with successive bytes until there are enough bytes to hold the value. This mechanism is used to make sure that the length does not become a significant portion of the size taken up by the serialized string. BinaryWriter and BinaryReader have methods to read and write 7-bit encoded integers, but they are protected and so you can use them only if you derive from these classes.
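
The scheme described in that quote can be sketched by hand. This is an illustrative re-implementation, not the framework's own code; the class name `SevenBitInt` and its methods are made up for the example:

```csharp
using System;
using System.Collections.Generic;

public static class SevenBitInt
{
    // Emits 7 bits per byte, low bits first; the high bit means "more follows".
    public static byte[] Encode(int value)
    {
        var bytes = new List<byte>();
        uint v = (uint)value;
        while (v >= 0x80)
        {
            bytes.Add((byte)(v | 0x80)); // low 7 bits plus continuation flag
            v >>= 7;
        }
        bytes.Add((byte)v);              // final byte, high bit clear
        return bytes.ToArray();
    }

    // Decodes starting at 'offset' and advances 'offset' past the integer.
    public static int Decode(byte[] data, ref int offset)
    {
        int result = 0;
        int shift = 0;
        while (true)
        {
            if (shift >= 35)
                throw new FormatException("Too many bytes in 7-bit encoded int.");
            byte b = data[offset++];
            result |= (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0)
                return result;
        }
    }
}
```

With this sketch, a length of 127 takes one byte, while 300 takes two (0xAC, 0x02).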

Solution 2

I would go with length-prefixed strings. It will make your life a lot simpler, and it means you can represent strings that contain line breaks. A few comments on your code though:

  • Don't rely on NetworkStream.DataAvailable to detect the end of the data. Just because there's no data available now doesn't mean you've reached the end of the stream.
  • Unless you're absolutely sure you'll never need text beyond ASCII, don't use ASCIIEncoding.
  • Don't assume that Stream.Read will read all the data you ask it to. Always check the return value.
  • BinaryReader makes a lot of this much easier: it handles length-prefixed strings, and its ReadBytes method loops until it has read what you asked for (or the stream ends).
  • You're calling BitConverter.ToUInt16 twice on the same data. Why?
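
The points above could be combined roughly as follows. This is a sketch under assumptions: the 2-byte header is taken to be big-endian (as the question's byte reversal suggests), and `PacketReader`, `ReadExactly`, and `ReadPacket` are hypothetical names:

```csharp
using System;
using System.IO;

public static class PacketReader
{
    // Keeps calling Stream.Read until 'count' bytes have arrived,
    // because a single Read may return fewer bytes than requested.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
                throw new EndOfStreamException(
                    "Stream ended after " + total + " of " + count + " bytes.");
            total += read;
        }
        return buffer;
    }

    // Returns the next packet body, or null if the stream ended cleanly
    // before a new header started.
    public static byte[] ReadPacket(Stream stream)
    {
        int first = stream.ReadByte();
        if (first < 0)
            return null;
        int second = stream.ReadByte();
        if (second < 0)
            throw new EndOfStreamException("Stream ended inside the header.");
        int length = (first << 8) | second; // big-endian, parsed once
        return ReadExactly(stream, length);
    }
}
```

A caller would loop on `ReadPacket` until it returns null, then decode text fields with something like `Encoding.UTF8.GetString(body)` if the sender is known to write UTF-8.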
Author by

user3438601


Updated on June 13, 2022

Comments

  • user3438601
    user3438601 almost 2 years

    I have a byte array that I am reading in from a NetworkStream. The first two bytes tell the length of the packet that follows, and then the packet is read into a byte array of that length. The data that I need to read from the NetworkStream/byte array has a few Strings, i.e. variable-length data terminated by newline characters, and some fixed-width fields like bytes and longs. So, something like this:

    // I would have delimited these for clarity but I didn't want
    // to imply that the stream was delimited because it's not.
    StringbyteStringStringbytebytebytelonglongbytelonglong
    

    I know (and have some say in) the format of the data packet that is coming across, and what I need to do is read a "line" for each string value, but read a fixed number of bytes for the bytes and longs. So far, my proposed solution is to use a while loop to read bytes into a temp byte array until there is a newline character. Then, convert the bytes to a string. This seems kludgy to me, but I don't see another obvious way. I realize I could use StreamReader.ReadLine() but that would involve another stream and I already have a NetworkStream. But if that's the better solution, I'll give it a shot.

    The other option I have considered is to have my backend team write a byte or two for those String values' lengths so I can read the length and then read the String based on the length specified.

    So, as you can see, I have some options for how to go about this, and I'd like your input about what you would consider the best way to do it. Here's the code that I have right now for reading in the entire packet as a string. The next step is to break out the various fields of the packet and do the actual programming work that needs to be done, creating objects, updating UI, etc. based on the data in the packet.

    string line = null;  
    while (stream.DataAvailable)
    {  
        //Get the packet length;  
        UInt16 packetLength = 0;  
        header = new byte[2];  
        stream.Read(header, 0, 2);  
        // Need to reverse the header array for BitConverter class if architecture is little endian.  
        if (BitConverter.IsLittleEndian)
            Array.Reverse(header);  
        packetLength = BitConverter.ToUInt16(header,0);
    
        buffer = new byte[packetLength];
        stream.Read(buffer, 0, BitConverter.ToUInt16(header, 0));
        line = System.Text.ASCIIEncoding.ASCII.GetString(buffer);
        Console.WriteLine(line);
    }
    
  • user3438601
    user3438601 about 15 years
    Hadn't come across BinaryReader before. That looks like it will be immensely helpful for all this stuff I'm working on. Thanks!
  • user3438601
    user3438601 about 15 years
    Yah, the BinaryReader didn't work for us because the data is being written to the stream by a Java app. I ended up "brute forcing" it by reading in the byte array and then walking through it based on our data packet structure.
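
A cursor over the packet's byte array, as described in that last comment, might look something like this. It is only a sketch: the class name `PacketCursor` is hypothetical, strings are assumed to be newline-terminated UTF-8, and longs are assumed to be big-endian (the order Java's DataOutputStream writes):

```csharp
using System;
using System.Text;

public sealed class PacketCursor
{
    private readonly byte[] _data;
    private int _pos;

    public PacketCursor(byte[] data) { _data = data; }

    // Reads bytes up to (not including) the next '\n' and decodes them.
    public string ReadLine()
    {
        int end = Array.IndexOf(_data, (byte)'\n', _pos);
        if (end < 0)
            throw new FormatException("Missing newline terminator.");
        string s = Encoding.UTF8.GetString(_data, _pos, end - _pos);
        _pos = end + 1; // skip the terminator
        return s;
    }

    public byte ReadByte()
    {
        if (_pos >= _data.Length)
            throw new FormatException("Packet ended before a byte field.");
        return _data[_pos++];
    }

    // Assembles 8 bytes, most significant first.
    public long ReadInt64BigEndian()
    {
        if (_data.Length - _pos < 8)
            throw new FormatException("Packet ended inside a long field.");
        long value = 0;
        for (int i = 0; i < 8; i++)
            value = (value << 8) | _data[_pos++];
        return value;
    }
}
```

The fields are then read in the order the packet layout defines, for example `ReadLine()`, `ReadByte()`, `ReadLine()`, and so on.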