C#, read structures from binary file

11,937

Solution 1

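It's possible to do something similar in C#, but you would have to apply attributes to the structure so that you control exactly how it is laid out in memory. By default the runtime decides the layout: C# structs keep their declared field order (LayoutKind.Sequential), but padding is inserted for alignment, and members that are not simple value types, such as arrays and strings, need explicit marshalling attributes. Classes default to LayoutKind.Auto, where the runtime may also rearrange the members for the most efficient layout considering speed and memory usage.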

The simplest way is usually to use a BinaryReader to read the separate members of the structure from the file and put the values in properties of a class, i.e. manually deserialise the data into a class instance.

Normally reading the file is the bottleneck in this operation, so the small overhead of reading the members separately doesn't noticeably affect performance.
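
As a rough sketch of the manual approach described above: the record type `SampleRecord`, its three fields, and the `SampleRecordReader` helper are all hypothetical, invented for illustration, and the field order is an assumption about the file format.

```csharp
using System.IO;

// Hypothetical record type, invented for this illustration.
public sealed class SampleRecord
{
    public int Id { get; set; }
    public float Value { get; set; }
    public byte Flags { get; set; }
}

public static class SampleRecordReader
{
    // Reads one record, field by field, in the order the fields are
    // assumed to appear in the file. BinaryReader decodes multi-byte
    // values as little-endian.
    public static SampleRecord Read(BinaryReader reader)
    {
        var record = new SampleRecord();
        record.Id = reader.ReadInt32();     // 4 bytes
        record.Value = reader.ReadSingle(); // 4 bytes
        record.Flags = reader.ReadByte();   // 1 byte
        return record;
    }

    // Usage: reads the first record of a file.
    public static SampleRecord ReadFirst(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            return Read(reader);
        }
    }
}
```

Because each field is read explicitly, no layout attributes are needed and padding in the in-memory type does not matter.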

Solution 2

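using System.IO;
using System.Runtime.InteropServices;

public static class StreamExtensions
{
    public static T ReadStruct<T>(this Stream stream) where T : struct
    {
        var sz = Marshal.SizeOf(typeof(T));
        var buffer = new byte[sz];
        // Stream.Read may return fewer bytes than requested, so loop until full.
        for (int read = 0; read < sz; )
        {
            int n = stream.Read(buffer, read, sz - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        // Pin the buffer so the GC cannot move it while it is marshalled.
        var pinnedBuffer = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            return (T) Marshal.PtrToStructure(pinnedBuffer.AddrOfPinnedObject(), typeof(T));
        }
        finally
        {
            pinnedBuffer.Free();
        }
    }
}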

You need to ensure your struct is declared with [StructLayout] (and possibly [FieldOffset]) attributes so that it matches the binary layout in the file.

EDIT:

Usage:

SomeStruct s = stream.ReadStruct<SomeStruct>();
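
To illustrate why the layout attributes matter, here is a small sketch that compares the marshalled size and field offsets of three declarations of the same fields. The record format (2-byte tag, 4-byte length, 1-byte flag, no padding) and all type names are hypothetical, chosen only for this example.

```csharp
using System.Runtime.InteropServices;

// Default packing: the marshaller pads fields to their natural alignment.
[StructLayout(LayoutKind.Sequential)]
public struct PaddedRecord
{
    public ushort Tag;
    public uint Length;
    public byte Flag;
}

// Pack = 1 removes the padding, so the struct matches a 7-byte file record.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedRecord
{
    public ushort Tag;
    public uint Length;
    public byte Flag;
}

// Explicit layout states each field's byte offset directly.
[StructLayout(LayoutKind.Explicit, Pack = 1)]
public struct ExplicitRecord
{
    [FieldOffset(0)] public ushort Tag;
    [FieldOffset(2)] public uint Length;
    [FieldOffset(6)] public byte Flag;
}

public static class LayoutCheck
{
    // Marshalled (unmanaged) size of T in bytes.
    public static int SizeOf<T>() where T : struct
    {
        return Marshal.SizeOf(typeof(T));
    }

    // Marshalled byte offset of the named field within T.
    public static int OffsetOf<T>(string field) where T : struct
    {
        return Marshal.OffsetOf(typeof(T), field).ToInt32();
    }
}
```

Here `LayoutCheck.SizeOf<PaddedRecord>()` is 12 while `LayoutCheck.SizeOf<PackedRecord>()` is 7, so only the packed declaration lines up with the assumed 7-byte record. Checking sizes and offsets like this before reading real data is a cheap way to catch a mismatched declaration.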

Solution 3

Here is a slightly modified version of Jesper's code:

public static T? ReadStructure<T>(this Stream stream) where T : struct
{
    if (stream == null)
        return null;

    int size = Marshal.SizeOf(typeof(T));
    byte[] bytes = new byte[size];

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the stream ends.
    int total = 0;
    while (total < size)
    {
        int n = stream.Read(bytes, total, size - total);
        if (n == 0) // EOF: can't build this structure!
            return null;
        total += n;
    }

    GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    try
    {
        return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
    }
    finally
    {
        handle.Free();
    }
}

It handles EOF gracefully: because it returns a nullable type, it can return null when the stream ends before a full structure has been read.
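
One way the nullable return value might be used is to read fixed-size records until the stream runs out. This is only a sketch: the `Point3` struct and the `ReadAll` helper are hypothetical, and `ReadStructure` is repeated (without the null-stream check) so that the example is self-contained.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical fixed-size record, invented for this illustration.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Point3
{
    public float X;
    public float Y;
    public float Z;
}

public static class RecordStreamExtensions
{
    // Same idea as ReadStructure above, repeated so the sketch compiles on its own.
    public static T? ReadStructure<T>(this Stream stream) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] bytes = new byte[size];
        int total = 0;
        while (total < size)
        {
            int n = stream.Read(bytes, total, size - total);
            if (n == 0)
                return null; // stream ended before a full record was read
            total += n;
        }
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
        }
        finally
        {
            handle.Free();
        }
    }

    // Reads records until ReadStructure returns null (end of stream).
    // A trailing partial record is silently dropped.
    public static List<T> ReadAll<T>(this Stream stream) where T : struct
    {
        var items = new List<T>();
        T? item;
        while ((item = stream.ReadStructure<T>()) != null)
            items.Add(item.Value);
        return items;
    }
}
```

With this, `List<Point3> points = stream.ReadAll<Point3>();` reads every complete record in the stream. Whether a trailing partial record should be ignored or treated as an error depends on the file format.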

Solution 4

Just to elaborate on Guffa's and jesperll's answers, here is a sample that reads the file header of an ASF (WMV/WMA) file using basically the same ReadStruct method (just not as an extension method):

MemoryStream ms = new MemoryStream(headerData);
AsfFileHeader asfFileHeader = ReadStruct<AsfFileHeader>(ms);


[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Pack = 1)]
internal struct AsfFileHeader
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 16)] 
    public byte[] object_id;
    public UInt64 object_size;
    public UInt32 header_object_count;
    public byte r1;
    public byte r2;
}

Solution 5

There is no similar way in C#. Moreover, this is a deprecated way of serializing due to its non-portability. Use http://www.codeproject.com/KB/cs/objserial.aspx instead.

Author by

B_old

Updated on July 12, 2022

Comments

  • B_old
    B_old almost 2 years

    I want to read structures from binary. In C++ I would do it like this:

    stream.read((char*)&someStruct, sizeof(someStruct));
    

    Is there a similar way in C#? The BinaryReader only works for built-in types. In .NET 4 there is a MemoryMappedViewAccessor. It provides methods like Read<T> which seems to be what I want, except that I manually have to keep track of where in the file I want to read. Is there a better way?

  • casperOne
    casperOne over 13 years
    @jesperll: That's a really bad idea, especially if the structure is not flat. If there are pointers anywhere in the structure, then the referenced structure/class will not be written to the output. Even worse, when read back in, the pointer will refer to an invalid memory location.
  • Jesper Larsen-Ledet
    Jesper Larsen-Ledet over 13 years
    True. You get into problems if you have stuff like arrays in your struct, since they're not value types.
  • Jesper Larsen-Ledet
    Jesper Larsen-Ledet over 13 years
    But if you're using it to parse a file format where much of it is header blocks with simple types, then it's quite doable.
  • BrokenGlass
    BrokenGlass over 13 years
    +1 this is perfectly viable, I'm using pretty much the same code for reading in binary structures from ASF files - @casperOne I don't think the question was asking for complex object serialization/deserialization mechanisms
  • B_old
    B_old over 13 years
    In which cases is it not portable? I'm using that C++ code to read the same data in both x86 and x64, and it seems to work fine.
  • B_old
    B_old over 13 years
    Sounds reasonable. Performance is not the main issue here, I just thought it is a bit inconvenient.
  • Lavir the Whiolet
    Lavir the Whiolet over 13 years
    If you write the data on one platform (x86, for example) and read it on another (x64), then you may get problems.
  • B_old
    B_old over 13 years
    Thinking about it a little more, I don't want to use a loop, just to read an array of something.
  • Guffa
    Guffa over 13 years
    @B_old: It's a lot easier to write the few lines of code to read the values one at a time than to get the attributes right for all members of a structure so that it's guaranteed to be laid out in memory exactly as the file is arranged. You won't get away from using a loop in some form, whatever solution you choose.
  • B_old
    B_old over 13 years
    That is exactly what I don't understand, because I'm doing it. Do you maybe have a link to something explaining the issue in a little more detail?
  • Lavir the Whiolet
    Lavir the Whiolet over 13 years
    No, I don't have a link to a comprehensive explanation, sorry. But I know that memory layout differs between platforms, and even between builds made with different compiler arguments. There is no standard that says "fields in memory must appear in the same order as they appear in source code" or "long is represented by 32 bits on all platforms" or "fields are aligned to 32-bit boundaries everywhere", and there cannot be such a standard. Successfully using that C++ code on x86 and x64 means that you are just lucky. Try playing with compiler switches, or try compiling for ARM.
  • poy
    poy about 11 years
    It tends to be more of a compiler implementation issue than an x86 vs x64 one.
  • Carl Walsh
    Carl Walsh over 7 years
    Very cool! Marshal.PtrToStructure() throws on enum types (possibly because you can't use [StructLayout] on enum?). For enums you can use typeof(T).GetEnumUnderlyingType() and it works.