Array of zero length

29,635

Solution 1

Yes, this is a C hack.
To create an array of any length:

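struct someData* mallocSomeData(int size)
{
    /* Room for the fixed part of the struct plus size bytes of trailing data. */
    struct someData* result = (struct someData*)malloc(sizeof(struct someData) + size * sizeof(BYTE));
    if (result)
    {
        result->nData = size;   /* record the length of the trailing array */
    }
    return result;
}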

Now you have a someData object whose trailing array has the specified length.

Solution 2

There are, unfortunately, several reasons why you would declare a zero length array at the end of a structure. It essentially gives you the ability to have a variable length structure returned from an API.

Raymond Chen did an excellent blog post on the subject. I suggest you take a look at this post because it likely contains the answer you want.

Note in his post, it deals with arrays of size 1 instead of 0. This is the case because zero length arrays are a more recent entry into the standards. His post should still apply to your problem.

http://blogs.msdn.com/oldnewthing/archive/2004/08/26/220873.aspx

EDIT

Note: Even though Raymond's post says 0 length arrays are legal in C99, they are in fact still not legal in C99. Instead of a 0 length array, you should use a length 1 array here.

Solution 3

This is an old C hack to allow flexibly sized arrays.

In the C99 standard this is not necessary, as it supports the arr[] syntax (a flexible array member).
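For illustration, here is a minimal sketch of the arr[] form in C99. The names some_data_fam and some_data_fam_new are made up for this example and are not from the original code; note also that this is C, and standard C++ does not have this feature.

```c
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: the last member is declared with empty brackets. */
struct some_data_fam {
    int nData;
    unsigned char byData[];   /* contributes no elements to sizeof */
};

/* Hypothetical helper: allocate a struct with room for n zeroed payload bytes. */
struct some_data_fam *some_data_fam_new(size_t n)
{
    struct some_data_fam *p = malloc(sizeof *p + n);
    if (p) {
        p->nData = (int)n;
        memset(p->byData, 0, n);
    }
    return p;
}
```

The whole object is released with a single free(p), since header and payload live in one allocation.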

Solution 4

Your intuition about "why not use an array of size 1" is spot on.

The code is doing the "C struct hack" wrong, because declarations of zero length arrays are a constraint violation. This means that a compiler can reject your hack right off the bat at compile time with a diagnostic message that stops the translation.

If we want to perpetrate a hack, we must sneak it past the compiler.

The right way to do the "C struct hack" (which is compatible with C dialects going back to 1989 ANSI C, and probably much earlier) is to use a perfectly valid array of size 1:

struct someData
{
   int nData;
   unsigned char byData[1];
};

Moreover, instead of sizeof (struct someData), the size of the part before byData is calculated using:

offsetof(struct someData, byData);

To allocate a struct someData with space for 42 bytes in byData, we would then use:

struct someData *psd = (struct someData *) malloc(offsetof(struct someData, byData) + 42);
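Put together, the allocation might look like the sketch below. The helper name someData_from_bytes is hypothetical, and the clamp to sizeof (struct someData) is an extra precaution added for this example so that the allocation is never smaller than the declared struct. As the comments on this page point out, indexing past the declared single element is strictly undefined behaviour, so this relies on the compiler tolerating it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct someData {
    int nData;
    unsigned char byData[1];   /* placeholder for a variable-length tail */
};

/* Hypothetical helper: allocate a someData and copy len bytes into its tail. */
struct someData *someData_from_bytes(const void *src, size_t len)
{
    /* Header size up to byData plus the payload... */
    size_t total = offsetof(struct someData, byData) + len;
    /* ...but never less than the declared struct (added precaution). */
    if (total < sizeof(struct someData))
        total = sizeof(struct someData);

    struct someData *psd = malloc(total);
    if (psd) {
        psd->nData = (int)len;
        if (len)
            memcpy(psd->byData, src, len);
    }
    return psd;
}
```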

Note that this offsetof calculation is in fact the correct calculation even in the case of the array size being zero. You see, sizeof the whole structure can include padding. For instance, if we have something like this:

struct hack {
  unsigned long ul;
  char c;
  char foo[0]; /* assuming our compiler accepts this nonsense */
};

The size of struct hack is quite possibly padded for alignment because of the ul member. If unsigned long is four bytes wide, then quite possibly sizeof (struct hack) is 8, whereas offsetof(struct hack, foo) is almost certainly 5. The offsetof method is the way to get the accurate size of the preceding part of the struct just before the array.

So that would be the way to refactor the code: make it conform to the classic, highly portable struct hack.
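To see the padding difference on a particular compiler, a small check along these lines can help. The struct hack_demo and the helper hack_demo_tail_padding are names invented for this sketch, and it uses foo[1] rather than foo[0] so that it compiles as standard C. The actual numbers depend on the platform, so none are claimed here.

```c
#include <stddef.h>
#include <stdio.h>

struct hack_demo {
    unsigned long ul;
    char c;
    char foo[1];   /* size 1 instead of 0, so any conforming compiler accepts it */
};

/* Hypothetical helper: print both sizes and return how many bytes of the
   struct lie at or after foo (the one element plus any trailing padding). */
size_t hack_demo_tail_padding(void)
{
    size_t whole  = sizeof(struct hack_demo);
    size_t before = offsetof(struct hack_demo, foo);
    printf("sizeof = %lu, offsetof(foo) = %lu\n",
           (unsigned long)whole, (unsigned long)before);
    return whole - before;
}
```

If the returned value is greater than 1, sizeof is counting trailing padding that the offsetof calculation leaves out.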

Why not use a pointer? Because a pointer occupies extra space and has to be initialized.

There are other good reasons not to use a pointer, namely that a pointer requires an address space in order to be meaningful. The struct hack is externalizable: that is to say, there are situations in which such a layout conforms to external storage such as areas of files, packets or shared memory, in which you do not want pointers because they are not meaningful.

Several years ago, I used the struct hack in a shared memory message passing interface between kernel and user space. I didn't want pointers there, because they would have been meaningful only to the original address space of the process generating a message. The kernel part of the software had a view to the memory using its own mapping at a different address, and so everything was based on offset calculations.
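As a rough sketch of what "everything based on offset calculations" can look like, the code below packs length-prefixed messages into a plain byte buffer and reads them back using only offsets. It is not the interface described above; struct msg, msg_put, msg_len_at and msg_payload_at are all hypothetical names, and memcpy is used for the header so that no alignment assumptions are made about the buffer.

```c
#include <stddef.h>
#include <string.h>

struct msg {
    unsigned int len;          /* number of payload bytes that follow */
    unsigned char payload[1];  /* struct-hack tail */
};

/* Hypothetical: write one message at byte offset off inside buf.
   Returns the offset just past the message, or 0 if it does not fit. */
size_t msg_put(unsigned char *buf, size_t cap, size_t off,
               const void *data, unsigned int len)
{
    size_t hdr = offsetof(struct msg, payload);
    if (off > cap || cap - off < hdr || cap - off - hdr < len)
        return 0;
    memcpy(buf + off + offsetof(struct msg, len), &len, sizeof len);
    if (len)
        memcpy(buf + off + hdr, data, len);
    return off + hdr + len;
}

/* Hypothetical: read the length field of the message stored at offset off. */
unsigned int msg_len_at(const unsigned char *buf, size_t off)
{
    unsigned int len;
    memcpy(&len, buf + off + offsetof(struct msg, len), sizeof len);
    return len;
}

/* Hypothetical: locate the payload of the message stored at offset off. */
const unsigned char *msg_payload_at(const unsigned char *buf, size_t off)
{
    return buf + off + offsetof(struct msg, payload);
}
```

Because the buffer holds no pointers, two sides that map it at different addresses can both interpret it, as long as they agree on the layout of struct msg.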

Solution 5

It's worth pointing out IMO the best way to do the size calculation, which is used in the Raymond Chen article linked above.

struct foo
{
    size_t count;
    int data[1];
};

size_t foo_size_from_count(size_t count)
{
    return offsetof(foo, data[count]);
}

The offset of the first entry past the end of the desired allocation is also the size of the desired allocation. IMO it's an extremely elegant way of doing the size calculation. It does not matter what the element type of the variable size array is. The offsetof (or FIELD_OFFSET or UFIELD_OFFSET in Windows) is always written the same way. No sizeof() expressions to accidentally mess up.
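One caveat: offsetof with a run-time index is accepted by the common compilers, but as far as I can tell the C standard only describes it for constant member designators. If that matters, an equivalent spelled out by hand might look like this sketch (foo_size_portable is a made-up name); it still avoids naming the element type.

```c
#include <stddef.h>

struct foo {
    size_t count;
    int data[1];
};

/* Hypothetical equivalent of offsetof(struct foo, data[count]):
   offset of the array plus count elements of whatever type data holds.
   The sizeof operand is not evaluated, so the null pointer is never read. */
size_t foo_size_portable(size_t count)
{
    return offsetof(struct foo, data)
         + count * sizeof(((struct foo *)0)->data[0]);
}
```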

Author by

bgee

Updated on July 09, 2022

Comments

  • bgee
    bgee almost 2 years

    I am working on refactoring some old code and have found a few structs containing zero-length arrays (below). The warnings are suppressed by a pragma, of course, but I've failed to create, with "new", structures containing such structures (error 2233). The array 'byData' is used as a pointer, but why not use a pointer instead? Or an array of length 1? And of course, no comments were added to make me enjoy the process... Are there any reasons to use such a thing? Any advice on refactoring those?

    struct someData
    {
       int nData;
       BYTE byData[0];
    };
    

    NB It's C++, Windows XP, VS 2003

  • Alex B
    Alex B over 15 years
    Sadly, Visual Studio is very poor when it comes to C99 support. :(
  • Cheeso
    Cheeso over 14 years
    Without addressing the general truth of your comment, ...the MS VC v9 compiler supports the arr[] syntax.
  • ildjarn
    ildjarn about 12 years
    "This is the case because zero length arrays are a more recent entry into the standards." Which standards? C++11 still disallows 0-length arrays (§8.3.4/1), as well as C99 (§6.7.5.2/1).
  • JaredPar
    JaredPar about 12 years
    @ildjarn I was essentially parroting what Raymond said at the end of his blog post. I wasn't aware that 0 length arrays were still illegal in C99 until a recent comment discussion with you on another question. I'll update the answer.
  • ildjarn
    ildjarn about 12 years
    Sorry to nitpick such an old answer. :-P I only ask because another question linked here as "proof" that 0-length arrays were legal C++. :-]
  • JaredPar
    JaredPar about 12 years
    @ildjarn NP on nitpicking old answers. Definitely don't want to be spouting bad data :)
  • Spidey
    Spidey almost 12 years
    Can you find ANY reference on the impact of using 0-length or [] arrays on memory alignment? In a project, a colleague found that the safest choice would be int arr[] (since int protects from any alignment issues), but since we are returning the array, the very best in our case is void *arr[], which is pretty cryptic.
  • unwind
    unwind about 10 years
    Shouldn't this at least use new[], this being about C++?
  • Martin York
    Martin York over 9 years
    @unwind: Can't use new for this. The whole point is that this is a C-Hack and not required in C++ (because we have better ways of doing it). Also I am pretty sure that zero length arrays are illegal in C++ (well at least C++03, not sure if that was updated in C++11).
  • Krishna Oza
    Krishna Oza over 9 years
    The popular jargon for this is "Struct Hack".
  • M.M
    M.M over 8 years
    "which is compatible with C dialects going back to 1989 " - accessing past the first element of the array causes undefined behaviour even in C89. The struct hack relies on the compiler "defining" this behaviour for itself.
  • IInspectable
    IInspectable almost 8 years
    Except, your calculation is off (in the general case). Depending on the type of objects in the array, a compiler needs to impose certain alignment rules, and summing up the sizes of the members may not produce the correct size. Instead, use the offsetof macro to have the compiler calculate the correct result. (Note: Not an issue for BYTEs, assuming those are defined to be some char variant.)
  • iamkroot
    iamkroot almost 4 years
    Here's the wayback archive of the blog in case anyone needs it (the original page is dead).