C/C++: Force Bit Field Order and Alignment


Solution 1

No, it will not be fully portable. Packing options for structs are compiler extensions, and are themselves not fully portable. In addition, C99 §6.7.2.1, paragraph 10 says: "The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined."

Even a single compiler might lay the bit field out differently depending on the endianness of the target platform, for example.

Solution 2

Bit fields vary widely from compiler to compiler, sorry.

With GCC, big endian machines lay out the bits big end first and little endian machines lay out the bits little end first.
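If you want to see which way your own compiler and target go, a small probe along these lines can tell you. This is only a sketch: the names `bitfield_probe` and `first_bitfield_is_low_order` are made up for illustration, and it assumes `unsigned int` has no padding bits, which is true on mainstream platforms but not guaranteed by the standard.

```c
#include <limits.h>

/* Hypothetical probe: overlay a struct of bit-fields on a plain unsigned int.
   The two fields together fill the whole storage unit, so no padding bits
   are left over inside it. */
union bitfield_probe {
    struct {
        unsigned int first : 1;
        unsigned int rest  : sizeof(unsigned int) * CHAR_BIT - 1;
    } fields;
    unsigned int word;
};

/* Returns 1 if the first declared bit-field occupies the least significant
   bit of the storage unit, 0 otherwise. */
static int first_bitfield_is_low_order(void)
{
    union bitfield_probe probe;

    probe.word = 0;
    probe.fields.first = 1;
    return probe.word == 1u;
}
```

The result only describes the compiler and target the probe was built for. It tells you nothing about any other combination.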

K&R says "Adjacent [bit-]field members of structures are packed into implementation-dependent storage units in an implementation-dependent direction. When a field following another field will not fit ... it may be split between units or the unit may be padded. An unnamed field of width 0 forces this padding..."

Therefore, if you need machine independent binary layout you must do it yourself.

This last statement also applies to non-bitfields due to padding -- however all compilers seem to have some way of forcing byte packing of a structure, as I see you already discovered for GCC.
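Doing the layout yourself usually means writing each byte explicitly with shifts, so that neither struct padding nor host byte order plays a role. A minimal sketch for a 16-bit value stored most significant byte first (the helper names `put_u16_be` and `get_u16_be` are hypothetical, not from any library):

```c
#include <stdint.h>

/* Store a 16-bit value as two bytes, most significant byte first,
   regardless of the host's byte order. */
static void put_u16_be(uint8_t *out, uint16_t value)
{
    out[0] = (uint8_t)(value >> 8);
    out[1] = (uint8_t)(value & 0xFFu);
}

/* Load a 16-bit value that was stored most significant byte first. */
static uint16_t get_u16_be(const uint8_t *in)
{
    return (uint16_t)(((unsigned)in[0] << 8) | in[1]);
}
```

Because the code only ever touches individual bytes, the same buffer contents come out on any conforming compiler.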

Solution 3

Bitfields should be avoided - they aren't very portable between compilers, even for the same platform. From the C99 standard, 6.7.2.1/10, "Structure and union specifiers" (there's similar wording in the C90 standard):

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

You cannot guarantee whether a bit field will 'span' an int boundary or not, and you can't specify whether a bitfield starts at the low end of the int or the high end of the int (this is independent of whether the processor is big-endian or little-endian).

Prefer bitmasks. Use inlines (or even macros) to set, clear and test the bits.
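A minimal sketch of what such inline helpers might look like. The flag names and the `flags_*` function names are invented for illustration; pick whatever suits your project.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; each mask selects exactly one bit. */
#define FLAG_READY (UINT32_C(1) << 0)
#define FLAG_BUSY  (UINT32_C(1) << 1)
#define FLAG_ERROR (UINT32_C(1) << 2)

/* Set every bit selected by mask. */
static inline void flags_set(uint32_t *flags, uint32_t mask)
{
    *flags |= mask;
}

/* Clear every bit selected by mask. */
static inline void flags_clear(uint32_t *flags, uint32_t mask)
{
    *flags &= ~mask;
}

/* True if at least one bit selected by mask is set. */
static inline bool flags_test(uint32_t flags, uint32_t mask)
{
    return (flags & mask) != 0;
}
```

The position of every bit is fixed by the mask values, so the layout no longer depends on how a given compiler allocates bit-fields.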

Solution 4

Endianness refers to byte order, not bit order. Nowadays it is 99% certain that the bit order within a byte is fixed. However, when using bitfields, endianness still has to be taken into account. See the example below.

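#include <stdio.h>
#include <string.h>

typedef struct tagT {
    unsigned int a : 4;
    unsigned int b : 4;
    unsigned int c : 8;
    unsigned int d : 16;
} T;

int main(void)
{
    unsigned char data[] = {0x12, 0x34, 0x56, 0x78};
    T t;

    /* Copy the bytes rather than casting the pointer: the cast risks
       misaligned access and breaks the strict aliasing rules. */
    memcpy(&t, data, sizeof data);

    printf("a =0x%x\n", t.a);
    printf("b =0x%x\n", t.b);
    printf("c =0x%x\n", t.c);
    printf("d =0x%x\n", t.d);

    return 0;
}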

// big endian: mips24k-linux-gcc (GCC) 4.2.3
a =0x1
b =0x2
c =0x34
d =0x5678
 1   2   3   4   5   6   7   8
\_/ \_/ \_____/ \_____________/
 a   b     c           d

// little endian: gcc (Ubuntu 4.3.2-1ubuntu11) 4.3.2
a =0x2
b =0x1
c =0x34
d =0x7856
 7   8   5   6   3   4   1   2
\_____________/ \_____/ \_/ \_/
       d           c     b   a
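For comparison, here is a sketch that pulls the same four values out of the same four bytes with shifts and masks instead of bitfields. It assumes the fields are meant to be read from the first nibble onward, as in the big-endian output above; the `decoded` struct and the `decode` function are hypothetical names.

```c
#include <stdint.h>

/* Plain struct holding the extracted values; no bit-fields involved. */
struct decoded {
    unsigned a;   /* high nibble of byte 0 */
    unsigned b;   /* low nibble of byte 0  */
    unsigned c;   /* byte 1                */
    unsigned d;   /* bytes 2 and 3, most significant byte first */
};

/* Extract the fields by position in the byte stream, so the result does
   not depend on host endianness or on the compiler's bit-field order. */
static struct decoded decode(const uint8_t data[4])
{
    struct decoded out;

    out.a = data[0] >> 4;
    out.b = data[0] & 0x0Fu;
    out.c = data[1];
    out.d = ((unsigned)data[2] << 8) | data[3];
    return out;
}
```

Fed the bytes 0x12, 0x34, 0x56, 0x78, this gives a = 0x1, b = 0x2, c = 0x34 and d = 0x5678 on both big-endian and little-endian hosts.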

Solution 5

Most of the time, probably, but don't bet the farm on it, because if you're wrong, you'll lose big.

If you really, really need to have identical binary information, you'll need to create bitfields with bitmasks - e.g. you use an unsigned short (16 bit) for Message, and then make things like versionMask = 0xE000 to represent the three topmost bits.
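A rough sketch of that approach for the Message from the question. Only versionMask = 0xE000 comes from the text above; the positions of the other three fields (type in bit 12, id in bits 11..7, data in bits 6..1, bit 0 unused) are my assumption, as are all the function names.

```c
#include <stdint.h>

/* version mask is from the answer; the remaining masks are assumed. */
#define VERSION_MASK  0xE000u
#define VERSION_SHIFT 13
#define TYPE_MASK     0x1000u
#define TYPE_SHIFT    12
#define ID_MASK       0x0F80u
#define ID_SHIFT      7
#define DATA_MASK     0x007Eu
#define DATA_SHIFT    1

/* Combine the four fields into one 16-bit word; values that are too
   wide for their field are truncated by the mask. */
static uint16_t message_pack(unsigned version, unsigned type,
                             unsigned id, unsigned data)
{
    return (uint16_t)(((version << VERSION_SHIFT) & VERSION_MASK) |
                      ((type    << TYPE_SHIFT)    & TYPE_MASK)    |
                      ((id      << ID_SHIFT)      & ID_MASK)      |
                      ((data    << DATA_SHIFT)    & DATA_MASK));
}

/* Field accessors: mask first, then shift down to bit 0. */
static unsigned message_version(uint16_t m) { return (m & VERSION_MASK) >> VERSION_SHIFT; }
static unsigned message_type(uint16_t m)    { return (m & TYPE_MASK) >> TYPE_SHIFT; }
static unsigned message_id(uint16_t m)      { return (m & ID_MASK) >> ID_SHIFT; }
static unsigned message_data(uint16_t m)    { return (m & DATA_MASK) >> DATA_SHIFT; }
```

The packed word is still a host-order integer, so when it goes onto the wire or into a file its two bytes have to be written in a fixed order as well.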

There's a similar problem with alignment within structs. For instance, Sparc, PowerPC, and 680x0 CPUs are all big-endian, and the common default for Sparc and PowerPC compilers is to align struct members on 4-byte boundaries. However, one compiler I used for 680x0 only aligned on 2-byte boundaries - and there was no option to change the alignment!

So for some structs, the sizes on Sparc and PowerPC are identical, but smaller on 680x0, and some of the members are in different memory offsets within the struct.

This was a problem with one project I worked on, because a server process running on Sparc would query a client and find out it was big-endian, and assume it could just squirt binary structs out on the network and the client could cope. And that worked fine on PowerPC clients, and crashed big-time on 680x0 clients. I didn't write the code, and it took quite a while to find the problem. But it was easy to fix once I did.
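If you suspect this kind of mismatch, printing sizeof and offsetof on each platform makes it visible quickly. A small sketch with a made-up struct (`record` and `print_record_layout` are illustrative names, not from the project described above):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical struct: the padding after tag depends on the alignment
   the compiler chooses for the 32-bit member. */
struct record {
    uint8_t  tag;
    uint32_t value;
};

/* Print the total size and the offset of the second member, so the
   output from two platforms can be compared side by side. */
static void print_record_layout(void)
{
    printf("sizeof(struct record) = %zu\n", sizeof(struct record));
    printf("offsetof(value)       = %zu\n", offsetof(struct record, value));
}
```

If the numbers differ between two builds, the raw struct cannot be sent between them as is and has to be serialized field by field.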

Author by

TNC

Updated on July 08, 2022

Comments

  • TNC
    TNC almost 2 years

    I read that the order of bit fields within a struct is platform specific. What about if I use different compiler-specific packing options, will this guarantee data is stored in the proper order as they are written? For example:

    struct Message
    {
      unsigned int version : 3;
      unsigned int type : 1;
      unsigned int id : 5;
      unsigned int data : 6;
    } __attribute__ ((__packed__));
    

    On an Intel processor with the GCC compiler, the fields were laid out in memory as they are shown. Message.version was the first 3 bits in the buffer, and Message.type followed. If I find equivalent struct packing options for various compilers, will this be cross-platform?

  • Windows programmer
    Windows programmer over 14 years
    The output of a and b indicates that endianness is still talking about bit orders AND byte orders.
  • trondd
    trondd over 12 years
    I think it is wrong to state that it is stupid to use bit fields, since they provide a very clean way to represent hardware registers in C, which is what they were created to model.
  • Ben Voigt
    Ben Voigt over 11 years
    @trondd: No, they were created to save memory. Bitfields aren't intended to map to outside data structures, such as memory-mapped hardware registers, network protocols, or file formats. If they were intended to map to outside data structures, the packing order would have been standardized.
  • Greg A. Woods
    Greg A. Woods almost 11 years
    The order of bitfields can be determined at compile time.
  • Greg A. Woods
    Greg A. Woods almost 11 years
    Also, bitfields are highly preferred when dealing with bit flags that have no external representation outside the program (i.e. on disk or in registers or in memory accessed by other programs, etc).
  • mozzbozz
    mozzbozz over 9 years
    @GregA.Woods: If this really is the case, please provide an answer describing how. I could not find anything but your comment when googling for it...
  • mozzbozz
    mozzbozz over 9 years
    @GregA.Woods: Sorry, I should have written which comment I was referring to. I meant: you say that "The order of bitfields can be determined at compile time." I cannot find anything about it or how to do it.
  • Greg A. Woods
    Greg A. Woods over 9 years
    @mozzbozz Have a look at planix.com/~woods/projects/wsg2000.c and search for definitions and use of _BIT_FIELDS_LTOH and _BIT_FIELDS_HTOL
  • johnnycrash
    johnnycrash almost 9 years
    Using bits saves memory. Using bit fields increases readability. Using less memory is faster. Using bits allows for more complex atomic operations. In our applications in the real world, there is a need for performance and complex atomic operations. This answer wouldn't work for us.
  • underscore_d
    underscore_d over 8 years
    Is K&R really considered a useful reference, given that it was pre-standardisation and has (I assume?) probably been superseded in many areas?
  • underscore_d
    underscore_d over 8 years
    Yeah, the GCC documentation, for instance, specifically notes that bitfields are arranged as per the ABI, not the implementation. So, just staying on a single compiler is not sufficient to guarantee ordering. The architecture has to be checked, too. A bit of a nightmare for portability, really.
  • underscore_d
    underscore_d over 8 years
    @BenVoigt probably true, but if a programmer is willing to confirm that the ordering of their compiler/ABI matches what they need, and sacrifice quick portability accordingly - then they certainly can fulfil that role. As for 9*, which authoritative mass of "real world coders" consider all use of bitfields to be "unprofessional/lazy/stupid" and where did they state this?
  • vpalmu
    vpalmu over 8 years
    My K&R is post-ANSI.
  • underscore_d
    underscore_d over 8 years
    Now that is embarrassing: I didn't realise they'd released a post-ANSI revision. My bad!
  • Aaron Campbell
    Aaron Campbell about 8 years
    Why didn't the C standard guarantee an order for bit fields?
  • Stephen Canon
    Stephen Canon about 8 years
    It's difficult to consistently and portably define the "order" of bits within bytes, much less the order of bits that may cross byte boundaries. Any definition that you settle on will fail to match a considerable amount of existing practice.
  • peterchen
    peterchen over 7 years
    Implementation-defined allows for platform-specific optimization. On some platforms, padding between the bit fields can improve access. Imagine four seven-bit fields in a 32-bit int: aligning them at every 8th bit is a significant improvement for platforms that have byte reads.
  • Jonathan
    Jonathan over 7 years
    Wonderful example of the bit-ordering and byte-ordering problems.
  • Krauss
    Krauss over 6 years
    Did you actually compile and run the code? The values for "a" and "b" don't seem logical to me: you are basically saying that the compiler will swap the nibbles within a byte because of endianness. In the case of "d", endianness should not affect the byte order within char arrays (assuming char is 1 byte long); if the compiler did that, we wouldn't be able to iterate through an array using pointers. If, on the other hand, you had used an array of two 16-bit integers, e.g. uint16 data[]={0x1234,0x5678};, then d would definitely be 0x7856 on little-endian systems.
  • Dave Newton
    Dave Newton over 5 years
    Using less memory is not always faster; it is often more efficient to use more memory and reduce post-read operations, and the processor/processor mode can make that even more true.
  • user5534993
    user5534993 almost 4 years
    As the question is about C++ as well: For example the C++17 standard does state in section 12.2.4 class.bit §1: "Bit-fields are assigned right-to-left on some machines, left-to-right on others."
  • kbro
    kbro almost 3 years
    @GregA.Woods that file provides a very good example of why you shouldn't try to use bitfields portably - it's horrendous!
  • kbro
    kbro almost 3 years
    if the standard says "implementation-defined" then all bets are off.
  • Greg A. Woods
    Greg A. Woods almost 3 years
    The standard could have made things trivial by requiring such defines by default, but of course committees.... Meanwhile, have a look at any sufficiently "modern" collection of system headers and I'm sure you'll see far worse.