What does the ".align" x86 Assembler directive do exactly?

Solution 1

As mentioned in the comments, it means the assembler will add enough padding bytes so that the next data lands at an address divisible by the alignment value. This matters because aligned memory access can be faster than unaligned memory access, depending on the CPU. (Loading a doubleword from 0x10000 is never slower, and on some CPUs faster, than loading a doubleword from 0x10001.) It is also useful when you are interfacing with other components and need to send or receive structs of data with a given padding/alignment.
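The padding rule can be written down as a little arithmetic. The following is a minimal Python sketch, not anything the assembler exposes: the function names `align_up` and `padding_needed` are made up for illustration, and the address 0x0edb is just an example value.

```python
def align_up(address: int, alignment: int) -> int:
    """Return the smallest multiple of `alignment` that is >= `address`."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    remainder = address % alignment
    return address if remainder == 0 else address + (alignment - remainder)


def padding_needed(address: int, alignment: int) -> int:
    """Number of filler bytes an alignment request would insert at `address`."""
    return align_up(address, alignment) - address


# Hypothetical example: the location counter sits at 0x0edb and we ask for 8.
print(hex(align_up(0x0EDB, 8)))    # 0xee0
print(padding_needed(0x0EDB, 8))   # 5
# An address that is already aligned needs no padding.
print(padding_needed(0x10000, 4))  # 0
```

"Aligned modulo 8" in the manual's wording just means that the resulting address leaves remainder 0 when divided by 8.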

Solution 2

First, note that .align is not an x86-specific concept, but a GNU GAS directive documented in the GAS manual. It can also be used for other architectures. x86 does not specify directives, only instructions.

Now let's play with it to understand it:

a.S

.byte 1
.align 16
sym: .byte 2

Assemble and disassemble:

as -o a.o a.S
objdump -Sd a.o

Output:

0000000000000000 <a-0x10>:
   0:   01 0f                   add    %ecx,(%rdi)
   2:   1f                      (bad)  
   3:   44 00 00                add    %r8b,(%rax)
   6:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
   d:   00 00 00 

0000000000000010 <sym>:
  10:   02                      .byte 0x2

So sym was moved to byte 16, the first multiple of 16 after the first .byte 1 we've placed, to align it at 16 bytes.

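The bytes used to fill between 01 and 02 are not arbitrary. Because the default section is .text, GAS pads with multi-byte NOP instructions, here 0f 1f 44 00 00 followed by 66 2e 0f 1f 84 00 00 00 00 00. The first NOP looks garbled in the listing only because objdump starts decoding at the 01 data byte.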

Now let's try a different input:

.skip 5
.align 4
sym: .byte 2

Gives:

0000000000000000 <sym-0x8>:
   0:   00 00                   add    %al,(%rax)
   2:   00 00                   add    %al,(%rax)
   4:   00 0f                   add    %cl,(%rdi)
   6:   1f                      (bad)  
    ...

0000000000000008 <sym>:
   8:   02                      .byte 0x2

So this time sym was moved to 8, which is the first multiple of 4 that comes after 5.
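Both experiments can be reproduced on paper by tracking the assembler's location counter. Below is a rough Python sketch of that bookkeeping. It is a toy model, not how GAS is implemented: the `layout` function and its tuple-based input format are invented for illustration, and it only understands the three directives used above, with `.align` taken as a byte count as on x86 ELF targets.

```python
def layout(directives):
    """Walk (name, argument) pairs and return {label: offset}.

    Toy model: 'byte' emits 1 byte, 'skip' emits `argument` bytes,
    'align' rounds the offset up to a multiple of `argument`,
    and 'label' records the current offset under the given name.
    """
    offset = 0
    labels = {}
    for name, arg in directives:
        if name == "byte":
            offset += 1
        elif name == "skip":
            offset += arg
        elif name == "align":
            # Ceiling division, then scale back up to a multiple of arg.
            offset = -(-offset // arg) * arg
        elif name == "label":
            labels[arg] = offset
        else:
            raise ValueError(f"unsupported directive: {name}")
    return labels


first = [("byte", 1), ("align", 16), ("label", "sym"), ("byte", 2)]
second = [("skip", 5), ("align", 4), ("label", "sym"), ("byte", 2)]

print(layout(first))   # {'sym': 16}
print(layout(second))  # {'sym': 8}
```

The offsets 16 (0x10) and 8 match the addresses objdump reports for sym in the two listings.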

Solution 3

The main reason for the align directive is to speed up execution. If a call or jmp target is at an odd address, the CPU may need extra bus transfers and/or an extra step to advance to the exact byte. The same holds for data. The old 80386 manual lists penalties for certain opcodes when the target is misaligned.

I found it in the manual (from http://css.csail.mit.edu/6.858/2011/readings/i386.pdf) on page 24:

Such misaligned data transfers reduce performance by requiring extra memory
cycles. For maximum performance, data structures (including stacks) should
be designed in such a way that, whenever possible, word operands are aligned
at even addresses and doubleword operands are aligned at addresses evenly
divisible by four. Due to instruction prefetching and queuing within the
CPU, there is no requirement for instructions to be aligned on word or
doubleword boundaries. (However, a slight increase in speed results if the
target addresses of control transfers are evenly divisible by four.)
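The "extra memory cycles" in the quote can be pictured by counting how many aligned blocks a single access overlaps. This Python sketch is a deliberately simplified model, and the helper name `chunks_touched` is hypothetical. It assumes a 4-byte-wide bus for the 80386 case, and a 64-byte cache line for the modern case raised in the comments; real hardware behavior varies by CPU.

```python
def chunks_touched(address: int, size: int, chunk: int) -> int:
    """Count the aligned `chunk`-byte blocks overlapped by the byte range
    [address, address + size)."""
    first = address // chunk
    last = (address + size - 1) // chunk
    return last - first + 1


# Doubleword (4-byte) access on an assumed 4-byte-wide bus.
print(chunks_touched(0x10000, 4, 4))   # 1: aligned, one transfer
print(chunks_touched(0x10001, 4, 4))   # 2: misaligned, two transfers

# The same kind of access measured against assumed 64-byte cache lines.
print(chunks_touched(0x10001, 4, 64))  # 1: misaligned but inside one line
print(chunks_touched(0xFFFF, 4, 64))   # 2: split across two lines
```

In this model a misaligned doubleword always costs two transfers on the narrow bus, while with cache lines only the accesses that straddle a line boundary touch more than one block.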

Author: Sinister Clock

Updated on July 03, 2022

Comments

  • Sinister Clock
    Sinister Clock almost 2 years

    I will list exactly what I do not understand, and show you the parts I can not understand as well.

    First off,

    The .Align Directive

    1. .align integer, pad. The .align directive causes the next data generated to be aligned modulo integer bytes

    1.~ ? : What is implied with "causes the next data generated to be aligned modulo integer bytes?" I can surmise that the next data generated is a memory-to-register transfer, no? Modulo would imply the remainder of a division. I do not understand "to be aligned modulo integer bytes".......

    What would be a remainder of a simple data declaration, and how would the next data generated being aligned by a remainder be useful? If the next data is aligned modulo, that is saying the next generated data, whatever that means exactly, is the remainder of an integer? That makes absolutely no sense.

    What specifically would the .align, say, .align 8 directive issued in x86 for a data byte compiled from a C char, i.e., char CHARACTER = 0; be for? Or specifically coded directly with that directive, not preliminary Assembly code after compiling C? I have debugged in Assembly and noticed that any C/C++ data declarations, like chars, ints, floats, etc. will insert the directive .align 8 to each of them, and add other directives like .bss, .zero, .globl, .text, .Letext0, .Ltext0.

    What are all of these directives for, or at least my main asking? I have learned a lot of the main x86 Assembly instructions, but never was introduced or pointed at all of these strange directives. How do they affect the opcodes, and are all of them necessary?

    • microtherion
      microtherion about 11 years
      It just means that the assembler will place the next byte at an address evenly divisible by integer, so if e.g. the last byte was placed at 0x0eda, then ordinarily, the next byte would be placed at 0x0edb, but with an .align 8 directive in place, the next byte would be placed at 0x0ee0, the next address that is evenly divisible by 8
  • Peter Cordes
    Peter Cordes about 4 years
    I think there are some ISAs where .align is a synonym for .p2align (power of 2), not .balign (byte). On x86 it's .balign, like the align directives in most other assemblers like MASM and NASM.
  • Peter Cordes
    Peter Cordes about 4 years
    "Loading a doubleword from 0x10000 is better than loading a doubleword from 0x10001": a better example would be a cache-line or page split, like 0xffff is much worse than 0x10000, because that's true on all CPUs. Misalignment within a cache line (or within a 16-byte chunk of a cache line) has literally zero extra cost in a lot of cases on most modern x86 CPUs, assuming normal (cacheable) memory.
  • Peter Cordes
    Peter Cordes over 3 years
    would you mind taking Trump out of your username? SO usernames aren't a great place for random political statements, especially ones unrelated to SO management. (And Trump is now a private citizen so there are fewer grounds for arguing the ban should be reversed at this point, only that it shouldn't have happened in the first place several months ago.)
  • Ciro Santilli OurBigBook.com
    Ciro Santilli OurBigBook.com over 3 years
    @PeterCordes hi, related meta threads at: cirosantilli.com/china-dictatorship/#stack-overflow please report to a mod or create a new thread and ping me
  • Peter Cordes
    Peter Cordes over 3 years
    Ah, fair enough, I stand corrected. They are allowed. I still wouldn't like to see everyone's username turn into a political statement, even if they were all ones I agreed with, though. (would you? Perhaps you would.) So I maintain it's still in somewhat poor taste and something you personally might want to consider voluntarily changing at this point, if you agree.
  • Peter Cordes
    Peter Cordes over 3 years
    Oh and BTW, Segfault with RIP-relative addressing on Linux reports that .align on x86-64 MacOS/clang is a synonym for .p2align, not .balign. So it's not even portable between different x86 systems in GAS syntax and should never be used.
  • Ciro Santilli OurBigBook.com
    Ciro Santilli OurBigBook.com over 3 years
    @PeterCordes I am a huge supporter of freedom of speech, and that people should be able to say whatever they want on their personal profiles, as long as it is legal in the jurisdiction where Stack Overflows servers are located. Thanks for the MacOS note.
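The .balign versus .p2align distinction raised in the comments comes down to how the directive's argument is interpreted. Here is a small Python sketch of the two readings. The function `effective_alignment` and its `flavor` strings are invented for illustration; which reading a bare .align gets on a given target is stated only as reported in the comments above.

```python
def effective_alignment(argument: int, flavor: str) -> int:
    """Translate a directive argument into an alignment in bytes.

    'balign'  : the argument is the alignment itself, in bytes.
    'p2align' : the argument is an exponent, alignment = 2 ** argument.
    """
    if flavor == "balign":
        return argument
    if flavor == "p2align":
        return 1 << argument
    raise ValueError(f"unknown flavor: {flavor}")


# The same source line, ".align 4", under the two interpretations.
print(effective_alignment(4, "balign"))   # 4
print(effective_alignment(4, "p2align"))  # 16
```

Writing .balign or .p2align explicitly removes the ambiguity, since each of those has a single meaning regardless of target.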