What is a bus error? Is it different from a segmentation fault?

Solution 1

Bus errors are rare nowadays on x86. They occur when your processor cannot even attempt the requested memory access, typically:

  • using a processor instruction with an address that does not satisfy its alignment requirements.
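
As a rough sketch of what an alignment requirement means in C: an address satisfies an alignment of N when it is a multiple of N. The helper names `is_aligned` and `load_u32` below are made up for illustration and do not come from any library. Copying bytes with `memcpy` sidesteps the requirement, whereas casting a misaligned pointer and dereferencing it may trap on stricter processors.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true when ptr is a multiple of align
 * (align is assumed to be a power of two). */
int is_aligned(const void *ptr, size_t align) {
    return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* Hypothetical helper: read a 32-bit value from any address.
 * memcpy has no alignment requirement, so the compiler is free to emit
 * whatever access is safe on the target instead of a plain word load. */
uint32_t load_u32(const void *ptr) {
    uint32_t value;
    memcpy(&value, ptr, sizeof value);
    return value;
}
```

Calling `load_u32(bytes + 1)` on a byte buffer is then well defined, while `*(uint32_t *)(bytes + 1)` is the kind of access that can raise a bus error on alignment-strict hardware.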

Segmentation faults occur when accessing memory which does not belong to your process. They are very common and are typically the result of:

  • using a pointer to something that was deallocated.
  • using an uninitialized, hence bogus, pointer.
  • using a null pointer.
  • overflowing a buffer.

PS: To be more precise, it is not manipulating the pointer itself that will cause issues. It's accessing the memory it points to (dereferencing).
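
A minimal sketch of that distinction, assuming a POSIX system where page 0 is unmapped (typical Linux): merely holding a null pointer is harmless, while writing through it kills the process. The helper `signal_from_child` is hypothetical; it runs a function in a forked child and reports the signal that terminated it. Dereferencing null is undefined behavior in C, so `volatile` is used here only to discourage the compiler from optimizing the access away.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Only stores a null pointer; no memory is accessed through it. */
void hold_null(void) {
    volatile int *volatile p = NULL;
    (void)p;
}

/* Writes through the null pointer; this access is what faults. */
void deref_null(void) {
    volatile int *volatile p = NULL;
    *p = 1;
}

/* Hypothetical helper: run fn in a child process and return the number of
 * the signal that killed it, 0 if it exited normally, -1 on error. */
int signal_from_child(void (*fn)(void)) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        fn();
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On such a system, `signal_from_child(hold_null)` should report a normal exit and `signal_from_child(deref_null)` should report `SIGSEGV`.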

Solution 2

A segfault means accessing memory that you're not allowed to access: it's read-only, you don't have permission, and so on.

A bus error is trying to access memory that can't possibly be there. You've used an address that's meaningless to the system, or the wrong kind of address for that operation.
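
One way to see which case you hit is to look at the `si_code` the kernel attaches to the signal. The sketch below is an illustration under assumptions, not a definitive recipe: it assumes Linux with glibc, and the helper `probe_readonly_write` is made up for this example. It writes to a page mapped read-only, catches the fault with a `SA_SIGINFO` handler, and reports the signal and its code. On Linux this is expected to be `SIGSEGV` with `SEGV_ACCERR` (mapped, but not permitted); other systems may report the same access differently.

```c
#define _DEFAULT_SOURCE /* assumption: glibc; exposes MAP_ANONYMOUS */
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;
static volatile sig_atomic_t seen_signal;
static volatile sig_atomic_t seen_code;

/* Record which signal arrived and its si_code, then jump out of the fault. */
static void on_fault(int sig, siginfo_t *info, void *context) {
    (void)context;
    seen_signal = sig;
    seen_code = info->si_code;
    siglongjmp(recover_point, 1);
}

/* Hypothetical helper: write to a read-only page and report the signal and
 * si_code that result. Returns 0 if a fault was caught, -1 otherwise. */
int probe_readonly_write(int *sig, int *code) {
    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return -1;

    /* Handle both signals, since platforms differ in which one they send. */
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    int caught = 0;
    if (sigsetjmp(recover_point, 1) == 0) {
        *(volatile char *)page = 1; /* mapped, but writing is not permitted */
    } else {
        caught = 1; /* we arrive here via siglongjmp from the handler */
    }

    /* Restore the previous handlers and release the page. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(page, len);

    if (!caught) return -1;
    *sig = seen_signal;
    *code = seen_code;
    return 0;
}
```

The same handler could be pointed at a misaligned or past-end-of-file access to observe `SIGBUS` codes such as `BUS_ADRALN` or `BUS_ADRERR` where the platform raises them.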

Solution 3

mmap minimal POSIX 7 example

"Bus error" happens when the kernel sends SIGBUS to a process.

A minimal example that produces it because ftruncate was forgotten:

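#define _XOPEN_SOURCE 700 /* expose shm_open and ftruncate under -std=c99 */
#include <fcntl.h> /* O_ constants */
#include <unistd.h> /* ftruncate */
#include <sys/mman.h> /* mmap, shm_open, shm_unlink */

int main(void) {
    int fd;
    int *map;
    int size = sizeof(int);
    const char *name = "/a";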

    shm_unlink(name);
    fd = shm_open(name, O_RDWR | O_CREAT, (mode_t)0600);
    /* THIS is the cause of the problem. */
    /*ftruncate(fd, size);*/
    map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    /* This is what generates the SIGBUS. */
    *map = 0;
}

Run with:

gcc -std=c99 main.c -lrt
./a.out

Tested in Ubuntu 14.04.

POSIX describes SIGBUS as:

Access to an undefined portion of a memory object.

The mmap spec says that:

References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.

And shm_open says that it generates objects of size 0:

The shared memory object has a size of zero.

So at *map = 0 we are touching past the end of the allocated object.
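
The effect of the missing ftruncate can be checked side by side. The following is a sketch under assumptions: it swaps `shm_open` for a temporary regular file created with `mkstemp` (so no `-lrt` is needed), assumes `/tmp` is writable, and uses a hypothetical helper named `mapped_write_signal`. The write happens in a forked child so that the parent can report the signal.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map one int of a fresh temporary file and write to it
 * in a child process. With do_truncate == 0 the file stays at size 0, like
 * the shared memory object in the example above. Returns the signal that
 * killed the child, 0 if it exited normally, or -1 on a setup error. */
int mapped_write_signal(int do_truncate) {
    char path[] = "/tmp/sigbus-demo-XXXXXX"; /* assumed writable directory */
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path); /* the open descriptor keeps the file alive */

    /* Growing the file to sizeof(int) is what gives the mapping a backing page. */
    if (do_truncate && ftruncate(fd, sizeof(int)) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        int *map = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) _exit(1);
        *(volatile int *)map = 0; /* SIGBUS if the file is still empty */
        _exit(0);
    }

    close(fd);
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (WIFSIGNALED(status)) return WTERMSIG(status);
    return WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

On Linux, `mapped_write_signal(0)` should report `SIGBUS` and `mapped_write_signal(1)` should report a clean exit.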

Unaligned stack memory accesses in ARMv8 aarch64

This was mentioned for SPARC at "What is a bus error?", but here I will provide a more reproducible example.

All you need is a freestanding aarch64 program:

.global _start
_start:
asm_main_after_prologue:
    /* misalign the stack off its 16-byte boundary */
    add sp, sp, #-4
    /* access the stack */
    ldr w0, [sp]

    /* exit syscall in case SIGBUS does not happen */
    mov x0, 0
    mov x8, 93
    svc 0

That program then raises SIGBUS on Ubuntu 18.04 aarch64, Linux kernel 4.15.0, on a ThunderX2 server machine.

Unfortunately, I can't reproduce it in QEMU v4.0.0 user mode, and I'm not sure why.

The fault appears to be optional and controlled by the SCTLR_ELx.SA and SCTLR_EL1.SA0 fields. I have summarized the related docs a bit further here.

Solution 4

I believe the kernel raises SIGBUS when an application exhibits data misalignment on the data bus. Since most[?] modern compilers for most processors pad and align the data for the programmer, I think the alignment troubles of yore are (at least) mitigated, and hence one does not see SIGBUS too often these days (AFAIK).
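
The padding mentioned here can be observed directly. This is a small sketch, assuming a C11 compiler; the struct and function names are invented for illustration. It shows that a 4-byte field placed after a 1-byte field is pushed to an offset that matches its alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative struct: a 1-byte field followed by a 4-byte field. */
struct sample {
    char tag;
    uint32_t value;
};

/* Number of padding bytes the compiler placed between tag and value. */
size_t padding_before_value(void) {
    return offsetof(struct sample, value) - sizeof(char);
}

/* True when value starts at a multiple of its own alignment requirement. */
int value_is_aligned(void) {
    return offsetof(struct sample, value) % _Alignof(uint32_t) == 0;
}
```

Because the compiler does this automatically, ordinary field accesses stay aligned. Misalignment usually appears only after pointer arithmetic plus a cast, or with packed structures.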

Solution 5

On POSIX systems, you can also get the SIGBUS signal when a code page cannot be paged in for some reason.

Author: raldi

Updated on October 26, 2021

Comments

  • raldi
    raldi over 2 years

    What does the "bus error" message mean, and how does it differ from a segmentation fault?

    • xdevs23
      xdevs23 over 6 years
      I'd like to add a simple explanation for both: Segmentation fault means that you are trying to access memory that you are not allowed to (e. g. it's not part of your program). However, on a bus error it usually means that you are trying to access memory that does not exist (e. g. you try to access an address at 12G but you only have 8G memory) or if you exceed the limit of usable memory.
    • Peter Mortensen
      Peter Mortensen over 4 years
      On what platform did you see this? PC? Mac? x86? 32/64?
  • 11684
    11684 about 11 years
    They aren't rare; I'm just at Exercise 9 from How to Learn C the Hard Way and already encountered one...
  • dexterous
    dexterous over 10 years
    Suppose I had data[8]; this is now a multiple of 4 on a 32-bit architecture, so it is aligned. Will I still get the error? Also, please explain: is it a bad idea to do a data type conversion on pointers? Will it cause misalignment errors on a fragile architecture? Please elaborate; it will help me.
  • Calvin Huang
    Calvin Huang over 10 years
    My i7 certainly has an MMU, but I still came across this error while learning C on OS X (passing uninitialized pointer to scanf). Does that mean that OS X Mavericks is buggy? What would have been the behavior on a non-buggy OS?
  • Svartalf
    Svartalf over 9 years
    Heh. It's not so much type conversion as you're doing type conversion on a pointer that you've done pointer math on. Look carefully at the code above. The compiler has carefully dword aligned your pointer for data- and then you screw everything up on the compiler by offsetting the reference by TWO and typecasting to a very much needing to be dword aligned access on what's going to be a non-dword boundary.
  • Svartalf
    Svartalf over 9 years
    "Fragile" isn't the word I'd use for all of this. X86 machines and code have got people doing rather silly things for a while now, this being one of them. Rethink your code if you're having this sort of problem- it's not very performant on X86 to begin with.
  • Svartalf
    Svartalf over 9 years
    Heh...if this were the case, you'd have BUS error concerns instead of the stack smashing exploits you read about all the time for Windows and other machines. BUS errors are caused by an attempt to access "memory" that the machine simply cannot access because the address is invalid. (Hence the term "BUS" error.) This can be due to a host of failings, including invalid alignments, and the like- so long as the processor can't place the address ON the bus lines.
  • Svartalf
    Svartalf over 9 years
    Depends on the nasty tricks you're doing with your code. You can trigger a BUS error/Alignment Trap if you do something silly like do pointer math and then typecast for access to a problem mode (i.e. You set up an uint8_t array, add one, two, or three to the array's pointer and then typecast to a short, int, or long and try to access the offending result.) X86 systems will pretty much let you do this, albeit at a real performance penalty. SOME ARMv7 systems will let you do this- but most ARM, MIPS, Power, etc. will grouse at you over it.
  • supercat
    supercat about 9 years
    @Svartalf: On x86, word accesses on unaligned pointers are certainly slower than word accesses to aligned pointers, but at least historically they have been faster than simple code which unconditionally assembles things out of bytes, and they're certainly simpler than code which tries to use an optimal combination of varied-size operations. I wish the C standard would include means of packing/unpacking larger integer types to/from a sequence of smaller integers/characters so as to let the compiler use whatever approach is best on a given platform.
  • Svartalf
    Svartalf about 9 years
    @Supercat: The thing is this- you get away with it on X86. You try this on ARM, MIPS, Power, etc. and you're going to get nasty things happening to you. On ARM less than Arch V7, you will have your code have an alignment failure- and on V7, you can, IF your runtime is set for it, handle it with a SEVERE performance hit. You just simply don't want to DO this. It's bad practices, to be blunt. :D
  • supercat
    supercat about 9 years
    @Svartalf: I'm well aware that platforms vary as to their treatment of unaligned accesses. I wish the C standard would define a means via which source code could specify "I want this type to behave as a pointer with __ alignment, to an unsigned integer stored using the __ lower bits of each of __ locations of type __, in __-first format", and let the compiler generate whatever machine code would be needed to accomplish that. A compiler may end up having to generate nasty inefficient code, but it wouldn't be any worse than what a programmer would have to write for portability. The difference...
  • supercat
    supercat about 9 years
    ...would be that on platforms where the stated requirements coincide with a natural processor behavior, the compiler could exploit that easily. If on e.g. x86 the programmer had requested MSB-first alignment, the processor may have to add byte-swap instructions after loads and before stores, but that would still be cheaper than four separate 8-bit stores.
  • Eloff
    Eloff almost 9 years
    Another cause of bus errors (on Linux anyway) is when the operating system can't back a virtual page with physical memory (e.g. low-memory conditions or out of huge pages when using huge page memory.) Typically mmap (and malloc) just reserve the virtual address space, and the kernel assigns the physical memory on demand (so called soft page faults.) Make a large enough malloc, and then write to enough of it and you'll get a bus error.
  • poordeveloper
    poordeveloper almost 9 years
    This often happens when I update the .so file while running the process
  • vpalmu
    vpalmu almost 9 years
    Probably stack overflow protection raises bus error.
  • Mark Lakata
    Mark Lakata almost 8 years
    "foo" is stored in a read-only segment of memory, so it is impossible to write to it. It wouldn't be stack overflow protection, just memory write protection (this is a security hole if your program can rewrite itself).
  • ilija139
    ilija139 about 7 years
    Another cause is trying to mmap a file larger than the size of /dev/shm
  • c33s
    c33s about 7 years
    for me the partition containing /var/cache was simply full askubuntu.com/a/915520/493379
  • Christopher K.
    Christopher K. almost 6 years
    In my case, a method static_casted a void * parameter to an object that stores a callback (one attribute points to the object and the other to the method). Then the callback is called. However, what was passed as void * was something completely different and thus the method call caused the bus error.
  • Lewis Kelsey
    Lewis Kelsey about 5 years
    @bltxd Do you know the nature of bus errors. i.e. does the message on the ring bus have some mechanism where a stop on the ring also accepts a message that was sent by it but to whichever destination as it suggests that it has gone all the way round the ring and hasn't been accepted. I'm guessing the line fill buffer returns an error status and when it retires it flushes the pipeline and calls the correct exception microroutine. This basically requires that the memory controller accept all address in its range which would suggest that when the BARs etc are changed, it would have to internally
  • Lewis Kelsey
    Lewis Kelsey about 5 years
    change registers in the memory controller to exclude that address range
  • Lewis Kelsey
    Lewis Kelsey about 5 years
    It seems like an awkward operation to me, as memory is initialised on boot before the BARs, so any update to a BAR would have to reflect itself in the memory controller, which would also mean it would then have to update its internal mapping to the RAM channel, module, rank, IC, chip, bank, row, column for all the other ranges
  • Peter Mortensen
    Peter Mortensen over 4 years
    How does this answer the question?
  • John
    John about 4 years
    I've also seen this coming from my dotnet core app when the disk was full (using memory-mapped I/O). Maybe that's related to @Eloff's comment.
  • itaych
    itaych about 4 years
    Agreed, this is the most common cause of bus errors in my experience.
  • itaych
    itaych about 4 years
    If m >= n then the outer loop will execute once or not at all, depending on the preexisting value of i. If m < n then it will run indefinitely with the j index increasing, until you will run out of bounds of your array and most likely cause a segmentation fault, not a bus error. If this code compiles then there's no problem accessing the memory of the variable 'i' itself. Sorry but this answer is wrong.
  • stuxnetting
    stuxnetting almost 3 years
    It's been a while since I wrote that answer but I'm curious about your explanation. The code compiles (C/C++ does not initialize variables to a given value). Thus when the index is assigned a memory location by the compiler, the default value of that variable is whatever (garbage) value happens to already be in that memory location. I encountered said bus error when this as yet uninitialized index variable was compared against a known 'n'.
  • itaych
    itaych almost 3 years
    Comparing the uninitialized integer 'i' against 'n' will yield an unpredictable but valid result (i.e. either true or false), not a crash. There is no mechanism in C/C++ that can catch reads of uninitialized variables at runtime (except perhaps in a debugging environment such as valgrind).
  • qris
    qris about 2 years
    How does it differ from a segmentation fault? It can differ in the optimisation level that the code was compiled with.