Program received signal SIGSEGV, Segmentation fault (program runs out of stack.)

The error is shown at this line:

long a = thread_fake(); //in file1.c

The most likely way this could cause a SIGSEGV is if your program has run out of stack.

Examine the actual crashing instruction in GDB with x/i $pc.

If the instruction is a PUSH, or a CALL, then my guess is confirmed.

Another possibility: you've compiled your code with optimization, and the actual faulting instruction has little to do with the source line it is attributed to.

Update:

Yes, it gives a call: call 0x804e580 <thread_fake>. What could be the solution?

The solution is to not run out of stack. Execute a GDB where command, then, in each frame leading to the crash, execute info frame and look for frames that are excessively large.

Don't allocate too much data on the stack, or increase your stack size (ulimit -s).
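
The limit that ulimit -s reports can also be read from inside the program. Here is a minimal sketch, assuming a POSIX system; the helper name fits_under_stack_limit is made up for illustration and is not part of the original code. Note that ulimit -s prints kilobytes, while getrlimit returns bytes.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: compares a planned stack allocation against the
 * soft RLIMIT_STACK limit (the value behind "ulimit -s").
 * Returns 1 if `bytes` is below the limit, 0 if it is not,
 * and -1 if the limit could not be read. */
int fits_under_stack_limit(size_t bytes)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_STACK, &lim) != 0)
        return -1;

    /* An unlimited soft limit imposes no bound to compare against. */
    if (lim.rlim_cur == RLIM_INFINITY)
        return 1;

    return (rlim_t)bytes < lim.rlim_cur;
}
```

With a typical 8 MB soft limit (an assumption; check ulimit -s on your machine), fits_under_stack_limit(61u * 1024 * 1024) would return 0. This is only a rough guide, since every other frame on the stack needs room too. For threads, the pthread_create man page says that under NPTL the default thread stack size is taken from this soft limit at program start unless it is unlimited, and pthread_attr_setstacksize can set the size per thread.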

valgrind shows the following error:

That is

  • not an error
  • has nothing to do with your problem

Update2:

How do I check the size of each frame?

Given this:

Stack level 0, frame at 0xffffc248:
...
Stack level 1, frame at 0xffffc250:
...
Stack level 2, frame at 0xffffc2a0:

the size of frame #1 is 8 (0xffffc250 - 0xffffc248), frame #2 is 80, etc.
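
The subtraction above can be sketched as a small C helper. The function name frame_sizes and the array layout are assumptions for illustration; the addresses are copied by hand from the "frame at" lines of info frame, innermost frame first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: frame_at[i] is the "frame at" address that
 * info frame prints for stack level i (level 0 first).
 * Writes sizes[i] = frame_at[i + 1] - frame_at[i], which is the size of
 * frame #(i + 1), and returns the number of sizes written.
 * The size of frame #0 cannot be derived from these addresses alone. */
size_t frame_sizes(const uint64_t *frame_at, size_t n, uint64_t *sizes)
{
    if (n < 2)
        return 0;

    for (size_t i = 0; i + 1 < n; i++)
        sizes[i] = frame_at[i + 1] - frame_at[i];

    return n - 1;
}
```

Called with the three addresses above (0xffffc248, 0xffffc250, 0xffffc2a0), it yields 8 for frame #1 and 80 for frame #2.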

Final Update:

It turned out that my procedure above failed to measure the size of frame #0, which turned out to be ... 61MB! This was due to the presence of humongous local arrays (just as Grady Player correctly guessed).
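
The usual fix for humongous local arrays is to move them to the heap. The code of sequencer_run is not shown in the question, so the following is only a sketch with a made-up function, fill_and_sum, standing in for any routine that needs a large scratch table.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a function that used to declare a huge
 * local array such as "long long table[COUNT];" on the stack.
 * The table now lives on the heap, so the frame stays small.
 * Returns the sum 0 + 1 + ... + (count - 1), or -1 on allocation failure. */
long long fill_and_sum(size_t count)
{
    if (count == 0)
        return 0;

    /* Guard against overflow in the size computation below. */
    if (count > SIZE_MAX / sizeof(long long))
        return -1;

    long long *table = malloc(count * sizeof *table);
    if (table == NULL)
        return -1;

    long long total = 0;
    for (size_t i = 0; i < count; i++) {
        table[i] = (long long)i;
        total += table[i];
    }

    free(table);
    return total;
}
```

Unlike a stack overflow, a failed malloc can be detected and handled, and the allocation does not count against the thread's stack size.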

Author by

ceedee

Updated on March 20, 2020

Comments

  • ceedee
    ceedee about 4 years

    I get this error message when I run the program with gdb. The error is shown at this line:

    long a = thread_fake(); //in file1.c
    

    I was getting the problem with another function that was defined in a separate file, so I simplified it to a simple function that just returns 0. The function is defined as:

    long thread_fake(){ //defined in file2.c
        return 0;
    }
    

    As @EmployedRussian pointed out, it seems the program runs out of stack. Valgrind shows the following error:

    ==14711== 144 bytes in 1 blocks are possibly lost in loss record 17 of 32
    ==14711==    at 0x4025315: calloc (vg_replace_malloc.c:467)
    ==14711==    by 0x4010CD7: allocate_dtv (dl-tls.c:300)
    ==14711==    by 0x401146B: _dl_allocate_tls (dl-tls.c:464)
    ==14711==    by 0x40475C6: pthread_create@@GLIBC_2.1 (allocatestack.c:570)
    ==14711==    by 0x8050583: tm_main_startup 
    ==14711==    by 0x8048F6B: main (genome.c:201)
    ==14711== 144 bytes in 1 blocks are possibly lost in loss record 18 of 32
    ==14711==    at 0x4025315: calloc (vg_replace_malloc.c:467)
    ==14711==    by 0x4010CD7: allocate_dtv (dl-tls.c:300)
    ==14711==    by 0x401146B: _dl_allocate_tls (dl-tls.c:464)
    ==14711==    by 0x40475C6: pthread_create@@GLIBC_2.1 (allocatestack.c:570)
    ==14711==    by 0x804DFE3: thread_startup (thread.c:151)
    ==14711==    by 0x8048F73: main (genome.c:203)
    

    All the threads created are joined by a corresponding pthread_join call. Also, I tried the sgcheck tool, but it doesn't work on the platform 'x86-linux'. Please help.

    The complete output of bt command:

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x406e8b70 (LWP 19416)]
    sequencer_run (argPtr=0x89fce00) at sequencer.c:251
    251 a = thread_fake();
    (gdb) bt
    #0  sequencer_run (argPtr=0x89fce00) at sequencer.c:251
    #1  0x0804e306 in threadWait (argPtr=0x89dc1f4) at ../lib/thread.c:105
    #2  0x4003be99 in start_thread (arg=0x406e8b70) at pthread_create.c:304
    #3  0x40253cbe in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
    
    • n. m.
      n. m. over 10 years
      Have you declared the function in file1.c or in a header file included by it?
    • ceedee
      ceedee over 10 years
      @n.m. in a header file file2.h that is included by file1.c
    • ceedee
      ceedee over 10 years
      @JamesMcLaughlin no..
    • Kevin
      Kevin over 10 years
      No repro and nothing I see immediately, unless you forgot to declare the function. There's something you're omitting. Turn on all warnings (gcc -Wall -Wextra -pedantic) and show us the whole command line you use for compiling.
    • ceedee
      ceedee over 10 years
      @GradyPlayer No.No linker warnings..
    • us2012
      us2012 over 10 years
      You write that you get this error when running the program with gdb. Does it occur when you're running it outside of gdb?
    • Crowman
      Crowman over 10 years
      Post the shortest, complete, compilable example that you can which reproduces the error.
    • Mike Makuch
      Mike Makuch over 10 years
      Sounds like could be a stack/memory corruption issue, when the problem moves around...
  • Grady Player
    Grady Player over 10 years
    that is a good possibility... the op should look for something like int huge[23456678899]; and not do that anymore.
  • ceedee
    ceedee over 10 years
    @EmployedRussian Yes, it gives a call: "call 0x804e580 <thread_fake>". What could be the solution?
  • ceedee
    ceedee over 10 years
    @EmployedRussian One more thing I would like to mention: the program runs fine with a single thread, but ends up with SIGSEGV on 2 or more threads.
  • ceedee
    ceedee over 10 years
    @EmployedRussian I also checked with the where and info frame commands, but how do I check the size of each frame? It just shows something like:

    Stack level 0, frame at 0x406e8370:
     eip = 0x8049a3c in sequencer_run (sequencer.c:251); saved eip 0x804e476
     called by frame at 0x406e8390
     source language c.
     Arglist at 0x406e8368, args: argPtr=0x89fde00
     Locals at 0x406e8368, Previous frame's sp is 0x406e8370
     Saved registers:
      ebx at 0x406e835c, ebp at 0x406e8368, esi at 0x406e8360, edi at 0x406e8364, eip at 0x406e836c
  • ceedee
    ceedee over 10 years
    @EmployedRussian From the info you suggested, I get:
    1) Stack level 0, frame at 0x406e8370; called by frame at 0x406e8390
    2) Stack frame at 0x406e8390: called by frame at 0x406e8490, caller of frame at 0x406e8370
    3) Stack frame at 0x406e8490: called by frame at 0x0, caller of frame at 0x406e8390
    So from this info, I conclude that there are 3 frames: the first one starting at 0x406e8490, of size 256 (8490-8390); the second starting at 0x406e8390, of size 32 (8390-8370); and the third starting at 0x406e8370, whose size is not known. Please correct me if I am wrong. So is 256 too large?
  • ceedee
    ceedee over 10 years
    @EmployedRussian Moreover, how large is too large?
  • Employed Russian
    Employed Russian over 10 years
    @ceedee No, 256 is not too large (unless your thread stack is absurdly small). Please edit your question to show complete output of GDB bt command.
  • Employed Russian
    Employed Russian over 10 years
    @ceedee So you are running out of stack almost immediately, having consumed hardly any stack space. How exactly do you create this thread?
  • ceedee
    ceedee over 10 years