Valgrind hangs when profiling a multithreaded program

Maybe you can check the question "valgrind stalls in multithreaded socket program". Valgrind forces the application to run on a single core; I am not sure whether that causes problems in your case.
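One way single-core execution can look like a hang: Valgrind lets only one thread run at a time, so a thread that busy-waits for another thread's progress burns its whole time slice while the thread it is waiting for is not running. Whether the program in question does this is not known. The sketch below only illustrates the alternative of blocking on a condition variable instead of spinning; `start_gate`, `gate_wait`, `gate_open`, and `run_gated_workers` are hypothetical names, not part of the original code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_WORKERS 64

/* Hypothetical start gate: workers block here until main opens it. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    bool            open;
    int             passed;  /* workers that got through the gate */
} start_gate;

static void gate_init(start_gate *g) {
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cv, NULL);
    g->open = false;
    g->passed = 0;
}

/* Sleeps in pthread_cond_wait instead of spinning on a flag, so a
 * scheduler that runs one thread at a time can switch to the others. */
static void gate_wait(start_gate *g) {
    pthread_mutex_lock(&g->lock);
    while (!g->open)                      /* loop guards against spurious wakeups */
        pthread_cond_wait(&g->cv, &g->lock);
    g->passed++;
    pthread_mutex_unlock(&g->lock);
}

static void gate_open(start_gate *g) {
    pthread_mutex_lock(&g->lock);
    g->open = true;
    pthread_cond_broadcast(&g->cv);       /* wake every waiting worker */
    pthread_mutex_unlock(&g->lock);
}

static void *gated_worker(void *arg) {
    gate_wait((start_gate *)arg);
    return NULL;
}

/* Starts n workers, opens the gate, joins them; returns how many passed. */
static int run_gated_workers(int n) {
    pthread_t tid[MAX_WORKERS];
    start_gate g;
    int created = 0;

    assert(n > 0 && n <= MAX_WORKERS);
    gate_init(&g);
    for (int i = 0; i < n; i++) {
        if (pthread_create(&tid[created], NULL, gated_worker, &g) != 0)
            break;                        /* join only what was created */
        created++;
    }
    gate_open(&g);
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);

    int passed = g.passed;
    pthread_cond_destroy(&g.cv);
    pthread_mutex_destroy(&g.lock);
    return passed;
}
```

Calling `run_gated_workers(8)` from `main` mirrors the 8-thread setup described in the question. Because every waiter sleeps in the kernel, the result does not depend on how many cores the threads are allowed to use.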

Author: D. L. Kumar

Updated on June 09, 2022

Comments

  • D. L. Kumar
    D. L. Kumar almost 2 years

    I have a multithreaded program (implemented in C using Pthreads on Linux) that runs on a multicore machine. I am using Valgrind's Memcheck tool to find some memory issues in my code, but it hangs. To give a complete overview of the problem, here is the background.

    The code has a sequential part at the start for initialization; later it creates 8 threads (using the Pthread API) and runs to completion. My code dumps core after some time. I used GDB, and it gives the following trace.

    ======= Backtrace: =========  
    /lib/tls/i686/cmov/libc.so.6[0xb7cd47cd]  
    /lib/tls/i686/cmov/libc.so.6(cfree+0x90)[0xb7cd7e30]  
    /home/kumar/CycleSim/slack_cp/sim-outorder[0x819a6c9]  
    /home/kumar/CycleSim/slack_cp/sim-outorder[0x8167e3e]  
    /home/kumar/CycleSim/slack_cp/sim-outorder[0x804f5e4]  
    /lib/tls/i686/cmov/libpthread.so.0[0xb7f8c31b]  
    /lib/tls/i686/cmov/libc.so.6(clone+0x5e)[0xb7d3c57e]  
    ======= Memory map: ========  
    08048000-081b5000 r-xp 00000000 08:11 11813248   /home/kumar/CycleSim/slack_cp/sim-outorder  
    081b5000-081b8000 rw-p 0016c000 08:11 11813248   /home/kumar/CycleSim/slack_cp/sim-outorder  
    081b8000-08549000 rw-p 081b8000 00:00 0          [heap]  
    ab9fd000-ab9fe000 ---p ab9fd000 00:00 0  
    ab9fe000-ac1fe000 rw-p ab9fe000 00:00 0  
    ac1fe000-ac1ff000 ---p ac1fe000 00:00 0  
    ac1ff000-ac9ff000 rw-p ac1ff000 00:00 0  
    ac9ff000-aca00000 ---p ac9ff000 00:00 0  
    aca00000-ad2cb000 rw-p aca00000 00:00 0  
    ad2cb000-ad300000 ---p ad2cb000 00:00 0  
    ad3bf000-ad3c0000 ---p ad3bf000 00:00 0  
    ad3c0000-adbc0000 rw-p ad3c0000 00:00 0  
    adbc0000-adbc1000 ---p adbc0000 00:00 0  
    adbc1000-ae3c1000 rw-p adbc1000 00:00 0  
    ae3c1000-ae3c2000 ---p ae3c1000 00:00 0  
    ae3c2000-aebc2000 rw-p ae3c2000 00:00 0  
    aebc2000-aebc3000 ---p aebc2000 00:00 0  
    aebc3000-b2e7d000 rw-p aebc3000 00:00 0  
    b2e7d000-b2e7e000 ---p b2e7d000 00:00 0  
    b2e7e000-b367e000 rw-p b2e7e000 00:00 0  
    b367e000-b367f000 ---p b367e000 00:00 0  
    b367f000-b7c6d000 rw-p b367f000 00:00 0  
    b7c6d000-b7da8000 r-xp 00000000 08:01 12895490   /lib/tls/i686/cmov/libc-2.5.so  
    b7da8000-b7da9000 r--p 0013b000 08:01 12895490   /lib/tls/i686/cmov/libc-2.5.so  
    b7da9000-b7dab000 rw-p 0013c000 08:01 12895490   /lib/tls/i686/cmov/libc-2.5.so  
    b7dab000-b7dae000 rw-p b7dab000 00:00 0  
    b7dae000-b7dde000 r-xp 00000000 08:21 3828021    /usr/lib/libgslcblas.so.0.0.0  
    b7dde000-b7ddf000 rw-p 0002f000 08:21 3828021    /usr/lib/libgslcblas.so.0.0.0  
    b7ddf000-b7f7d000 r-xp 00000000 08:21 3828022    /usr/lib/libgsl.so.0.9.0  
    b7f7d000-b7f87000 rw-p 0019d000 08:21 3828022    /usr/lib/libgsl.so.0.9.0  
    b7f87000-b7f9a000 r-xp 00000000 08:01 12895516   /lib/tls/i686/cmov/libpthread-2.5.so  
    b7f9a000-b7f9c000 rw-p 00013000 08:01 12895516   /lib/tls/i686/cmov/libpthread-2.5.so  
    b7f9c000-b7f9f000 rw-p b7f9c000 00:00 0  
    b7f9f000-b7fc4000 r-xp 00000000 08:01 12895498   /lib/tls/i686/cmov/libm-2.5.so  
    b7fc4000-b7fc6000 rw-p 00024000 08:01 12895498   /lib/tls/i686/cmov/libm-2.5.so  
    b7fc9000-b7fd4000 r-xp 00000000 08:01 12861504   /lib/libgcc_s.so.1  
    b7fd4000-b7fd5000 rw-p 0000a000 08:01 12861504   /lib/libgcc_s.so.1  
    b7fd5000-b7fd9000 rw-p b7fd5000 00:00 0  
    b7fd9000-b7ff2000 r-xp 00000000 08:01 12861461   /lib/ld-2.5.so  
    b7ff2000-b7ff4000 rw-p 00019000 08:01 12861461   /lib/ld-2.5.so  
    bf8a0000-bf8b5000 rw-p bf8a0000 00:00 0          [stack]  
    ffffe000-fffff000 r-xp 00000000 00:00 0          [vdso]  
    

    Although I compiled with the -g option and no -O flags, the trace does not give the exact code location of the problem.

    After searching the internet, I understood that this happens because I am corrupting memory: either writing to an array out of bounds (yes, I am using a big array, but I explicitly check before accessing every element) or accessing illegal heap memory. The code is huge, so I could not figure it out just by looking at it, and I turned to Valgrind to see where the memory corruption happens.

    Under Valgrind the code runs fine through the sequential part, but when it reaches the parallel part (Pthread creation), it does nothing. With "top -H -p pid" I can see that all threads have been created, but they are all sleeping. The original code (without Valgrind) does not hang, even when run for a long time (though I cannot guarantee that it is deadlock free). Would Helgrind (Valgrind's thread error detector) be of any use here?

    Can anyone point me to documentation or a similar issue? I am using Valgrind version 2 on an i686 machine running Linux.

    Thanks, D. L. Kumar

  • D. L. Kumar
    D. L. Kumar about 12 years
    That's an interesting point, and I think this is very likely the issue. Thank you for pointing it out. I am explicitly mapping the threads onto the 8 cores of my host machine (handling the scheduling myself). Since Valgrind essentially emulates the x86 platform, maybe it can run on only one core.
  • D. L. Kumar
    D. L. Kumar about 12 years
    Now that I have made all threads run on a single core (just for debugging), it no longer hangs. But if that is the case, I wonder how useful these tools are: true concurrency issues show up only when the threads run on multiple cores. There is Helgrind, which is a thread concurrency checker; it at least should run in parallel.
  • Malkocoglu
    Malkocoglu about 12 years
    I remember using Valgrind on multithreaded programs with lots of live client connections, and it worked without a problem, just a little slowly...
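Since the hang went away once the threads were no longer spread across 8 cores, one practical option is to make the explicit pinning switchable, so the same binary can run unpinned under Valgrind and pinned otherwise. The following is a minimal sketch under stated assumptions: `SIM_NO_PIN` is a hypothetical environment variable (not a Valgrind feature), the function names are illustrative, and `pthread_setaffinity_np` is the GNU/Linux call assumed to be what the original code uses for pinning.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for pthread_setaffinity_np and CPU_SET */
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

#define NUM_WORKERS 8

/* SIM_NO_PIN is a hypothetical switch: set it to skip CPU pinning. */
static int pinning_enabled(void) {
    return getenv("SIM_NO_PIN") == NULL;
}

/* Pins the calling thread to one CPU; returns 0 or an error number. */
static int pin_self_to_cpu(int cpu) {
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

typedef struct {
    int id;    /* worker index, also used as the target CPU */
    int done;  /* set to 1 when the worker has finished */
} worker_arg;

static void *pinned_worker(void *p) {
    worker_arg *a = p;

    if (pinning_enabled())
        (void)pin_self_to_cpu(a->id);  /* best effort: a failed pin is ignored */
    /* ... the real per-thread work would go here ... */
    a->done = 1;
    return NULL;
}

/* Starts NUM_WORKERS threads, joins them, returns how many finished. */
static int run_workers(void) {
    pthread_t tid[NUM_WORKERS];
    worker_arg args[NUM_WORKERS];
    int created = 0, finished = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        args[i].id = i;
        args[i].done = 0;
        if (pthread_create(&tid[created], NULL, pinned_worker, &args[i]) != 0)
            break;                     /* join only what was created */
        created++;
    }
    for (int i = 0; i < created; i++) {
        pthread_join(tid[i], NULL);
        finished += args[i].done;
    }
    return finished;
}
```

With this in place, the program would be started with `SIM_NO_PIN=1` in the environment for a Valgrind run and without it for a normal run. Treating a failed pin as non-fatal is a design choice for the sketch; code that depends on the placement for correctness would need to check the return value.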