perf fails to work on really large process

5,854

I experienced something similar, but less severe.

In my case, I would get the CONFIG_PERF_EVENTS error maybe 10% of the time. The rest of the time perf record would complete without error, but it would fail to record any samples (looking at the output file with perf report would state "file has no samples", and perf report -D ... | grep -c RECORD_SAMPLE would confirm a count of zero).

Following the advice in this stackoverflow post, I added the -e cpu-clock parameter to perf record. This did not fix the CONFIG_PERF_EVENTS error, and perf record would still take 2x-4x as long as I specified via sleep - but at least it worked. Maybe it will help you, too.

Share:
5,854

Related videos on Youtube

Shlomi
Author by

Shlomi

Updated on September 18, 2022

Comments

  • Shlomi
    Shlomi over 1 year

    I use perf quite often to profile my applications. Recently I got a few new machine with ~750G RAM each. I am trying to profile a process that uses about 400G of ram on one of them. This works fine on all the new machines but one. They were all installed the same way, running ubuntu.

    I tried both perf as installed through apt-get and compiled my own from git.

    When running ./perf top -p 14182 I some time manage to get results (after a long wait, and very rarely) but most other times I am getting:

    The sys_perf_event_open() syscall returned with 3 (No such process) for event (cycles:pp).
    /bin/dmesg may provide additional information.
    No CONFIG_PERF_EVENTS=y kernel support configured?
    

    For the same pid(!).

    I get similar problems when trying to record, for instance:

    time ./perf record -F 111 -a -g -p 14182 -- sleep 3
    Warning:
    PID/TID switch overriding SYSTEMsleep: Terminated
    
    Command exited with non-zero status 255
    2.66user 91.58system 1:36.68elapsed 97%CPU (0avgtext+0avgdata 4896maxresident)k 
    0inputs+5248outputs (0major+4847minor)pagefaults 0swaps
    

    This command takes way longer than 3 seconds, I cannot ctrl+c it, and killing it requires -9. It emits some data into perf.data but when I try to view it, i get file has no samples.

    Everything works just fine on other (smaller) processes on this machine, and as I mentioned earlier this works great on the rest of the machines, for processes weighting about 200G. When I run the previous command on the other machines, it also uses much less cpu (33% rather than 97%).

    I am not sure where else to look, and google has not been very helpful for me. Any ideas or directions?

    Edit

    Seeing how system was high, I ran this with strace -c, this is what I get:

    Another machines with a process weighting ~200G:

    _ strace  -c perf record -F 111 -a -g -p 27879 -- sleep 3 
    
    Warning:
    PID/TID switch overriding SYSTEM[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.684 MB perf.data (~29876 samples) ]
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     91.43    2.548664         109     23481           read
      2.47    0.068712          11      6457        15 stat
      2.42    0.067392           9      7303           write
      0.68    0.018960          12      1646           getdents
      0.68    0.018841          22       847         1 mmap
    

    This machine with a process weighting ~400G:

    _ strace -c perf record -F 111 -a -g -p 14182 -- sleep 3                                                                                                                                                                                           /tmpvaknins9@snlp-brevis-3
    Warning:
    PID/TID switch overriding SYSTEMsleep: Terminated
    sleep: Terminated
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     99.70  131.243199         472    278178           read
      0.16    0.207757           8     25895           write
      0.04    0.048962           8      6398         5 stat
      0.02    0.031247          14      2217           mmap
      0.02    0.026385           9      3091           close
    

    It seems lots of time is being spent on reading /proc/tid/maps|status.. Perhaps my process is simply too large? Any clues how I could still get it to record this process?

    Thanks!