perf interrupt took too long but perf not being installed

kernel-panic perf

26,482

Solution 1

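This message comes from the Linux kernel itself, not from the userspace perf tool, which is why it shows up even though no perf package is installed. More precisely, it comes from the perf_duration_warn function in linux/kernel/events/core.c: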

static void perf_duration_warn(struct irq_work *w)
{
    printk_ratelimited(KERN_INFO
        "perf: interrupt took too long (%lld > %lld), lowering "
        "kernel.perf_event_max_sample_rate to %d\n",
        __report_avg, __report_allowed,
        sysctl_perf_event_sample_rate);
}

I don't know precisely what you mean by:

Is this a sign of an oncoming storm?

but I suspect problems with one of your devices.

P.S.: If you read carefully, you will see that the message in the code is "perf: interrupt took too long", whereas your message is "perf interrupt took too long". The colon was added in kernel version 4.6.
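If you want to track how far the kernel has lowered the rate, the numbers can be pulled out of the log line. This is only a rough sketch: parse_perf_warning is a name I made up, and the pattern assumes the wording shown above, with or without the colon. As far as I can tell from the kernel source, the first two numbers are sample durations in nanoseconds.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): extract the three numbers from
# a "perf: interrupt took too long" kernel log line.
# The colon after "perf" is optional, so both old and new wording match.
parse_perf_warning() {
    printf '%s\n' "$1" | sed -n \
        's/.*perf:\{0,1\} interrupt took too long (\([0-9]*\) > \([0-9]*\)), lowering kernel\.perf_event_max_sample_rate to \([0-9]*\).*/measured=\1 allowed=\2 new_rate=\3/p'
}

# The line quoted in the question:
parse_perf_warning 'perf interrupt took too long (2528 > 2500), lowering kernel.perf_event_max_sample_rate to 50000'
# prints: measured=2528 allowed=2500 new_rate=50000
```

On a live system the lines come from dmesg, and `sysctl kernel.perf_event_max_sample_rate` shows the value currently in effect.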

Solution 2

I've had a similar message for some time now on my desktop system. It shows up after one or sometimes several cores stall in uninterruptible disk I/O (D in ps) for minutes or longer. I suspect some race condition in I/O scheduling that leads to a deadlock, but I don't know how to debug this. Switching to the deadline scheduler instead of CFQ for the affected disk seems to help:

# echo deadline > /sys/block/sdX/queue/scheduler

I have observed short pauses in scheduling with that, but the second queue of the deadline scheduler seems to mitigate the long stall.
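Before and after switching, it is worth checking which scheduler is actually active. The scheduler file lists all available schedulers and puts the active one in square brackets. A small sketch for picking it out follows; active_scheduler is a name I made up, and the sample string is typed in by hand, not read from a real disk.

```shell
#!/bin/sh
# Hypothetical helper: print the active scheduler from the contents of
# /sys/block/<disk>/queue/scheduler, where the active entry is bracketed,
# e.g. "noop deadline [cfq]".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Hand-written sample content:
active_scheduler 'noop deadline [cfq]'
# prints: cfq
```

For a real disk, call it as `active_scheduler "$(cat /sys/block/sdX/queue/scheduler)"`. On newer kernels the available scheduler names may differ (for example mq-deadline), so check the file contents before echoing a name into it.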

If somebody could shed some more light on this, I'd also appreciate it.

Edit

I don't know if the rcu_sched errors/warnings are related, but it's quite possible. I don't get them, possibly because my kernel is configured differently.

When one core is stalled, what I see with ps is

$ ps axu | grep ' D'
dirk      4720 13.0  5.1 1615772 842444 pts/3  Dl+  07:27  24:54 iceweasel -P default

for the process that was doing the I/O. D means "uninterruptible sleep (usually I/O)" according to man ps.
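Note that grep ' D' matches " D" anywhere in the line, so it can also hit command arguments. Matching on the STAT column alone avoids that. Here is a sketch, assuming input in the shape produced by `ps -eo pid,stat,comm`; filter_d_state is a made-up name, and the second sample row is invented for contrast.

```shell
#!/bin/sh
# Hypothetical helper: keep the header line plus every row whose second
# column (STAT) starts with D, i.e. processes in uninterruptible sleep.
# Expects input shaped like the output of: ps -eo pid,stat,comm
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Sample input: first row taken from the ps output above, second row made up.
filter_d_state <<'EOF'
  PID STAT COMMAND
 4720 Dl+  iceweasel
 4801 Ss   bash
EOF
```

On a live system: `ps -eo pid,stat,comm | filter_d_state`.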

Martin B.

Updated on September 18, 2022

Comments

Martin B. almost 2 years
I just now checked my dmesg because my server starts to crash now and then. There I read the following line:
```
perf interrupt took too long (2528 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
```
which appears a couple of times.
I remember perf being a performance analysis tool, and I don't remember having installed it. So I checked:
```
~$ dpkg -l *perf*
dpkg-query: no packages found matching *perf*
```
My questions:
- Is this a sign of an oncoming storm? This line appears a few times, and then there are stack dumps starting with "rcu_sched detected stalls".
- Where do these come from?
Martin B. about 7 years

I mean: does the CPU stall that happens later announce itself by prolonging the perf interrupt duration?
Ortomala Lokni about 7 years

Difficult to say. Try to investigate by booting the system in rescue mode.
Martin B. about 7 years

I recently had some other problems where my support said to change my queue/scheduler to noop. Could this be related?
dirkt about 7 years

Maybe, depends on what the other problems were, what you told support, and what support said exactly. Links?
Setop over 6 years

Can you tell me how to make this configuration persist across a reboot, please?
dirkt over 6 years

@Setop Use whatever sysctl facility your distribution uses, e.g. /etc/sysctl.d/. Though I found out in the meantime that while the deadline scheduler helps, there are still hangups. Upgrading to a newer kernel didn't change anything. Did you run into the same problem?
Setop over 6 years

@dirkt, same here. It helps a bit, but I still get kernel freezes :(