linux watchdog and systemd watchdog

systemd watchdog


Solution 1

systemd's watchdog support covers three different actions:

hardware reset, using the hardware watchdog exposed at /dev/watchdog. This is enabled with the RuntimeWatchdogSec= option in /etc/systemd/system.conf
application reset, provided the service's unit file sets WatchdogSec= and the application sends systemd a keep-alive ping at regular intervals
system reset as a fallback when repeated application restarts do not help. This is also configured in the service's unit file

Example unit file:

[Unit]
Description=My Little Daemon
Documentation=man:mylittled(8)

[Service]
ExecStart=/usr/bin/mylittled
WatchdogSec=30s
Restart=on-failure
StartLimitInterval=5min
StartLimitBurst=4
StartLimitAction=reboot-force

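The example is taken from http://0pointer.de/blog/projects/watchdog.html, which gives a fairly complete overview of what the watchdog service can do and how to use it. Note that current systemd releases document the StartLimit* settings under [Unit], with StartLimitInterval= renamed to StartLimitIntervalSec=; the older spelling shown here is still accepted for compatibility.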
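The unit file only covers the systemd side; the application itself still has to send the keep-alive pings. Below is a minimal sketch of that side in Python, assuming the service is started by systemd with WatchdogSec= set, so that the NOTIFY_SOCKET and WATCHDOG_USEC environment variables are present. The helper names (sd_notify, watchdog_interval, run) and the do_work callback are illustrative and not part of any systemd API; a C program would normally call sd_notify(3) from libsystemd instead.

```python
import os
import socket
import time
from typing import Callable, Optional


def sd_notify(message: str) -> bool:
    """Send one datagram to systemd's notification socket.

    Returns False when NOTIFY_SOCKET is unset, i.e. when the process
    was not started by systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    addr = path
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket,
        # whose real address starts with a NUL byte.
        addr = b"\0" + path[1:].encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), addr)
    return True


def watchdog_interval() -> Optional[float]:
    """Return half of the WatchdogSec= timeout in seconds, or None if unset."""
    usec = os.environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    return int(usec) / 1_000_000 / 2


def run(do_work: Callable[[], bool], max_iterations: Optional[int] = None) -> None:
    """Call do_work() in a loop and ping systemd only while it reports success.

    do_work is a hypothetical callback that returns True when the
    application considers itself healthy. max_iterations exists only
    so the loop can be bounded in tests; a daemon would leave it None.
    """
    interval = watchdog_interval()
    done = 0
    while max_iterations is None or done < max_iterations:
        healthy = do_work()
        if healthy and interval is not None:
            sd_notify("WATCHDOG=1")
        # Without a watchdog timeout, fall back to an arbitrary 1 s cadence.
        time.sleep(interval if interval is not None else 1.0)
        done += 1
```

A daemon would call run(do_work) from its entry point. Pinging at half the timeout follows the recommendation in the linked article. If do_work stops returning True, or the process hangs, the pings stop, systemd treats the watchdog timeout as a failure, and Restart=on-failure restarts the service.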

Solution 2

The Linux watchdog daemon is best kept for system resets, although on persistent errors it can also run a "repair binary" that tries to fix or restart a process. Generally speaking, to monitor daemon processes and restart them, use the methods your init system (init/upstart/systemd) provides, as already answered, and reserve the watchdog daemon for the most serious "only a reboot is likely to fix things" situations.


LongLT

Updated on September 18, 2022

Comments

LongLT over 1 year

Is there any way to register an application with the systemd watchdog at runtime? I mean without using a systemd unit file, for example via a systemd API.

Is the Linux watchdog used for system reset only? Can it be used for application reset?

Weeve Ferrelaine about 4 years

An example of how to actually implement a software watchdog for application reset would be useful. The example unit file doesn't help at all out of context.

solr almost 3 years

Nice to know, thanks Paul.