Linux automatically restarting application on crash - Daemons
Solution 1
The gist of it is:
1. You need to detect whether the program is still running and not hung.
2. You need to (re)start the program if it is not running or is hung.
There are a number of different ways to do #1, but two that come to mind are:
- Listening on a UNIX domain socket to handle status requests. An external application can then ask whether the application is still OK. If it gets no response within some timeout period, it can assume that the application being queried is deadlocked or dead.
- Periodically touching a file at a preselected path. An external application can look at the file's timestamp, and if it is stale, it can assume that the application is dead or deadlocked.
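A minimal sketch of the heartbeat-file approach, written in Python for brevity. The path, the threshold, and the function names are hypothetical and not part of the original answer; the application would call touch_heartbeat() from its main loop, and a separate monitor would call is_stale().

```python
import os
import time

HEARTBEAT_PATH = "/tmp/myapp.heartbeat"  # hypothetical path, choose your own
STALE_AFTER_SECONDS = 10.0               # hypothetical threshold


def touch_heartbeat(path=HEARTBEAT_PATH):
    """Called periodically by the monitored application."""
    with open(path, "a"):
        pass                  # create the file if it does not exist yet
    os.utime(path, None)      # set access/modification time to "now"


def is_stale(path=HEARTBEAT_PATH, max_age=STALE_AFTER_SECONDS, now=None):
    """Called by the external monitor. A missing file counts as stale."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except FileNotFoundError:
        return True
    return (now - mtime) > max_age
```

The threshold should be comfortably longer than the interval between touches, otherwise a briefly busy application would be reported as dead.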
With respect to #2, killing the previous PID and using fork+exec to launch a new process is typical. You might also consider turning your "continuously" running application into one that runs once and exits, and then using cron or some other program to rerun that single-run application repeatedly.
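Unfortunately, watchdog timers and getting out of deadlock are non-trivial issues. I don't know of any generic way to do it, and the few that I've seen are pretty ugly and not 100% bug-free. However, tsan (ThreadSanitizer) can help detect potential deadlock scenarios and other threading issues such as data races. Note that it is a dynamic analysis tool: it instruments the program and reports problems on the code paths that are actually exercised at run time, so it is not static analysis.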
Solution 2
You can seamlessly restart your process as it dies with fork and waitpid, as described in this answer. It does not cost any significant resources, since the OS will share the memory pages.
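The fork and waitpid pattern can be sketched as follows. This is an illustration under assumptions, not the code from the linked answer: it is written in Python, whose os.fork, os.execvp, and os.waitpid wrap the POSIX calls of the same names, and the function names and the max_restarts parameter are hypothetical.

```python
import os


def run_once(argv):
    """Fork, exec argv in the child, and block in waitpid until it ends.

    Returns the exit status, or -N if the child was killed by signal N.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the target program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    return os.WEXITSTATUS(status)


def supervise(argv, max_restarts=None):
    """Rerun argv until it exits with status 0. Returns the restart count."""
    restarts = 0
    while True:
        if run_once(argv) == 0:
            return restarts
        if max_restarts is not None and restarts >= max_restarts:
            return restarts
        restarts += 1
```

A call such as supervise(["/path/to/program"]) blocks for as long as the program keeps running, and restarts it whenever it crashes or exits with a non-zero status. A real supervisor would probably also sleep briefly between restarts so that a program that fails immediately does not spin the CPU.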
Which leaves only the problem of detecting a hung process. You can use any of the solutions pointed out by Michael Aaron Safyan for this, but an even easier solution would be to call the alarm syscall repeatedly, having the signal terminate the process (use sigaction accordingly). Each call to alarm replaces the pending timer, so as long as you keep calling it (i.e. as long as your program is making progress) the process will keep running. Once you stop, the timer expires and the signal fires.
That way, no extra programs are needed, and only portable POSIX features are used.
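A minimal sketch of the alarm idea, again in Python, where signal.alarm wraps the POSIX alarm call and signal.signal plays the role of sigaction. The timeout value and the function names are hypothetical.

```python
import signal
import time

WATCHDOG_SECONDS = 5  # hypothetical; must exceed the longest healthy iteration


def arm_watchdog(seconds=WATCHDOG_SECONDS):
    """Arm (or re-arm) the timer; a new call replaces the pending alarm."""
    signal.alarm(seconds)


def run(work, iterations, seconds=WATCHDOG_SECONDS):
    """Call work() repeatedly, re-arming the watchdog before each call."""
    # Use the default SIGALRM action, which terminates the process.
    signal.signal(signal.SIGALRM, signal.SIG_DFL)
    for _ in range(iterations):
        arm_watchdog(seconds)
        work()       # if this hangs longer than `seconds`, SIGALRM kills us
    signal.alarm(0)  # disarm on clean shutdown
```

The alarm timer is per process, not per thread, so this only proves that the thread calling arm_watchdog() is alive. In a multi-threaded program, the thread that re-arms the timer would first have to confirm that the worker threads are still making progress. Combined with a parent that restarts the process when it dies, the killed process comes back automatically.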
Solution 3
Use this script to run your application:
#!/bin/bash
while ! /path/to/program  # Keep looping until the program exits successfully (status 0).
do
    echo "restarting"     # The program failed or crashed, so run it again.
done
You can also put this script in your /etc/init.d/ in order to start it as a daemon.
Solution 4
You could create a cron job that uses start-stop-daemon to check from time to time whether the process is running.
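As a rough illustration of what such a periodic check does, the sketch below tests whether the process recorded in a PID file still exists. It swaps in a plain PID-file check in Python instead of invoking start-stop-daemon, and it assumes the application writes its PID to a file; the function names are hypothetical.

```python
import os


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if missing or malformed."""
    try:
        with open(pidfile) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def is_running(pidfile):
    """True if the process named in pidfile currently exists."""
    pid = read_pid(pidfile)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)   # signal 0 sends nothing; it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True       # the process exists but belongs to another user
    return True
```

A check like this only detects a process that has died. It cannot tell a hung process from a healthy one, and a stale PID file may point at an unrelated process that reused the PID.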
user623879
Updated on June 07, 2022
Comments
-
user623879 almost 2 years
I have a system running embedded Linux, and it is critical that it runs continuously. Basically, it is a process for communicating with sensors and relaying that data to a database and a web client.
If a crash occurs, how do I restart the application automatically?
Also, there are several threads doing polling (e.g. sockets and UART communications). How do I ensure none of the threads get hung up or exit unexpectedly? Is there an easy-to-use watchdog that is threading friendly?
-
user623879 over 12 years
Any out-of-the-box daemons to watch daemons and restart them, haha?
-
Hasturkun over 12 years
On many embedded platforms you can have your watchdog daemon prod a hardware watchdog, ensuring the watchdog doesn't die.
-
Brooks Moses over 12 years
I'd like to second the suggestion of "Adjust your application so that it only runs once, and then rerun that single-run application repeatedly." If this is possible, it will significantly simplify the detection process.
-
Matthieu almost 6 years
And make sure you put at least one instruction between do and done.