Linux automatically restarting application on crash - Daemons

Solution 1

The gist of it is:

  1. You need to detect if the program is still running and not hung.
  2. You need to (re)start the program if the program is not running or is hung.

There are a number of different ways to do #1, but two that come to mind are:

  1. Listening on a UNIX domain socket, to handle status requests. An external application can then inquire as to whether the application is still ok. If it gets no response within some timeout period, then it can be assumed that the application being queried has deadlocked or is dead.

  2. Periodically touching a file with a preselected path. An external application can look at the timestamp of the file, and if it is stale, it can assume that the application is dead or deadlocked.
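The file-touching approach could be checked from a shell script along these lines. This is only a sketch: heartbeat_is_stale, the file path and the 30-second limit in the usage note are made-up names and values, and `stat -c %Y` assumes GNU coreutils or BusyBox stat.

```bash
#!/bin/bash
# heartbeat_is_stale FILE MAX_AGE   (hypothetical helper)
# Returns 0 (true) if FILE is missing or was last modified more than
# MAX_AGE seconds ago; returns 1 if the heartbeat is still fresh.
heartbeat_is_stale() {
    file=$1
    max_age=$2
    [ -e "$file" ] || return 0                 # no heartbeat file at all: treat as stale
    now=$(date +%s)                            # current time, seconds since the epoch
    mtime=$(stat -c %Y "$file") || return 0    # last modification time of the file
    [ $((now - mtime)) -gt "$max_age" ]
}
```

A watchdog could call it periodically, e.g. `heartbeat_is_stale /tmp/program.heartbeat 30 && echo "program looks hung"`, while the monitored application touches the same file from its main loop.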

With respect to step #2 (restarting the program), killing the previous PID and using fork+exec to launch a new process is typical. You might also consider turning your application that runs "continuously" into one that runs once, and then using cron or some other application to rerun that single-run application continuously.
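From a shell-based watchdog, the kill-and-relaunch step might look roughly like this. It is a sketch under assumptions: restart_program and the PID file are hypothetical, and a real setup would also have to guard against a stale PID file pointing at an unrelated process that has reused the PID.

```bash
#!/bin/bash
# restart_program PIDFILE COMMAND [ARGS...]   (hypothetical helper)
# Kills the process recorded in PIDFILE, if any, then launches COMMAND
# in the background and records the new PID.
restart_program() {
    pidfile=$1
    shift
    if [ -s "$pidfile" ]; then
        old_pid=$(cat "$pidfile")
        # SIGKILL cannot be caught or ignored, so it also stops a hung process.
        kill -KILL "$old_pid" 2>/dev/null || true
    fi
    "$@" &                       # the shell forks and execs the command for us
    echo "$!" > "$pidfile"       # remember the new PID for the next restart
}
```

It would be called as `restart_program /var/run/program.pid /path/to/program` whenever the detection step decides the program is dead or hung.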

Unfortunately, watchdog timers and getting out of deadlock are non-trivial issues. I don't know of any generic way to do it, and the few that I've seen are pretty ugly and not 100% bug-free. However, tsan (ThreadSanitizer) can help detect potential deadlock scenarios and other threading issues through dynamic analysis, i.e. by instrumenting the program and observing it while it runs.

Solution 2

You can seamlessly restart your process as it dies with fork and waitpid as described in this answer. It does not cost any significant resources, since the OS will share the memory pages.

That leaves only the problem of detecting a hung process. You can use any of the solutions pointed out by Michael Aaron Safyan for this, but an even easier solution would be to call the alarm syscall repeatedly and have the resulting signal (SIGALRM) terminate the process (use sigaction accordingly). Each call to alarm resets the timer, so as long as you keep calling it (i.e. as long as your program is making progress), the process keeps running. Once you stop, the signal fires.
That way, no extra programs are needed, and only portable POSIX facilities are used.

Solution 3

Use this script to run your application:

#!/bin/bash

# Keep looping until the program exits successfully (exit status 0).
while ! /path/to/program
do
    echo "restarting"    # Non-zero exit status: run it again.
done

You can also put this script in /etc/init.d/ in order to start it as a daemon.

Solution 4

You could create a cron job that uses start-stop-daemon to check from time to time whether the process is running, and to start it if it is not.
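As a rough sketch of that idea, assuming the Debian-style start-stop-daemon options (BusyBox offers short equivalents): ensure_running, the script name and the paths below are placeholders, not something from the original answer. With --start, start-stop-daemon does nothing if a matching process already exists, so running it repeatedly is harmless.

```bash
#!/bin/bash
# ensure_running DAEMON PIDFILE   (hypothetical helper)
# Starts DAEMON in the background unless it is already running.
ensure_running() {
    daemon=$1
    pidfile=$2
    # --oknodo: exit 0 even when nothing had to be started.
    # --background/--make-pidfile: detach the program and record its PID.
    start-stop-daemon --start --oknodo --background \
        --make-pidfile --pidfile "$pidfile" --exec "$daemon"
}
```

Saved as a script that ends with `ensure_running /path/to/program /var/run/program.pid`, it could be run every minute from a crontab entry such as `* * * * * /usr/local/bin/ensure_running.sh`. Note that cron often runs with a minimal PATH, so the full path to start-stop-daemon may be needed.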

Author by

user623879

Updated on June 07, 2022

Comments

  • user623879
    user623879 almost 2 years

    I have a system running embedded Linux and it is critical that it runs continuously. Basically, it is a process for communicating with sensors and relaying that data to a database and web client.

    If a crash occurs, how do I restart the application automatically?

    Also, there are several threads doing polling (e.g. sockets and UART communications). How do I ensure that none of the threads hang or exit unexpectedly? Is there an easy-to-use watchdog that is thread-friendly?

  • user623879
    user623879 over 12 years
    Are there any out-of-the-box daemons to watch daemons and restart them, haha?
  • Hasturkun
    Hasturkun over 12 years
    On many embedded platforms you can have your watchdog daemon prod a hardware watchdog, ensuring the watchdog doesn't die
  • Brooks Moses
    Brooks Moses over 12 years
    I'd like to second the suggestion of "Adjust your application so that it only runs once, and then rerun that single-run application repeatedly." If this is possible, it will significantly simplify the detection process.
  • Matthieu
    Matthieu almost 6 years
    And make sure you put at least one instruction between do and done.