`EINTR`: is there a rationale behind it?

8,690

Solution 1

It is difficult to do nontrivial things in a signal handler, since the rest of the program is in an unknown state. Most signal handlers just set a flag, which is later checked and handled elsewhere in the program.
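As a rough illustration of the "just set a flag" pattern, here is a minimal sketch in C. The names `got_signal`, `on_signal` and `install_flag_handler` are made up for this example; only `sigaction()` and `sig_atomic_t` come from POSIX/C.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Flag written by the handler and read by the main program.
 * volatile sig_atomic_t is the type C guarantees to be safe for this. */
static volatile sig_atomic_t got_signal = 0;

/* The handler does nothing except record that the signal arrived. */
static void on_signal(int signo) {
    (void)signo;
    got_signal = 1;
}

/* Hypothetical helper: install on_signal for signo WITHOUT SA_RESTART.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
static int install_flag_handler(int signo) {
    struct sigaction sa;
    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* no SA_RESTART: blocking calls may fail with EINTR */
    return sigaction(signo, &sa, NULL);
}
```

The main loop would then test `got_signal` at a convenient point and do the real work (cleanup, shutdown) outside the handler.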

Reason for not restarting the system call automatically:

Imagine an application that receives data from a socket with the blocking recv() system call. In this scenario, data arrives very slowly and the program spends a long time inside that call. The program has a SIGINT handler that sets a flag (which is evaluated elsewhere), and SA_RESTART is set so that the system call restarts automatically. Suppose the program is in recv(), waiting for data that does not arrive, and the user presses Ctrl-C. The system call is interrupted and the signal handler, which only sets the flag, runs. Then recv() is restarted and goes back to waiting for data. The event loop is stuck in recv() and has no opportunity to evaluate the flag and exit the program gracefully.

With SA_RESTART not set:

In the above scenario, when SA_RESTART is not set, recv() fails with EINTR instead of being restarted. The system call returns, so the program can continue. Of course, the program should then check the flag set by the signal handler as early as possible and clean up or do whatever else it needs to.
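The difference between the two cases can be reproduced with a small experiment. The sketch below is only an approximation of the scenario: it uses a pipe instead of a socket, read() instead of recv(), and SIGUSR1 sent by a child process instead of Ctrl-C. The names `blocked_read_demo`, `stop_requested` and `sleep_ms` are invented for this example, and the 200 ms delays are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t stop_requested = 0;

static void on_sigusr1(int signo) { (void)signo; stop_requested = 1; }

static void sleep_ms(long ms) {
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Block in read() on an empty pipe. A child process signals us after
 * about 200 ms and writes one byte only after another 200 ms.
 * sa_flags is 0 or SA_RESTART. Returns read()'s result; *err gets errno. */
static ssize_t blocked_read_demo(int sa_flags, int *err) {
    struct sigaction sa;
    int fds[2];
    char byte;
    ssize_t n;
    pid_t child;

    assert(err != NULL);
    stop_requested = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = sa_flags;
    if (sigaction(SIGUSR1, &sa, NULL) != 0 || pipe(fds) != 0) {
        *err = errno;
        return -1;
    }
    child = fork();
    if (child < 0) {
        *err = errno;
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        close(fds[0]);
        sleep_ms(200);
        kill(getppid(), SIGUSR1);      /* stands in for the user's Ctrl-C */
        sleep_ms(200);
        if (write(fds[1], "x", 1) != 1) /* the slow data finally arrives */
            _exit(1);
        _exit(0);
    }
    close(fds[1]);
    errno = 0;
    n = read(fds[0], &byte, 1);        /* the signal arrives while we block here */
    *err = errno;
    while (waitpid(child, NULL, 0) < 0 && errno == EINTR) { /* reap the child */ }
    close(fds[0]);
    return n;
}
```

Called with `0`, read() should come back early with -1 and EINTR, so the caller can look at `stop_requested` right away. Called with `SA_RESTART`, the handler still runs, but read() only returns once the byte arrives; if the data never came, the program would never get to check the flag.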

Solution 2

Richard Gabriel wrote a paper, The Rise of 'Worse is Better', which discusses the design choice made here in Unix:

Two famous people, one from MIT and another from Berkeley (but working on Unix) once met to discuss operating system issues. The person from MIT was knowledgeable about ITS (the MIT AI Lab operating system) and had been reading the Unix sources. He was interested in how Unix solved the PC loser-ing problem. The PC loser-ing problem occurs when a user program invokes a system routine to perform a lengthy operation that might have significant state, such as IO buffers. If an interrupt occurs during the operation, the state of the user program must be saved. Because the invocation of the system routine is usually a single instruction, the PC of the user program does not adequately capture the state of the process. The system routine must either back out or press forward. The right thing is to back out and restore the user program PC to the instruction that invoked the system routine so that resumption of the user program after the interrupt, for example, re-enters the system routine. It is called PC loser-ing because the PC is being coerced into loser mode, where 'loser' is the affectionate name for 'user' at MIT.

The MIT guy did not see any code that handled this case and asked the New Jersey guy how the problem was handled. The New Jersey guy said that the Unix folks were aware of the problem, but the solution was for the system routine to always finish, but sometimes an error code would be returned that signaled that the system routine had failed to complete its action. A correct user program, then, had to check the error code to determine whether to simply try the system routine again. The MIT guy did not like this solution because it was not the right thing.

The New Jersey guy said that the Unix solution was right because the design philosophy of Unix was simplicity and that the right thing was too complex. Besides, programmers could easily insert this extra test and loop. The MIT guy pointed out that the implementation was simple but the interface to the functionality was complex. The New Jersey guy said that the right tradeoff has been selected in Unix: namely, implementation simplicity was more important than interface simplicity.
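The "extra test and loop" mentioned in the quote usually looks something like the following sketch. `read_retry` is a hypothetical helper name, not a standard function; the same pattern applies to other interruptible calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call read() again whenever it fails with EINTR.
 * Any other result (data, end of file, a different error) is returned as is. */
static ssize_t read_retry(int fd, void *buf, size_t len) {
    ssize_t n;
    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

Note that a blind retry like this gives the same behaviour as SA_RESTART. A program that wants to react to a signal, as in Solution 1, would check its flag inside the loop before calling read() again.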


Hibou57
Author by

Hibou57


Updated on September 18, 2022

Comments

  • Hibou57
    Hibou57 almost 2 years

    Small talk as background

    EINTR is the error which so-called interruptible system calls may return. If a signal occurs while a system call is running, that signal is not ignored. If a signal handler was defined for it without SA_RESTART set and this handler handles that signal, then the system call will return the EINTR error code.

    As a side note, I got this error very often using ncurses in Python.

    The question

    Is there a rationale behind this behaviour specified by the POSIX standard? One can understand that it may not be possible to resume the call (depending on the kernel design), but what is the rationale for not restarting it automatically at the kernel level? Is this for legacy or technical reasons? If it is for technical reasons, are those reasons still valid nowadays? If it is for legacy reasons, then what is the history?

  • Hibou57
    Hibou57 over 8 years
    Another point of view which may be worth adding: skarnet.org/software/skalibs/libstddjb/safewrappers.html . In the end it says the same thing (although more implicitly), except that in your answer you assume there may be no time‑out at all.
  • Andrew Henle
    Andrew Henle over 8 years
    Even with SA_RESTART set, not all system calls are restarted automatically. For example, Linux does not restart msgsnd() or msgrcv().
  • G-Man Says 'Reinstate Monica'
    G-Man Says 'Reinstate Monica' over 6 years
    Two famous people walk into a bar — one from MIT, one from Berkeley, and one from New Jersey.  Huh? I realize that it’s a quote, but can you clarify it? The last paragraph is a bit muddled, too — the Unix solution was right because the right thing was too complex for Unix, and so they didn’t implement it.
  • Brad Schoening
    Brad Schoening over 6 years
    Minimum Viable Product is a related concept. The unix solution to use EINTR was viable and offered simplicity in the highly portable OS codebase. Delegated to user code, handling EINTR is easy (just retry), yet kind of bothersome.
  • Noobie
    Noobie over 3 years
    has no opportunity or has now opportunity ?
  • chaos
    chaos over 3 years
    @Noobie I wrote this a long time ago, but it should be "no opportunity".
  • PSkocik
    PSkocik about 3 years
    Unconditionally auto-restarting all long-blocking syscalls is the WRONG thing to do. User code should get a choice there on a call-by-call basis, and interrupting the syscalls provides that choice. Restart loops are trivial and can be put in a library. Alternatively, some or all potentially long-blocking calls could have a don't-interrupt-me flag, which would be somewhat more performance-friendly.