What exactly is a reentrant function?

93,802

Solution 1

1. How is safely defined?

Semantically. In this case, this is not a hard-defined term. It just mean "You can do that, without risk".

2. If a program can be safely executed concurrently, does it always mean that it is reentrant?

No.

For example, let's have a C++ function that takes both a lock, and a callback as a parameter:

#include <mutex>

typedef void (*callback)();
std::mutex m;

void foo(callback f)
{
    m.lock();
    // use the resource protected by the mutex

    if (f) {
        f();
    }

    // use the resource protected by the mutex
    m.unlock();
}

Another function could well need to lock the same mutex:

void bar()
{
    foo(nullptr);
}

At first sight, everything seems ok… But wait:

int main()
{
    foo(bar);
    return 0;
}

If the lock on mutex is not recursive, then here's what will happen, in the main thread:

  1. main will call foo.
  2. foo will acquire the lock.
  3. foo will call bar, which will call foo.
  4. the 2nd foo will try to acquire the lock, fail and wait for it to be released.
  5. Deadlock.
  6. Oops…

Ok, I cheated, using the callback thing. But it's easy to imagine more complex pieces of code having a similar effect.

3. What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities?

You can smell a problem if your function has/gives access to a modifiable persistent resource, or has/gives access to a function that smells.

(Ok, 99% of our code should smell, then… See last section to handle that…)

So, studying your code, one of those points should alert you:

  1. The function has a state (i.e. access a global variable, or even a class member variable)
  2. This function can be called by multiple threads, or could appear twice in the stack while the process is executing (i.e. the function could call itself, directly or indirectly). Function taking callbacks as parameters smell a lot.

Note that non-reentrancy is viral : A function that could call a possible non-reentrant function cannot be considered reentrant.

Note, too, that C++ methods smell because they have access to this, so you should study the code to be sure they have no funny interaction.

4.1. Are all recursive functions reentrant?

No.

In multithreaded cases, a recursive function accessing a shared resource could be called by multiple threads at the same moment, resulting in bad/corrupted data.

In singlethreaded cases, a recursive function could use a non-reentrant function (like the infamous strtok), or use global data without handling the fact the data is already in use. So your function is recursive because it calls itself directly or indirectly, but it can still be recursive-unsafe.

4.2. Are all thread-safe functions reentrant?

In the example above, I showed how an apparently threadsafe function was not reentrant. OK, I cheated because of the callback parameter. But then, there are multiple ways to deadlock a thread by having it acquire twice a non-recursive lock.

4.3. Are all recursive and thread-safe functions reentrant?

I would say "yes" if by "recursive" you mean "recursive-safe".

If you can guarantee that a function can be called simultaneously by multiple threads, and can call itself, directly or indirectly, without problems, then it is reentrant.

The problem is evaluating this guarantee… ^_^

5. Are the terms like reentrance and thread safety absolute at all, i.e. do they have fixed concrete definitions?

I believe they do, but then, evaluating a function is thread-safe or reentrant can be difficult. This is why I used the term smell above: You can find a function is not reentrant, but it could be difficult to be sure a complex piece of code is reentrant

6. An example

Let's say you have an object, with one method that needs to use a resource:

struct MyStruct
{
    P * p;

    void foo()
    {
        if (this->p == nullptr)
        {
            this->p = new P();
        }

        // lots of code, some using this->p

        if (this->p != nullptr)
        {
            delete this->p;
            this->p = nullptr;
        }
    }
};

The first problem is that if somehow this function is called recursively (i.e. this function calls itself, directly or indirectly), the code will probably crash, because this->p will be deleted at the end of the last call, and still probably be used before the end of the first call.

Thus, this code is not recursive-safe.

We could use a reference counter to correct this:

struct MyStruct
{
    size_t c;
    P * p;

    void foo()
    {
        if (c == 0)
        {
            this->p = new P();
        }

        ++c;
        // lots of code, some using this->p
        --c;

        if (c == 0)
        {
            delete this->p;
            this->p = nullptr;
        }
    }
};

This way, the code becomes recursive-safe… But it is still not reentrant because of multithreading issues: We must be sure the modifications of c and of p will be done atomically, using a recursive mutex (not all mutexes are recursive):

#include <mutex>

struct MyStruct
{
    std::recursive_mutex m;
    size_t c;
    P * p;

    void foo()
    {
        m.lock();

        if (c == 0)
        {
            this->p = new P();
        }

        ++c;
        m.unlock();
        // lots of code, some using this->p
        m.lock();
        --c;

        if (c == 0)
        {
            delete this->p;
            this->p = nullptr;
        }

        m.unlock();
    }
};

And of course, this all assumes the lots of code is itself reentrant, including the use of p.

And the code above is not even remotely exception-safe, but this is another story… ^_^

7. Hey 99% of our code is not reentrant!

It is quite true for spaghetti code. But if you partition correctly your code, you will avoid reentrancy problems.

7.1. Make sure all functions have NO state

They must only use the parameters, their own local variables, other functions without state, and return copies of the data if they return at all.

7.2. Make sure your object is "recursive-safe"

An object method has access to this, so it shares a state with all the methods of the same instance of the object.

So, make sure the object can be used at one point in the stack (i.e. calling method A), and then, at another point (i.e. calling method B), without corrupting the whole object. Design your object to make sure that upon exiting a method, the object is stable and correct (no dangling pointers, no contradicting member variables, etc.).

7.3. Make sure all your objects are correctly encapsulated

No one else should have access to their internal data:

    // bad
    int & MyObject::getCounter()
    {
        return this->counter;
    }

    // good
    int MyObject::getCounter()
    {
        return this->counter;
    }

    // good, too
    void MyObject::getCounter(int & p_counter)
    {
        p_counter = this->counter;
    }

Even returning a const reference could be dangerous if the user retrieves the address of the data, as some other portion of the code could modify it without the code holding the const reference being told.

7.4. Make sure the user knows your object is not thread-safe

Thus, the user is responsible to use mutexes to use an object shared between threads.

The objects from the STL are designed to be not thread-safe (because of performance issues), and thus, if a user want to share a std::string between two threads, the user must protect its access with concurrency primitives;

7.5. Make sure your thread-safe code is recursive-safe

This means using recursive mutexes if you believe the same resource can be used twice by the same thread.

Solution 2

"Safely" is defined exactly as the common sense dictates - it means "doing its thing correctly without interfering with other things". The six points you cite quite clearly express the requirements to achieve that.

The answers to your 3 questions is 3× "no".


Are all recursive functions reentrant?

NO!

Two simultaneous invocations of a recursive function can easily screw up each other, if they access the same global/static data, for example.


Are all thread-safe functions reentrant?

NO!

A function is thread-safe if it doesn't malfunction if called concurrently. But this can be achieved e.g. by using a mutex to block the execution of the second invocation until the first finishes, so only one invocation works at a time. Reentrancy means executing concurrently without interfering with other invocations.


Are all recursive and thread-safe functions reentrant?

NO!

See above.

Solution 3

The common thread:

Is the behavior well defined if the routine is called while it is interrupted?

If you have a function like this:

int add( int a , int b ) {
  return a + b;
}

Then it is not dependent upon any external state. The behavior is well defined.

If you have a function like this:

int add_to_global( int a ) {
  return gValue += a;
}

The result is not well defined on multiple threads. Information could be lost if the timing was just wrong.

The simplest form of a reentrant function is something that operates exclusively on the arguments passed and constant values. Anything else takes special handling or, often, is not reentrant. And of course the arguments must not reference mutable globals.

Solution 4

Now I have to elaborate on my previous comment. @paercebal answer is incorrect. In the example code didn't anyone notice that the mutex which as supposed to be parameter wasn't actually passed in?

I dispute the conclusion, I assert: for a function to be safe in the presence of concurrency it must be re-entrant. Therefore concurrent-safe (usually written thread-safe) implies re-entrant.

Neither thread safe nor re-entrant have anything to say about arguments: we're talking about concurrent execution of the function, which can still be unsafe if inappropriate parameters are used.

For example, memcpy() is thread-safe and re-entrant (usually). Obviously it will not work as expected if called with pointers to the same targets from two different threads. That's the point of the SGI definition, placing the onus on the client to ensure accesses to the same data structure are synchronised by the client.

It is important to understand that in general it is nonsense to have thread-safe operation include the parameters. If you've done any database programming you will understand. The concept of what is "atomic" and might be protected by a mutex or some other technique is necessarily a user concept: processing a transaction on a database can require multiple un-interrupted modifications. Who can say which ones need to be kept in sync but the client programmer?

The point is that "corruption" doesn't have to be messing up the memory on your computer with unserialised writes: corruption can still occur even if all individual operations are serialised. It follows that when you're asking if a function is thread-safe, or re-entrant, the question means for all appropriately separated arguments: using coupled arguments does not constitute a counter-example.

There are many programming systems out there: Ocaml is one, and I think Python as well, which have lots of non-reentrant code in them, but which uses a global lock to interleave thread acesss. These systems are not re-entrant and they're not thread-safe or concurrent-safe, they operate safely simply because they prevent concurrency globally.

A good example is malloc. It is not re-entrant and not thread-safe. This is because it has to access a global resource (the heap). Using locks doesn't make it safe: it's definitely not re-entrant. If the interface to malloc had be design properly it would be possible to make it re-entrant and thread-safe:

malloc(heap*, size_t);

Now it can be safe because it transfers the responsibility for serialising shared access to a single heap to the client. In particular no work is required if there are separate heap objects. If a common heap is used, the client has to serialise access. Using a lock inside the function is not enough: just consider a malloc locking a heap* and then a signal comes along and calls malloc on the same pointer: deadlock: the signal can't proceed, and the client can't either because it is interrupted.

Generally speaking, locks do not make things thread-safe .. they actually destroy safety by inappropriately trying to manage a resource that is owned by the client. Locking has to be done by the object manufacturer, thats the only code that knows how many objects are created and how they will be used.

Solution 5

The "common thread" (pun intended!?) amongst the points listed is that the function must not do anything that would affect the behaviour of any recursive or concurrent calls to the same function.

So for example static data is an issue because it is owned by all threads; if one call modifies a static variable the all threads use the modified data thus affecting their behaviour. Self modifying code (although rarely encountered, and in some cases prevented) would be a problem, because although there are multiple thread, there is only one copy of the code; the code is essential static data too.

Essentially to be re-entrant, each thread must be able to use the function as if it were the only user, and that is not the case if one thread can affect the behaviour of another in a non-deterministic manner. Primarily this involves each thread having either separate or constant data that the function works on.

All that said, point (1) is not necessarily true; for example, you might legitimately and by design use a static variable to retain a recursion count to guard against excessive recursion or to profile an algorithm.

A thread-safe function need not be reentrant; it may achieve thread safety by specifically preventing reentrancy with a lock, and point (6) says that such a function is not reentrant. Regarding point (6), a function that calls a thread-safe function that locks is not safe for use in recursion (it will dead-lock), and is therefore not said to be reentrant, though it may nonetheless safe for concurrency, and would still be re-entrant in the sense that multiple threads can have their program-counters in such a function simultaneously (just not with the locked region). May be this helps to distinguish thread-safety from reentarncy (or maybe adds to your confusion!).

Share:
93,802

Related videos on Youtube

Lazer
Author by

Lazer

Updated on October 24, 2021

Comments

  • Lazer
    Lazer over 2 years

    Most of the times, the definition of reentrance is quoted from Wikipedia:

    A computer program or routine is described as reentrant if it can be safely called again before its previous invocation has been completed (i.e it can be safely executed concurrently). To be reentrant, a computer program or routine:

    1. Must hold no static (or global) non-constant data.
    2. Must not return the address to static (or global) non-constant data.
    3. Must work only on the data provided to it by the caller.
    4. Must not rely on locks to singleton resources.
    5. Must not modify its own code (unless executing in its own unique thread storage)
    6. Must not call non-reentrant computer programs or routines.

    How is safely defined?

    If a program can be safely executed concurrently, does it always mean that it is reentrant?

    What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities?

    Also,

    1. Are all recursive functions reentrant?
    2. Are all thread-safe functions reentrant?
    3. Are all recursive and thread-safe functions reentrant?

    While writing this question, one thing comes to mind: Are the terms like reentrance and thread safety absolute at all i.e. do they have fixed concrete definitions? For, if they are not, this question is not very meaningful.

    • Admin
      Admin almost 14 years
      Actually, I disagree with #2 in the first list. You can return a address to whatever you like from a re-entrant function - the limitation is on what you do with that address in the calling code.
    • Robben_Ford_Fan_boy
      Robben_Ford_Fan_boy almost 14 years
      @Neil But as the writer of the reentrant function can't control what the caller surely surely they must not return an address to static (or global) non-constant data for it to be truely reentrant?
    • Admin
      Admin almost 14 years
      @drelihan It is not the responsibility of the writer of ANY function (reentrant or not) to control what a caller does with a returned value. They should certainly say what the caller CAN do with it, but if the caller chooses to do something else - tough luck to the caller.
  • Sharon
    Sharon almost 14 years
    To quibble a bit, I actually think in this case "safety" is defined - it means that the function will act only on variables provided - ie, it's shorthand for the definition quote below it. And point is that this might not imply other ideas of safety.
  • detly
    detly almost 14 years
    Did you miss passing in the mutex in the first example?
  • Yttrill
    Yttrill over 13 years
    @paercebal: your example is wrong. You do not actually need to bother with the callback, a simple recursion would have the same problem if there is one, however the only problem is you forgot to say exactly where the lock is allocated.
  • paercebal
    paercebal over 13 years
    @Yttrill: I assume you're talking about the first example. I used the "callback" because, by essence, a callback smells. Of course, a recursive function would have the same problem, but usually, one can easily analyse a function and its recursive nature, an thus, detect if it's reentrant or it's ok for recursivity. The callback, in the other hands, means that the author of the function calling the callback has no info whatsoever about what's the callback doing, so this author can find it difficult to make sure his/her function is reentrant. This is this difficulty I wanted to show.
  • paercebal
    paercebal over 13 years
    @Yttrill: As for the "however the only problem is you forgot to say exactly where the lock is allocated", I believe you missed the point (I did modify the code to add mutex declaration). Anyway, assuming the mutex was correctly initialized (if not, you have other problems than reentrancy) the first code is Ok as long as it won't be called again in the same stack OR the mutex is recursive. But if the mutex is NOT recursive, AND the function called is recursively called (directly, or through a callback, as in the example), then you're screwed, and the second lock will stop the stack's execution.
  • Iulius Curt
    Iulius Curt almost 12 years
    It's quite a dilemma here: the function calls itself. A function is not reentrant if it calls nonreentrant functions. So it is reentrant if itself is reentrant and is not if itself is not. Recursive magic.
  • Aquarius_Girl
    Aquarius_Girl almost 12 years
    "Ok, I cheated, using the Callback thing. " What made you say this? Isn't using callback a valid act?
  • paercebal
    paercebal almost 12 years
    @Anisha Kaul : Isn't using callback a valid act? Yes it is, even if it is a smelly one (giving a callback, lambda, etc. as a parameter is somewhat giving the control on the stack to the user of your code, meaning that this user could screw it up without you being able to do anything. My "cheat" was more about the oversimplicity of my example, which is bordersome absurd (calling a function A, and giving it A as a callback)... :-) ... I see little code in real life doing that... :-)
  • supercat
    supercat almost 11 years
    If a function acquires a lock during some parts of its operation, and calls itself recursively during times it does not hold the lock, would not such a function be thread-safe and recursive but not reentrant?
  • paercebal
    paercebal about 6 years
    @Gab是好人 : I corrected the first example. Thanks! A signal handler would come with its own problems, distinct from reentrancy, as usually, when a signal is raised, you can't really do anything beyond changing a specifically declared global variable.
  • Maggyero
    Maggyero almost 6 years
    "Therefore concurrent-safe (usually written thread-safe) implies re-entrant." This contradicts the "Thread-safe but not reentrant" Wikipedia example.
  • Maggyero
    Maggyero almost 6 years
    Two remarks: 1. "This way, the code becomes recursive-safe… But it is still not reentrant because of multithreading issues" So you are stating that reentrancy implies thread-safety, which contradicts the "Reentrant but not thread-safe" Wikipedia example. 2. Your last example with the recursive lock is multithread-safe but not single thread-safe, that is it is not signal-safe, which is the Wikipedia definition for reentrancy. So have you a reference?
  • Daniel
    Daniel over 2 years
    Why should most thread-safe code be re-entrant? If it is made thread-safe by synchronization (using mutex) then this cannot be re-entrant because locking the same mutex twice is a deadlock and in general the mutex is a global variable anyways.