Locking C++11 std::unique_lock causes deadlock exception

13,311

Foreword: An important thing to understand with condition variables is that they can be subject to random, spurious wake ups. In other words, a CV can exit from wait() without anyone having called notify_*() first. Unfortunately there is no way to distinguish such a spurious wake up from a legitimate one, so the only solution is to have an additional resource (at the very least a boolean) so that you can tell whether the wake up condition is actually met.

This additional resource should be guarded by a mutex too, usually the very same you use as a companion for the CV.


The typical usage of a CV/mutex pair is as follows:

std::mutex mutex;
std::condition_variable cv;
Resource resource;

void produce() {
    // note how the lock only protects the resource, not the notify() call
    // in practice this makes little difference, you just get to release the
    // lock a bit earlier which slightly improves concurrency
    {
        std::lock_guard<std::mutex> lock(mutex); // use the lightweight lock_guard
        make_ready(resource);
    }
    // the point is: notify_*() don't require a locked mutex
    cv.notify_one(); // or notify_all()
}

void consume() {
    std::unique_lock<std::mutex> lock(mutex);
    while (!is_ready(resource))
        cv.wait(lock);
    // note how the lock still protects the resource, in order to exclude other threads
    use(resource);
}

Compared to your code, notice how several threads can call produce()/consume() simultaneously without worrying about a shared unique_lock: the only shared things are mutex/cv/resource and each thread gets its own unique_lock that forces the thread to wait its turn if the mutex is already locked by something else.

As you can see, the resource can't really be separated from the CV/mutex pair, which is why I said in a comment that your wrapper class wasn't really fitting IMHO, since it indeed tries to separate them.

The usual approach is not to make a wrapper for the CV/mutex pair as you tried to, but for the whole CV/mutex/resource trio. Eg. a thread-safe message queue where the consumer threads will wait on the CV until the queue has messages ready to be consumed.


If you really want to wrap just the CV/mutex pair, you should get rid of your lock()/release() methods which are unsafe (from a RAII point of view) and replace them with a single lock() method returning a unique_ptr:

std::unique_ptr<std::mutex> lock() {
    return std::unique_ptr<std::mutex>(cMutex);
}

This way you can use your Cond wrapper class in rather the same way as what I showed above:

Cond cond;
Resource resource;

void produce() {
    {
        auto lock = cond.lock();
        make_ready(resource);
    }
    cond.notify(); // or notifyAll()
}

void consume() {
    auto lock = cond.lock();
    while (!is_ready(resource))
        cond.wait(lock);
    use(resource);
}

But honestly I'm not sure it's worth the trouble: what if you want to use a recursive_mutex instead of a plain mutex? Well, you'd have to make a template out of your class so that you can choose the mutex type (or write a second class altogether, yay for code duplication). And anyway you don't gain much since you still have to write pretty much the same code in order to manage the resource. A wrapper class only for the CV/mutex pair is too thin a wrapper to be really useful IMHO. But as usual, YMMV.

Share:
13,311
realh
Author by

realh

Updated on June 21, 2022

Comments

  • realh
    realh almost 2 years

    I'm trying to use a C++11 std::condition_variable, but when I try to lock the unique_lock associated with it from a second thread I get an exception "Resource deadlock avoided". The thread that created it can lock and unlock it, but not the second thread, even though I'm pretty sure the unique_lock shouldn't be locked already at the point the second thread tries to lock it.

    FWIW I'm using gcc 4.8.1 in Linux with -std=gnu++11.

    I've written a wrapper class around the condition_variable, unique_lock and mutex, so nothing else in my code has direct access to them. Note the use of std::defer_lock, I already fell in to that trap :-).

    class Cond {
    private:
        std::condition_variable cCond;
        std::mutex cMutex;
        std::unique_lock<std::mutex> cULock;
    public:
        Cond() : cULock(cMutex, std::defer_lock)
        {}
    
        void wait()
        {
            std::ostringstream id;
            id << std::this_thread::get_id();
            H_LOG_D("Cond %p waiting in thread %s", this, id.str().c_str());
            cCond.wait(cULock);
            H_LOG_D("Cond %p woke up in thread %s", this, id.str().c_str());
        }
    
        // Returns false on timeout
        bool waitTimeout(unsigned int ms)
        {
            std::ostringstream id;
            id << std::this_thread::get_id();
            H_LOG_D("Cond %p waiting (timed) in thread %s", this, id.str().c_str());
            bool result = cCond.wait_for(cULock, std::chrono::milliseconds(ms))
                    == std::cv_status::no_timeout;
            H_LOG_D("Cond %p woke up in thread %s", this, id.str().c_str());
            return result;
        }
    
        void notify()
        {
            cCond.notify_one();
        }
    
        void notifyAll()
        {
            cCond.notify_all();
        }
    
        void lock()
        {
            std::ostringstream id;
            id << std::this_thread::get_id();
            H_LOG_D("Locking Cond %p in thread %s", this, id.str().c_str());
            cULock.lock();
        }
    
        void release()
        {
            std::ostringstream id;
            id << std::this_thread::get_id();
            H_LOG_D("Releasing Cond %p in thread %s", this, id.str().c_str());
            cULock.unlock();
        }
    };
    

    My main thread creates a RenderContext, which has a thread associated with it. From the main thread's point of view, it uses the Cond to signal the rendering thread to perform an action and can also wait on the COnd for the rendering thread to complete that action. The rendering thread waits on the Cond for the main thread to send rendering requests, and uses the same Cond to tell the main thread it's completed an action if necessary. The error I'm getting occurs when the rendering thread tries to lock the Cond to check/wait for render requests, at which point it shouldn't be locked at all (because the main thread is waiting on it), let alone by the same thread. Here's the output:

    DEBUG: Created window
    DEBUG: OpenGL 3.0 Mesa 9.1.4, GLSL 1.30
    DEBUG: setScreen locking from thread 140564696819520
    DEBUG: Locking Cond 0x13ec1e0 in thread 140564696819520
    DEBUG: Releasing Cond 0x13ec1e0 in thread 140564696819520
    DEBUG: Entering GLFW main loop
    DEBUG: requestRender locking from thread 140564696819520
    DEBUG: Locking Cond 0x13ec1e0 in thread 140564696819520
    DEBUG: requestRender waiting
    DEBUG: Cond 0x13ec1e0 waiting in thread 140564696819520
    DEBUG: Running thread 'RenderThread' with id 140564575180544
    DEBUG: render thread::run locking from thread 140564575180544
    DEBUG: Locking Cond 0x13ec1e0 in thread 140564575180544
    terminate called after throwing an instance of 'std::system_error'
      what():  Resource deadlock avoided
    

    To be honest I don't really understand what a unique_lock is for and why condition_variable needs one instead of using a mutex directly, so that's probably the cause of the problem. I can't find a good explanation of it online.

  • realh
    realh almost 11 years
    Thanks for the detailed answer. But don't you have to pass a reference to your unique_lock when calling cv.wait()?
  • syam
    syam almost 11 years
    Whoops you're right, looks like I got a bit carried away. :) Fixing this right now.
  • realh
    realh almost 11 years
    The reason I used a wrapper class is because I'm writing some portable framework code. I was originally going to use SDL's threading API on Windows and Linux, and in Android either leave all thread management to Java or use pthreads. I wasn't 100% confident that C++11 support is stable on every platform I might like to support, but hopefully it should be OK.
  • realh
    realh almost 11 years
    I've already switched to RAII, but wrapped the API a different way. I used typedef std::unique_lock<std::mutex> CondLock and added a cast operator from Cond to std::mutex & so I can create a CondLock with a Cond as its constructor argument. I think your idea of using a method in Cond to create a CondLock is more elegant, so I'll probably change that.
  • syam
    syam almost 11 years
    Yeah I can understand your concern for C++11 support on various platforms. I'm lucky enough to work only with GCC nowadays, but even then I'm held back (can't get a 4.8 cross-compiler to work so I'm actually stuck with 4.7). Writing proper, standard-conformant cross-platform C++11's gotta be a pain right now, until the dust settles down...
  • syam
    syam almost 11 years
    As to your CondLock class being able to be constructed from a Cond, or Cond returning a CondLock from a lock() method, the only real difference is that with the latter you can use auto which is nice syntactic sugar.
  • realh
    realh almost 11 years
    By the way, the last time I used threading with a CV was in Java, which does require an object to be "synchronized" (locked) before calling notify*(), so that's probably what made me assume that C++11 CVs had the same requirement. Python also requires that notify() only be called while locked. According to an online Linux man page pthread_cond_signal/broadcast don't require locking but it's recommended. The man page on my own PC (Debian unstable) doesn't explicitly say, but advises you should lock to avoid the possibility of a waiting thread missing a notify between locking and calling wait.
  • realh
    realh almost 11 years
  • Maverik
    Maverik over 10 years
    @syam Your solution is fine and works great. Referring to the first snippet of code, what if I have two produce functions (for two different input types), I'm implementing a mechanism for avoiding starvation of the reader and I want to avoid duplication of code? In other words, I would like to have a function which takes the lock and wait on a condition if a reader is ready for reading. The two produce functions call this function. This required to have the lock as a class member.
  • syam
    syam over 10 years
    @Maverik I don't think comments are the right place to reply to your question, since it is quite specific. You'd better make a new question about it so that you can explain in more detail and get adequate answers.