Mutex example / tutorial?

287,188

Solution 1

Here goes my humble attempt to explain the concept to newbies around the world: (a color coded version on my blog too)

A lot of people run to a lone phone booth (they don't have mobile phones) to talk to their loved ones. The first person to catch the door-handle of the booth, is the one who is allowed to use the phone. He has to keep holding on to the handle of the door as long as he uses the phone, otherwise someone else will catch hold of the handle, throw him out and talk to his wife :) There's no queue system as such. When the person finishes his call, comes out of the booth and leaves the door handle, the next person to get hold of the door handle will be allowed to use the phone.

A thread is : Each person
The mutex is : The door handle
The lock is : The person's hand
The resource is : The phone

Any thread which has to execute some lines of code which should not be modified by other threads at the same time (using the phone to talk to his wife), has to first acquire a lock on a mutex (clutching the door handle of the booth). Only then will a thread be able to run those lines of code (making the phone call).

Once the thread has executed that code, it should release the lock on the mutex so that another thread can acquire a lock on the mutex (other people being able to access the phone booth).

[The concept of having a mutex is a bit absurd when considering real-world exclusive access, but in the programming world I guess there was no other way to let the other threads 'see' that a thread was already executing some lines of code. There are concepts of recursive mutexes etc, but this example was only meant to show you the basic concept. Hope the example gives you a clear picture of the concept.]

With C++11 threading:

#include <iostream>
#include <thread>
#include <mutex>

std::mutex m;//you can use std::lock_guard if you want to be exception safe
int i = 0;

void makeACallFromPhoneBooth() 
{
    m.lock();//man gets a hold of the phone booth door and locks it. The other men wait outside
      //man happily talks to his wife from now....
      std::cout << i << " Hello Wife" << std::endl;
      i++;//no other thread can access variable i until m.unlock() is called
      //...until now, with no interruption from other men
    m.unlock();//man lets go of the door handle and unlocks the door
}

int main() 
{
    //This is the main crowd of people uninterested in making a phone call

    //man1 leaves the crowd to go to the phone booth
    std::thread man1(makeACallFromPhoneBooth);
    //Although man2 appears to start second, there's a good chance he might
    //reach the phone booth before man1
    std::thread man2(makeACallFromPhoneBooth);
    //And hey, man3 also joined the race to the booth
    std::thread man3(makeACallFromPhoneBooth);

    man1.join();//man1 finished his phone call and joins the crowd
    man2.join();//man2 finished his phone call and joins the crowd
    man3.join();//man3 finished his phone call and joins the crowd
    return 0;
}

Compile and run using g++ -std=c++0x -pthread -o thread thread.cpp;./thread

Instead of explicitly using lock and unlock, you can use brackets as shown here, if you are using a scoped lock for the advantage it provides. Scoped locks have a slight performance overhead though.

Solution 2

While a mutex may be used to solve other problems, the primary reason they exist is to provide mutual exclusion and thereby solve what is known as a race condition. When two (or more) threads or processes are attempting to access the same variable concurrently, we have potential for a race condition. Consider the following code

//somewhere long ago, we have i declared as int
void my_concurrently_called_function()
{
  i++;
}

The internals of this function look so simple. It's only one statement. However, a typical pseudo-assembly language equivalent might be:

load i from memory into a register
add 1 to i
store i back into memory

Because the equivalent assembly-language instructions are all required to perform the increment operation on i, we say that incrementing i is a non-atmoic operation. An atomic operation is one that can be completed on the hardware with a gurantee of not being interrupted once the instruction execution has begun. Incrementing i consists of a chain of 3 atomic instructions. In a concurrent system where several threads are calling the function, problems arise when a thread reads or writes at the wrong time. Imagine we have two threads running simultaneoulsy and one calls the function immediately after the other. Let's also say that we have i initialized to 0. Also assume that we have plenty of registers and that the two threads are using completely different registers, so there will be no collisions. The actual timing of these events may be:

thread 1 load 0 into register from memory corresponding to i //register is currently 0
thread 1 add 1 to a register //register is now 1, but not memory is 0
thread 2 load 0 into register from memory corresponding to i
thread 2 add 1 to a register //register is now 1, but not memory is 0
thread 1 write register to memory //memory is now 1
thread 2 write register to memory //memory is now 1

What's happened is that we have two threads incrementing i concurrently, our function gets called twice, but the outcome is inconsistent with that fact. It looks like the function was only called once. This is because the atomicity is "broken" at the machine level, meaning threads can interrupt each other or work together at the wrong times.

We need a mechanism to solve this. We need to impose some ordering to the instructions above. One common mechanism is to block all threads except one. Pthread mutex uses this mechanism.

Any thread which has to execute some lines of code which may unsafely modify shared values by other threads at the same time (using the phone to talk to his wife), should first be made acquire a lock on a mutex. In this way, any thread that requires access to the shared data must pass through the mutex lock. Only then will a thread be able to execute the code. This section of code is called a critical section.

Once the thread has executed the critical section, it should release the lock on the mutex so that another thread can acquire a lock on the mutex.

The concept of having a mutex seems a bit odd when considering humans seeking exclusive access to real, physical objects but when programming, we must be intentional. Concurrent threads and processes don't have the social and cultural upbringing that we do, so we must force them to share data nicely.

So technically speaking, how does a mutex work? Doesn't it suffer from the same race conditions that we mentioned earlier? Isn't pthread_mutex_lock() a bit more complex that a simple increment of a variable?

Technically speaking, we need some hardware support to help us out. The hardware designers give us machine instructions that do more than one thing but are guranteed to be atomic. A classic example of such an instruction is the test-and-set (TAS). When trying to acquire a lock on a resource, we might use the TAS might check to see if a value in memory is 0. If it is, that would be our signal that the resource is in use and we do nothing (or more accurately, we wait by some mechanism. A pthreads mutex will put us into a special queue in the operating system and will notify us when the resource becomes available. Dumber systems may require us to do a tight spin loop, testing the condition over and over). If the value in memory is not 0, the TAS sets the location to something other than 0 without using any other instructions. It's like combining two assembly instructions into 1 to give us atomicity. Thus, testing and changing the value (if changing is appropriate) cannot be interrupted once it has begun. We can build mutexes on top of such an instruction.

Note: some sections may appear similar to an earlier answer. I accepted his invite to edit, he preferred the original way it was, so I'm keeping what I had which is infused with a little bit of his verbiage.

Solution 3

The best threads tutorial I know of is here:

https://computing.llnl.gov/tutorials/pthreads/

I like that it's written about the API, rather than about a particular implementation, and it gives some nice simple examples to help you understand synchronization.

Solution 4

I stumbled upon this post recently and think that it needs an updated solution for the standard library's c++11 mutex (namely std::mutex).

I've pasted some code below (my first steps with a mutex - I learned concurrency on win32 with HANDLE, SetEvent, WaitForMultipleObjects etc).

Since it's my first attempt with std::mutex and friends, I'd love to see comments, suggestions and improvements!

#include <condition_variable>
#include <mutex>
#include <algorithm>
#include <thread>
#include <queue>
#include <chrono>
#include <iostream>


int _tmain(int argc, _TCHAR* argv[])
{   
    // these vars are shared among the following threads
    std::queue<unsigned int>    nNumbers;

    std::mutex                  mtxQueue;
    std::condition_variable     cvQueue;
    bool                        m_bQueueLocked = false;

    std::mutex                  mtxQuit;
    std::condition_variable     cvQuit;
    bool                        m_bQuit = false;


    std::thread thrQuit(
        [&]()
        {
            using namespace std;            

            this_thread::sleep_for(chrono::seconds(5));

            // set event by setting the bool variable to true
            // then notifying via the condition variable
            m_bQuit = true;
            cvQuit.notify_all();
        }
    );


    std::thread thrProducer(
        [&]()
        {
            using namespace std;

            int nNum = 13;
            unique_lock<mutex> lock( mtxQuit );

            while ( ! m_bQuit )
            {
                while( cvQuit.wait_for( lock, chrono::milliseconds(75) ) == cv_status::timeout )
                {
                    nNum = nNum + 13 / 2;

                    unique_lock<mutex> qLock(mtxQueue);
                    cout << "Produced: " << nNum << "\n";
                    nNumbers.push( nNum );
                }
            }
        }   
    );

    std::thread thrConsumer(
        [&]()
        {
            using namespace std;
            unique_lock<mutex> lock(mtxQuit);

            while( cvQuit.wait_for(lock, chrono::milliseconds(150)) == cv_status::timeout )
            {
                unique_lock<mutex> qLock(mtxQueue);
                if( nNumbers.size() > 0 )
                {
                    cout << "Consumed: " << nNumbers.front() << "\n";
                    nNumbers.pop();
                }               
            }
        }
    );

    thrQuit.join();
    thrProducer.join();
    thrConsumer.join();

    return 0;
}

Solution 5

The function pthread_mutex_lock() either acquires the mutex for the calling thread or blocks the thread until the mutex can be acquired. The related pthread_mutex_unlock() releases the mutex.

Think of the mutex as a queue; every thread that attempts to acquire the mutex will be placed on the end of the queue. When a thread releases the mutex, the next thread in the queue comes off and is now running.

A critical section refers to a region of code where non-determinism is possible. Often this because multiple threads are attempting to access a shared variable. The critical section is not safe until some sort of synchronization is in place. A mutex lock is one form of synchronization.

Share:
287,188
Nav
Author by

Nav

Updated on July 08, 2022

Comments

  • Nav
    Nav almost 2 years

    I'm new to multithreading, and was trying to understand how mutexes work. Did a lot of Googling but it still left some doubts of how it works because I created my own program in which locking didn't work.

    One absolutely non-intuitive syntax of the mutex is pthread_mutex_lock( &mutex1 );, where it looks like the mutex is being locked, when what I really want to lock is some other variable. Does this syntax mean that locking a mutex locks a region of code until the mutex is unlocked? Then how do threads know that the region is locked? [UPDATE: Threads know that the region is locked, by Memory Fencing ]. And isn't such a phenomenon supposed to be called critical section? [UPDATE: Critical section objects are available in Windows only, where the objects are faster than mutexes and are visible only to the thread which implements it. Otherwise, critical section just refers to the area of code protected by a mutex]

    In short, could you please help with the simplest possible mutex example program and the simplest possible explanation on the logic of how it works? I'm sure this will help plenty of other newbies.

    • Nav
      Nav about 13 years
    • San Jacinto
      San Jacinto about 13 years
      I don't mean this offensively, but what your last comment suggests to me is that we need less analogies and a better technical explanation of how a mutex works and why we need them.
    • Nav
      Nav about 13 years
      @San: No offence taken :) My comments were only meant to suggest that a newbie could get the shortest, clearest explanation of mutexes. Many analogies could get confusing for the newbie, so different analogies should be kept separately. The whole reason for me posting the ques and ans is because as a newbie, I found it a pain to read through long explanations and code samples. I wouldn't want anyone else to go through the pain.
    • Nav
      Nav over 10 years
      @Cory: If this answer could be improved, I'd be happy to take your suggestions. I'm just happy that a lot of other people have found it helpful. If it didn't help you, there are answers from other people too who have pointed to other mutex tutorials. Why be so negative?
  • Arsen Mkrtchyan
    Arsen Mkrtchyan about 13 years
    Is it guaranteed that exactly the next attempting thread will enter?
  • chrisaycock
    chrisaycock about 13 years
    @Arsen No guarantee. It's just a helpful analogy.
  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE about 13 years
    pthread_mutex_lock cannot return if someone else holds the lock. It blocks in this case and that's the whole point. pthread_mutex_trylock is the function that will return if the lock is held.
  • Makis
    Makis about 13 years
    Yeah, I didn't realise at first what implementation this is.
  • Nav
    Nav about 13 years
    I agree it's definitely a good tutorial, but it's a lot of information on a single page and the programs are long. The question I posted is the mutex version of the "I have a dream" speech, where newbies would find a simple way to learn about mutexes and understand how the non-intuitive syntax works (this is one explanation that is lacking in all tutorials).
  • Nav
    Nav about 13 years
    @San: I'll be honest; Yes I do like the fact that you've tried your best to explain the details (with flow) to a complete newbie. BUT, (please don't misunderstand me) the intention of this post was to put the concept in a short explanation (coz the other answers pointed to long tutorials). I hope you wouldn't mind if I request you to copy your entire answer and post it as a separate answer? So that I can rollback and edit my answer to point to your answer.
  • San Jacinto
    San Jacinto about 13 years
    No prob. I rolled back what I added :)
  • Tom
    Tom about 13 years
    Nice explanation. What happens if the mutex is in a library: Implemented with boost::thread (for example) and someone launches a thread with a different threading library (pthread, say). Is there still only one door handle, or is there now two - a boost handle and an open thread handle?
  • San Jacinto
    San Jacinto about 13 years
    @Tom In that case, you shouldn't be accessing that mutex. Operations on it should be encapsulated so that whatever it is guarding are protected from such tomfoolery. If when you use the library's exposed API, the library is guaranteed to be thread-safe, then you are safe to include a distinctly different mutex to protect your own shared items. Otherwise, you are indeed adding a new door handle, as you've suggested.
  • San Jacinto
    San Jacinto about 13 years
    To extend my point, what you'd want to do is add another, larger room around the booth. The room may also contain a toilet and shower. Let's say that only 1 person is allowed in the room at once. You must design the room so that This room should have a door with a handle that protects entry to the room much like the phone booth. So now, even though you have extra mutexes, you can reuse the phone booth in any project. Another option would be to expose locking mechanisms for each device in the room and manage the locks in the room class. Either way, you wouldn't add new locks to the same object.
  • Nav
    Nav about 13 years
    Thank you so much, San. I've linked to your answer :) Actually, I had intended that you take my answer + your answer and post it as a separate answer, to keep the flow. I don't really mind if you re-use any part of my answer. We aren't doing this for ourselves anyway.
  • Nav
    Nav over 11 years
    Super! Thanks for posting. Though as I've mentioned before my purpose was only to simply explain the concept of a mutex. All other tutorials made it very difficult with the added concepts of producer consumer and condition variables etc, which made it very hard for me to understand what on earth was going on.
  • TimZaman
    TimZaman almost 9 years
    Thanks for taking the time to give an exellent answer.
  • Jonathan Wakely
    Jonathan Wakely over 8 years
    Your C++11 threading example is wrong. So is the TBB one, the clue is in the name scoped lock.
  • Nav
    Nav over 8 years
    I'm well aware of both, @Jonathan. You seemed to have missed the sentence I wrote (could've shown scoped locking by not using acquire and release - which also is exception safe -, but this is clearer. As for using scoped locking, it's upto the developer, depending on what kind of application they're building. This answer was meant to address the basic understanding of the mutex concept and not to get into all the complexities of it, so your comments and links are welcome but a bit out of scope of this tutorial.
  • Jonathan Wakely
    Jonathan Wakely over 8 years
    Why is using three lines instead of one clearer? A tutorial that shows a style that you should never use in real code is not a useful tutorial. Using scoped locks is simpler, more correct, and should be what is taught by default.
  • Nav
    Nav over 8 years
    Some of the content in the tutorial does need some change and I'll be changing it soon, but do remember that no matter how much of convenience smart pointers or scoped locks provide, there are applications people build, which need to run so fast that such constructs can actually cause a huge overhead on the application. This is from experience. One mans meat is another mans poison. I'd like to politely decline from commenting on any more counter-arguments you have.
  • Enlico
    Enlico over 4 years
    @Nav, the as shown here link is broken.
  • Sigma Octantis
    Sigma Octantis almost 4 years
    Link to the blog is also broken
  • Amit Vikram Singh
    Amit Vikram Singh about 2 years
    @San What do you mean by: "the TAS sets the location to something other than 0". Shouldn't the location be set to 0?