In what situation do you use a semaphore over a mutex in C++?

36,405

Solution 1

Boost.Thread has mutexes and condition variables. Purely in terms of functionality, semaphores are therefore redundant[*], although I don't know if that's why they're omitted.

Semaphores are a more basic primitive, simpler, and possibly implemented to be faster, but don't have priority-inversion avoidance. They're arguably harder to use than condition variables, because they require the client code to ensure that the number of posts "matches" the number of waits in some appropriate way. With condition variables it's easy to tolerate spurious posts, because nobody actually does anything without checking the condition.

Read vs. write resources is a red herring IMO, it has nothing to do with the difference between a mutex and a semaphore. If you use a counting semaphore, you could have a situation where multiple threads are concurrently accessing the same resource, in which case it would presumably have to be read-only access. In that situation, you might be able to use shared_mutex from Boost.Thread instead. But semaphores aren't "for" protecting resources in the way mutexes are, they're "for" sending a signal from one thread to another. It's possible to use them to control access to a resource.

That doesn't mean that all uses of semaphores must relate to read-only resources. For example, you can use a binary semaphore to protect a read/write resource. Might not be a good idea, though, since a mutex often gives you better scheduling behaviour.

[*] Here's roughly how you implement a counting semaphore using a mutex and a condition variable. To implement a shared semaphore of course you need a shared mutex/condvar:

struct sem {
    mutex m;
    condvar cv;
    unsigned int count;
};

sem_init(s, value)
    mutex_init(s.m);
    condvar_init(s.cv);
    count = value;

sem_wait(s)
    mutex_lock(s.m);
    while (s.count <= 0) {
        condvar_wait(s.cv, s.m);
    }
    --s.count;
    mutex_unlock(s.m);

sem_post(s)
    mutex_lock(s.m);
    ++s.count;
    condvar_broadcast(s.cv)
    mutex_unlock(s.m);

Therefore, anything you can do with semaphores, you can do with mutexes and condition variables. Not necessarily by actually implementing a semaphore, though.

Solution 2

The typical use case for a mutex (allowing only one thread access to a resource at any time) is far more common than the typical uses if a semaphore. But a semaphore is actually the more general concept: A mutex is (almost) a special case of a semaphore.

Typical applications would be: You don't want to create more than (e.g.) 5 database connections. No matter how many worker threads there are, they have to share these 5 connections. Or, if you run on a N-core machine, you might want to make sure that certain CPU/memory-intensive tasks don't run in more than N threads at the same time (because that would only reduce throughput due to context switches and cache thrashing effects). You might even want to limit the number of parallel CPU/memory intensive tasks to N-1, so the rest of the system doesn't starve. Or imagine a certain task needs a lot of memory, so running more than N instances of that task at the same time would lead to paging. You could use a semaphore here, to make sure that no more than N instances of this particular task run at the same time.

EDIT/PS: From your question "This is only possible if those threads are only reading the resource but not writing. Is this correct?" and your comment, it seems to me as if you're thinking of a resource as a variable or a stream, that can be read or written and that can only be written to by one thread at a time. Don't. This is misleading in this context.

Think of resources like "water". You can use water to wash your dishes. I can use water to wash my dishes at the same time. We don't need any kind of synchronization for that, because there is enough water for both of us. We don't necessarily use the same water. (And you can't "read" or "write" water.) But the total amount of water is finite. So it's not possible for any number of parties to wash their dishes at the same time. This kind of synchronization is done with a semaphore. Only usually not with water but with other finite resources like memory, disk space, IO throughput or CPU cores.

Solution 3

The essence of the difference between a mutex and a semaphore has to do with the concept of ownership. When a mutex is taken, we think of that thread as owning the mutex and that same thread must later release the mutex back to release the resource.

For a semaphore, think of taking the semaphore as consuming the resource, but not actually taking ownership of it. This is generally referred to as the semaphore being "empty" rather than owned by a thread. The feature of the semaphore is then that a different thread can "fill" the semaphore back to "full" state.

Therefore, mutexes are usually used for the concurrency protection of resources (ie: MUTual EXlusion) while semaphores are used for signaling between threads (like semaphore flags signaling between ships). A mutex by itself can't really be used for signaling, but semaphores can. So, selecting one over the other depends on what you are trying to do.

See another one of my answers here for more discussion on a related topic covering the distinction between recursive and non-recursive mutexes.

Solution 4

To control access to a limited number of resources being shared by multiple threads (either inter- or intra-process).

In our application, we had a very heavy resource and that we did not want to allocate one for each of the M worker threads. Since a worker thread needed the resource for just one small part of their job, we rarely were using more then a couple of the resources simultaneously.

So, we allocated N of those resources and put them behind a semaphore initialized to N. When more then N threads were trying to use the resource, they would just block until one was available.

Solution 5

I feel like there is no simple way to REALLY answer your question without disregarding some important information about semaphores. People have written many books about semaphores, so any one or two paragraph answer is a disservice. A popular book is The Little Book of Semaphores... for those who don't like big books :).

Here is a decent lengthy article which goes into a LOT of the details on how semaphores are used and how they're intended to be used.

Update:
Dan pointed out some mistakes in my examples, I'll leave it with the references which offer MUCH better explanations than mine :).

Here are the references showing the RIGHT ways one should use a semaphore:
1. IBM Article
2. University of Chicago Class Lecture
3. The Netrino article I originally posted.
4. The "sell tickets" paper + code.

Share:
36,405
jasonline
Author by

jasonline

C++ Developer

Updated on January 21, 2020

Comments

  • jasonline
    jasonline over 4 years

    Throughout the resources I've read about multithreading, mutex is more often used and discussed compared to a semaphore. My question is when do you use a semaphore over a mutex? I don't see semaphores in Boost thread. Does that mean semaphores no longer used much these days?

    As far as I've understand, semaphores allow a resource to be shared by several threads. This is only possible if those threads are only reading the resource but not writing. Is this correct?

  • jasonline
    jasonline about 14 years
    But if you use semaphore to allow N instances of that task, does it mean that those N instances are already synchronized like what a mutex does? Like for example, if those tasks are for writing to a common resource, does the semaphore internally handle this such that there will be no race conditions between those N instances/tasks?
  • Niki
    Niki about 14 years
    I think you misunderstood the term. You can use mutexes to replace semaphores or events, but generally you shouldn't. Understand the primitives and use them where appropriate.
  • Kiril
    Kiril about 14 years
    @jasonline A semaphore does not internally handle race conditions between tasks, if that was the case then everybody would be using semaphores and nobody would have trouble with multithreading :)... a semaphore can do MANY things, a semaphore can even do the work of a mutex ^^ hehe. Check out my answer for more details.
  • Kiril
    Kiril about 14 years
    The intended use of a semaphore is to signal between threads... and a semaphore cannot solve a multiple identical resource problem on its own, here is the same article I provided in my answer to explain why that's the case: netrino.com/node/202
  • jasonline
    jasonline about 14 years
    @nikie: From your latest update about the example on water... what then should a resource be for a semaphore to be used?
  • jasonline
    jasonline about 14 years
    @Lirik: I read the article. But can you give one example of a resource in an application where semaphore is used to control the number of instances it shares? I still don't understand what or how the resource can be shared and whether a race condition will occur.
  • Steve Jessop
    Steve Jessop about 14 years
    @jasonline: for example suppose you know that you have 2 CPU cores, and you have 27 parallel tasks to run. If you run more than 2 at once, they'll time-share OK through the scheduler and perhaps finish roughly together. If you create a semaphore with a count of 2, and have each thread wait on the semaphore before starting and post after finishing, then 2 tasks at a time will run, and whatever runs first will finish as soon as possible. That might be more useful, depending on the task. You've used a semaphore to control access to a resource, but cores aren't a "read-only" resource.
  • jasonline
    jasonline about 14 years
    @Steve: Thanks for your example. So you're saying to run two threads at a time and control them through a semaphore. Is my understanding correct?
  • jasonline
    jasonline about 14 years
    @Steve: Yes, I can understand the mutexes and condition variables from Boost. I'm just not sure how I could relate it with semaphores or how I could implement mutexes/condition variables to function like a semaphore (?)
  • jasonline
    jasonline about 14 years
    @Steve: Your statement: "If you use a counting semaphore, you could have a situation where multiple threads are concurrently accessing the same resource, in which case it would presumably have to be read-only access." You're using the semaphore just to limit the number of threads accessing it right? And if you're not using a semaphore, you don't really need a mutex also because it's only read access and it's safe for multiple threads to access the resource. Correct?
  • jasonline
    jasonline about 14 years
    @R Samuel: I'm interested to know what is that resource?
  • R Samuel Klatchko
    R Samuel Klatchko about 14 years
    @jasonline - an instance of the Autonomy/Verity text extraction engine. It's a library that can be used to extract the basic text from a PDF, Word, etc. file.
  • R Samuel Klatchko
    R Samuel Klatchko about 14 years
    @Lirik - yes, you need something more then just a semaphore, but the semaphore provides an key property. When the resources being guarded are identical and a thread wants to block until any single resource is available, the semaphore allows you to efficiently do that.
  • Steve Jessop
    Steve Jessop about 14 years
    @jasonline: To the first question, I've edited. To the second question yes, read-only vs. read-write access usually is entirely about how many threads are accessing the resource at once: at most 1, or more than that, because concurrent read-only access to resources is usually safe whereas concurrent access involving a write isn't.
  • Steve Jessop
    Steve Jessop about 14 years
    Yes, that's right. The reason you only want 2 threads at a time is nothing to do with a read-only vs read-write resource, just that in this hypothetical example, you want to get something finished ASAP.
  • Kiril
    Kiril about 14 years
    @Samuel unfortunately the key is not unique, so if we're talking about a key to a hotel room... the key will be able to open ANY room, even a room which already has occupants.
  • Kiril
    Kiril about 14 years
    @jasonline as other people have pointed out: a semaphore does not protect against race conditions... it is COMPLETELY different from a mutex. A semaphore does not tell you who owns the resource or who should actually access it, it only tells you that there are available instances of that resource (keys to hotel rooms, ticket counters, etc.).
  • mghie
    mghie about 14 years
    @Lirik: 60 guests waiting to sleep in one of 20 hotel rooms do of course wait for one of 20 different keys. However without using a semaphore you would need to either have all 60 guests call WaitForMultipleObjects() on an array of 20 mutexes for the keys (won't scale), or create some other complicated solution. There is no OS primitive that will solve this problem on its own.
  • Dan
    Dan about 14 years
    Good example of how a semaphore can be used in a way that is not "signaling from one thread to another".
  • Kiril
    Kiril about 14 years
    @mghie I guess my point was that the semaphore does not give you a way to define ownership... the key itself has to define ownership in some way (i.e. it can only open room 205). But the same thing can be achieved with a blocking queue (although not an OS primitive): the key is taken out of the queue and then returned to the queue... threads are limited by the number of keys circulating in the queue.
  • Dan
    Dan about 14 years
    @Lirik: The netrino article is good. I disagree with the rest of your answer. There is a simple way to explain the difference: a mutex is used to protect a critical section, a semaphore is used to signal from one task to another that an event has occurred or a condition has been met. Your code example uses a semaphore where a mutex should be used.
  • Dan
    Dan about 14 years
    Also, your ticket-counter solution is flawed. Say three clerks are working and no others have showed up to work, then two clerks leave. You say when a clerk leaves "it will signal via the semaphore which ticket counter is being vacated." A semaphore doesn't store this info, it would be in a separate variable. Better to use a queue/stack to store the list of empty ticket counters, in case there is more than one. When a clerk leaves, its ticket counter number goes on the queue. When a clerk arrives, it gets a counter from the queue, or blocks (condition variable) if the queue is empty.
  • Kiril
    Kiril about 14 years
    @Dan Good point! I'll just leave the references... they explain it MUCH better than I.
  • mghie
    mghie about 14 years
    @Lirik: A blocking queue is certainly a great alternative. But we're no longer speaking about mutex vs. semaphore now.
  • R Samuel Klatchko
    R Samuel Klatchko about 14 years
    @Lirik - as I said in my previous comment you need more then a semaphore. The semaphore isn't the room key, it's the reservation for a room. First you get your reservation (via the semaphore), and then you get your room key (via some other technique).
  • R Samuel Klatchko
    R Samuel Klatchko about 14 years
    @Lirik - one last thought on a semaphore based solution vs a blocking queue solution. Both of them require two synchronization primitives (my solution requires one counting semaphore for efficient reservations and one mutex to control access to the individual resources - a blocking queue requires a mutex and a condition variable/event to implement). In the end, they both are pretty similar.
  • Steve Jessop
    Steve Jessop about 14 years
    @Dan: IMO it is still "signalling from one thread to another". You create the semaphore with 5 slots. Posting the semaphore says "I have just freed up one of the slots". Waiting on the semaphore consumes a slot if possible, otherwise the thread waits until it is "signalled". There's a lot you can do with a 1-bit message, but the point about semaphores is that's all they do. Mutexes not only do this, they also have an owner thread, which means the scheduler can do things for mutexes which it cannot do for locks implemented with semaphores, or this 5-slot processing throttle.
  • Steve Jessop
    Steve Jessop about 14 years
    I say "signalled" in inverted commas, because of course signals are a particular thing, commonly (but not necessarily) used for IPC, with a more specific meaning than just the general English word. But in the general English word, a semaphore just tracks how many signals/messages/slots/posts have been sent and how many have been consumed, and delivers them to consumers when available.
  • Dan
    Dan about 14 years
    @Steve: you're right, it is still "signaling". What struck me as different is that it wasn't one specific thread signaling to another specific thread, such as "the Built-in-test thread detected a fault condition and gave the Danger semaphore, which the Shutdown-reactor thread had been blocking on." That's how I normally see semaphores used. Anything more complicated than that, especially multiple-producer/consumer, usually requires more sophisticated IPC (condition variable, blocking-queue, mutex-protected object). I found this example novel because you can get away with just a semaphore.