Real World Examples of read-write in concurrent software

multithreading concurrency locking semaphore

10,152

Solution 1

What kind of locks are used depends on how the data is being accessed by multiple threads. If you can fine tune the use case, you can sometimes eliminate the need for exclusive locks completely.

An exclusive lock is needed only if your use case requires that the shared data must be 100% exact all the time. This is the default that most developers start with because that's how we think about data normally.

However, if what you are using the data for can tolerate some "looseness", there are several techniques to share data between threads without the use of exclusive locks on every access.

For example, if you have a linked list of data and if your use of that linked list would not be upset by seeing the same node multiple times in a list traversal and would not be upset if it did not see an insert immediately after the insert (or similar artifacts), you can perform list inserts and deletes using atomic pointer exchange without the need for a full-stop mutex lock around the insert or delete operation.

Another example: if you have an array or list object that is mostly read from by threads and only occasionally updated by a master thread, you could implement lock-free updates by maintaining two copies of the list: one that is "live" that other threads can read from and another that is "offline" that you can write to in the privacy of your own thread. To perform an update, you copy the contents of the "live" list into the "offline" list, perform the update to the offline list, and then swap the offline list pointer into the live list pointer using an atomic pointer exchange. You will then need some mechanism to let the readers "drain" from the now offline list. In a garbage collected system, you can just release the reference to the offline list - when the last consumer is finished with it, it will be GC'd. In a non-GC system, you could use reference counting to keep track of how many readers are still using the list. For this example, having only one thread designated as the list updater would be ideal. If multiple updaters are needed, you will need to put a lock around the update operation, but only to serialize updaters - no lock and no performance impact on readers of the list.

All the lock-free resource sharing techniques I'm aware of require the use of atomic swaps (aka InterlockedExchange). This usually translates into a specific instruction in the CPU and/or a hardware bus lock (lock prefix on a read or write opcode in x86 assembler) for a very brief period of time. On multiproc systems, atomic swaps may force a cache invalidation on the other processors (this was the case on dual proc Pentium II) but I don't think this is as much of a problem on current multicore chips. Even with these performance caveats, lock-free runs much faster than taking a full-stop kernel event object. Just making a call into a kernel API function takes several hundred clock cycles (to switch to kernel mode).

Examples of real-world scenarios:

producer/consumer workflows. Web service receives http requests for data, places the request into an internal queue, worker thread pulls the work item from the queue and performs the work. The queue is read/write and has to be thread safe.
Data shared between threads with change of ownership. Thread 1 allocates an object, tosses it to thread 2 for processing, and never wants to see it again. Thread 2 is responsible for disposing the object. The memory management system (malloc/free) must be thread safe.
File system. This is almost always an OS service and already fully thread safe, but it's worth including in the list.
Reference counting. Releases the resource when the number of references drops to zero. The increment/decrement/test operations must be thread safe. These can usually be implemented using atomic primitives instead of full-stop kernal mutex locks.

Solution 2

Most real world, concurrent software, has some form of requirement for synchronization at some level. Often, better written software will take great pains to reduce the amount of locking required, but it is still required at some point.

For example, I often do simulations where we have some form of aggregation operation occurring. Typically, there are ways to prevent locking during the simulation phase itself (ie: use of thread local state data, etc), but the actual aggregation portion typically requires some form of lock at the end.

Luckily, this becomes a lock per thread, not per unit of work. In my case, this is significant, since I'm typically doing operations on hundreds of thousands or millions of units of work, but most of the time, it's occuring on systems with 4-16 PEs, which means I'm usually restricting to a similar number of units of execution. By using this type of mechanism, you're still locking, but you're locking between tens of elements instead of potentially millions.

10,152

Author by

Richard Fabian

I'm a games programmer. I do rendering, architecture, animation, tools, gameplay, networking, and all sorts of other odd jobs such as management, and the occasional bit of hardware development. I'm writing games at work and at coding all sorts of things at home. Currently working in the North, at Team17.

Updated on June 04, 2022

Comments

Richard Fabian almost 2 years

I'm looking for real world examples of needing read and write access to the same value in concurrent systems.

In my opinion, many semaphores or locks are present because there's no known alternative (to the implementer,) but do you know of any patterns where mutexes seem to be a requirement?

In a way I'm asking for candidates for the standard set of HARD problems for concurrent software in the real world.