Spinlocks, How Useful Are They?

c# c++ multithreading locking spinlock

24,862

Solution 1

It depends on what you're doing. In general application code, you'll want to avoid spinlocks.

In low-level code where you only hold the lock for a couple of instructions, and latency is important, a spinlock may be a better solution than a lock. But those cases are rare, especially in the kind of applications where C# is typically used.
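As a rough illustration of the "couple of instructions" case, here is a minimal C++ sketch of a test-and-set spinlock guarding a counter increment. The names SpinLock and count_with_spinlock are made up for this example, and this is not meant as a production-quality lock (no backoff, no fairness).

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Minimal test-and-set spinlock (illustrative sketch only).
class SpinLock {
public:
    void lock() {
        // exchange() returns the previous value; keep spinning while
        // somebody else already held the lock.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical demo: several threads bump a shared counter under the lock.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // critical section: a couple of instructions
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Whether this beats a regular mutex depends on contention and core count, so it is worth measuring before committing to it.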

Solution 2

In C#, "Spin locks" have been, in my experience, almost always worse than taking a lock - it's a rare occurrence where spin locks will outperform a lock.

However, that's not always the case. .NET 4 added a System.Threading.SpinLock structure. It helps in situations where a lock is held for a very short time and is acquired repeatedly. From the MSDN docs on Data Structures for Parallel Programming:

In scenarios where the wait for the lock is expected to be short, SpinLock offers better performance than other forms of locking.

Spin locks can outperform other locking mechanisms in cases where you're doing something like locking through a tree: if you only hold the lock on each node for a very, very short period of time, they can outperform a traditional lock. I ran into this at one point in a rendering engine with a multithreaded scene update, where spin locks profiled faster than locking with Monitor.Enter.

Solution 3

For my realtime work, particularly with device drivers, I've used them a fair bit. It turns out that (when last I timed this) waiting for a sync object like a semaphore tied to a hardware interrupt chews up at least 20 microseconds, no matter how long it actually takes for the interrupt to occur. A single check of a memory-mapped hardware register, followed by a check of RDTSC (to allow for a time-out so you don't lock up the machine), is in the high nanosecond range (basically down in the noise). For hardware-level handshaking that shouldn't take much time at all, it is really tough to beat a spinlock.
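The poll-with-time-out pattern described above can be sketched in portable C++ roughly as follows. This is not driver code: the std::atomic<bool> named ready stands in for the memory-mapped hardware register, std::chrono::steady_clock stands in for RDTSC, and the function name spin_wait is made up for the example.

```cpp
#include <atomic>
#include <chrono>

// Spin until `ready` becomes true or `timeout` elapses.
// Assumptions: `ready` is a stand-in for a memory-mapped status register,
// and steady_clock is a stand-in for reading RDTSC.
bool spin_wait(const std::atomic<bool>& ready,
               std::chrono::microseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!ready.load(std::memory_order_acquire)) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // timed out: give up rather than hang the machine
        }
    }
    return true;  // the flag was observed set
}
```

The caller decides what to do on a false return, for example reporting a hardware fault instead of spinning forever.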

Solution 4

My 2c: If your updates satisfy some access criteria then they are good spinlock candidates:

fast: you have time to acquire the spinlock, perform the updates, and release the spinlock within a single thread quantum, so that you don't get pre-empted while holding the spinlock
localized: all the data you update is preferably in one single page that is already loaded; you do not want a TLB miss while you are holding the spinlock, and you definitely don't want a page fault that reads from swap!
atomic: you do not need any other lock to perform the operation, i.e. never wait for locks under a spinlock.

For anything that has any potential to yield, you should use a notified lock structure (events, mutexes, semaphores, etc.).

Solution 5

One use case for spin locks is if you expect very low contention but are going to have a lot of them. If you don't need support for recursive locking, a spinlock can be implemented in a single byte, and if contention is very low then the CPU cycle waste is negligible.

For a practical use case, I often have arrays of thousands of elements, where updates to different elements of the array can safely happen in parallel. The odds of two threads trying to update the same element at the same time are very small (low contention) but I need one lock for every element (I'm going to have a lot of them). In these cases, I usually allocate an array of ubytes of the same size as the array I'm updating in parallel and implement spinlocks inline as (in the D programming language):

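// Spin until this element's lock byte flips from 0 (free) to 1 (held).
while(!atomicCasUbyte(spinLocks[i], 0, 1)) {}
myArray[i] = newVal;              // critical section: a single store
atomicSetUbyte(spinLocks[i], 0);  // release the lock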

On the other hand, if I had to use regular locks, I would have to allocate an array of pointers to Objects, and then allocate a Mutex object for each element of this array. In scenarios such as the one described above, this is just plain wasteful.
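For readers more at home in C++, the same one-byte-lock-per-element idea might look roughly like this. It is a sketch, not a translation of the D helpers above: LockedArray and demo_total are made-up names, and it assumes std::atomic<std::uint8_t> occupies a single byte, which is typical but not guaranteed by the standard.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// One lock byte per data element: 0 = free, 1 = held.
struct LockedArray {
    std::vector<long> data;
    std::vector<std::atomic<std::uint8_t>> locks;

    explicit LockedArray(std::size_t n) : data(n, 0), locks(n) {
        for (auto& l : locks) l.store(0, std::memory_order_relaxed);
    }

    void add(std::size_t i, long delta) {
        std::uint8_t expected = 0;
        // Spin until we are the thread that swaps the byte from 0 to 1.
        while (!locks[i].compare_exchange_weak(expected, 1,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed)) {
            expected = 0;  // a failed CAS overwrote `expected`; reset it
        }
        data[i] += delta;                              // critical section
        locks[i].store(0, std::memory_order_release);  // release the lock
    }
};

// Hypothetical demo: threads update elements in parallel, then we sum them.
long demo_total(int threads, int iterations) {
    LockedArray arr(8);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&arr, iterations] {
            for (int i = 0; i < iterations; ++i) arr.add(i % 8, 1);
        });
    }
    for (auto& th : pool) th.join();
    long total = 0;
    for (long v : arr.data) total += v;
    return total;
}
```

With only 8 elements the demo has far more contention than the thousands-of-elements case described above; it is only meant to show that the updates stay consistent.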

Author by

Admin

Updated on July 09, 2022

Comments

Admin almost 2 years

How often do you find yourself actually using spinlocks in your code? How common is it to come across a situation where using a busy loop actually outperforms the usage of locks?
Personally, when I write some sort of code that requires thread safety, I tend to benchmark it with different synchronization primitives, and as far as it goes, it seems like using locks gives better performance than using spinlocks. No matter for how little time I actually hold the lock, the amount of contention I receive when using spinlocks is far greater than the amount I get from using locks (of course, I run my tests on a multiprocessor machine).

I realize that it's more likely to come across a spinlock in "low-level" code, but I'm interested to know whether you find it useful in even a more high-level kind of programming?
Steve Jessop over 14 years

In particular, where you have multiple cores, spinning one core while waiting for another core might be much, much faster than provoking a reschedule. Context switches can add up if you have a lot of contention. On a single core, spinlocks are less than compelling, since spinning amounts to abandoning the rest of your timeslice without actually letting the next thread get in until it expires, so it amounts to a really slow yield...
Zan Lynx over 14 years

And if your mutexes do not do this then you can do a spinlock + mutex of your own: spin X number of times, then lock.
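A rough C++ sketch of that spin-then-lock idea, assuming std::mutex as the underlying lock. The names spin_then_lock and count_with_hybrid and the spins parameter are invented for this example, and a good value for spins has to be found by measurement.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Try the mutex up to `spins` times, then fall back to a blocking lock().
// `spins` is a hypothetical tuning knob, not a recommended value.
void spin_then_lock(std::mutex& m, int spins) {
    for (int i = 0; i < spins; ++i) {
        if (m.try_lock()) return;  // acquired without blocking
    }
    m.lock();  // still contended: let the OS put this thread to sleep
}

// Hypothetical demo: threads bump a shared counter using the hybrid lock.
long count_with_hybrid(int threads, int iterations) {
    std::mutex m;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&m, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                spin_then_lock(m, 100);
                ++counter;
                m.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Either path leaves the calling thread owning the mutex, so the caller unlocks it the usual way.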
Zuuum over 13 years

Kernel-mode lock mechanisms are much slower than user-mode lock choices, sometimes up to 100 times slower! So a mutex etc. may not be an option for a really good implementation.
T.E.D. about 2 years

13 years later on the clarification: If it's just the one check, 20us isn't really that much. So yes, for high-level code, it's not likely to be worth the potential problems. However, in cases where there are actually multiple handshakes at this level that need to happen for every I/O, a few of these can really add up.