What is the difference between atomic and critical in OpenMP?


Solution 1

The effect on g_qCount is the same, but what's done is different.

An OpenMP critical section is completely general - it can surround any arbitrary block of code. You pay for that generality, however, by incurring significant overhead every time a thread enters and exits the critical section (on top of the inherent cost of serialization).

In addition, in OpenMP all unnamed critical sections are considered identical (if you prefer, there's only one lock for all unnamed critical sections), so that if one thread is in one [unnamed] critical section as above, no thread can enter any [unnamed] critical section. As you might guess, you can get around this by using named critical sections.
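
As a rough sketch (the second counter g_rCount and the lock names qlock/rlock are made up for illustration), naming the sections gives each one its own lock, so updates to unrelated counters no longer block each other:

    #include <omp.h>

    int g_qCount = 0, g_rCount = 0;

    void count_item(int isQ)
    {
        if (isQ) {
            /* this lock belongs to the name "qlock" */
            #pragma omp critical(qlock)
            g_qCount++;
        } else {
            /* a different name means a different lock, so threads updating
               g_rCount never wait on threads updating g_qCount */
            #pragma omp critical(rlock)
            g_rCount++;
        }
    }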

An atomic operation has much lower overhead. Where available, it takes advantage of the hardware providing (say) an atomic increment operation; in that case there's no lock/unlock needed on entering/exiting the line of code, it just performs the atomic increment, which the hardware guarantees can't be interfered with.

The upsides are that the overhead is much lower, and one thread being in an atomic operation doesn't block any (different) atomic operations about to happen. The downside is the restricted set of operations that atomic supports.

Of course, in either case, you incur the cost of serialization.

Solution 2

In OpenMP, all the unnamed critical sections are mutually exclusive.

The most important difference between critical and atomic is that atomic can protect only a single assignment, and only with a specific set of operators.
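
To illustrate (the variable and function names below are invented for this sketch), atomic accepts single updates with the supported operators, but not a compound block, which still needs critical:

    int counter = 0;
    double sum = 0.0;

    void update(void)
    {
        /* allowed: one assignment, one supported operator */
        #pragma omp atomic
        counter++;

        #pragma omp atomic
        sum += 3.14;

        /* atomic cannot protect a block or a function call;
           protecting both updates together requires a critical section */
        #pragma omp critical
        {
            counter++;
            sum += 3.14;
        }
    }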

Solution 3

Critical section:

  • Ensures serialisation of blocks of code.
  • Can be extended to serialise groups of blocks with proper use of the "name" tag.
  • Slower!

Atomic operation:

  • Is much faster!
  • Only ensures the serialisation of a particular operation.

Solution 4

The fastest way is neither critical nor atomic. Roughly speaking, an addition inside a critical section is about 200 times more expensive than a plain addition, and an atomic addition is about 25 times more expensive than a plain addition.

The fastest option (not always applicable) is to give each thread its own counter and perform a reduction when you need the total sum.
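
A minimal sketch of that idea using OpenMP's built-in reduction clause (the loop bound and names are only illustrative); each thread accumulates into its own private copy of the counter, and the copies are combined once at the end:

    #include <stdio.h>

    int main(void)
    {
        long total = 0;

        /* every thread gets a private 'total' initialised to 0;
           the private copies are summed exactly once when the loop ends */
        #pragma omp parallel for reduction(+:total)
        for (long i = 0; i < 1000000; i++)
            total += 1;

        printf("total = %ld\n", total);
        return 0;
    }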

Solution 5

The limitations of atomic are important; they are detailed in the OpenMP specs. MSDN offers a quick cheat sheet, and I wouldn't be surprised if this hasn't changed. (Visual Studio 2012 ships an OpenMP implementation from March 2002.) To quote MSDN:

The expression statement must have one of the following forms:

    x binop= expr
    x++
    ++x
    x--
    --x

In the preceding expressions: x is an lvalue expression with scalar type; expr is an expression with scalar type that does not reference the object designated by x; and binop is not an overloaded operator and is one of +, *, -, /, &, ^, |, <<, or >>.
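
As a quick illustration of those forms (x and expr are placeholders just as in the quote, and the wrapping function is made up):

    void atomic_forms(void)
    {
        int x = 0;
        int expr = 5;

        #pragma omp atomic
        x += expr;   /* x binop= expr */

        #pragma omp atomic
        x++;         /* x++ */

        #pragma omp atomic
        --x;         /* --x */
    }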

I recommend using atomic when you can, and named critical sections otherwise. Naming them is important; you'll avoid debugging headaches this way.

Comments

  • codereviewanskquestions
    codereviewanskquestions almost 2 years

    What is the difference between atomic and critical in OpenMP?

    I can do this

    #pragma omp atomic
    g_qCount++;
    

    but isn't this same as

    #pragma omp critical
    g_qCount++;
    

    ?

  • kynan
    kynan over 9 years
    This would have been better as a comment (or an edit) on the previous answer.
  • Michał Miszczyszyn
    Michał Miszczyszyn over 8 years
    But this answer is very readable and is a great summary of the first answer.
  • Klaas van Gend
    Klaas van Gend about 8 years
    I disagree with all the numbers you mention in your explanation. Assuming x86_64, the atomic operation will have a few cycles of overhead (synchronizing a cache line) on top of a cost of roughly a cycle. If you would otherwise have a "true sharing" cost, the overhead is nil. A critical section incurs the cost of a lock. Depending on whether the lock is already taken or not, the overhead is roughly two atomic instructions OR two runs of the scheduler plus the sleep time - that will usually be significantly more than 200x.
  • Dan R
    Dan R almost 8 years
    "you could loose portability" - I'm not sure this is true. The standard (version 2.0) specifies which atomic operations are allowed (basically things like ++ and *=) and that if they aren't supported in hardware, they might be replaced by critical sections.
  • Jonathan Dursi
    Jonathan Dursi almost 8 years
    @DanRoche: Yes, you're quite right. I don't think that statement was ever correct, I'll correct it now.
  • pooria
    pooria almost 8 years
    That's not all; there are other, more advanced atomic clauses, e.g. #pragma omp atomic update (or read, write, capture), which allow some other useful statements.
  • Giox79
    Giox79 over 6 years
    A few days ago I followed an OpenMP tutorial, and as far as I understood, there is a difference between the two pieces of code. That is, the results can differ because the critical section only ensures that the instruction is executed by one thread at a time; however, it is possible that the instruction g_qCount = g_qCount+1; for thread 1 simply stores the result in a write buffer, not in RAM, and when thread 2 fetches the value of g_qCount it simply reads the one in RAM, not the one in the write buffer. An atomic instruction ensures the data is flushed to memory.
  • jcsahnwaldt Reinstate Monica
    jcsahnwaldt Reinstate Monica over 5 years
    That's just wrong. Please don't talk about stuff you don't understand.
  • Noureddine
    Noureddine over 3 years
    The option you are suggesting might lead to a huge memory demand that we might not have at our disposal. For instance, if I'm working on data of 1000x1000x1000 cells with 10 or 100 threads, the internal copies created for each thread will certainly saturate the RAM.