Parallelization: pthreads or OpenMP?

Solution 1

It basically boils down to what level of control you want over your parallelization. OpenMP is great if all you want to do is add a few #pragma statements and have a parallel version of your code quite quickly. If you want to do really interesting things with MIMD coding or complex queueing, you can still do all of this with OpenMP, but it is probably a lot more straightforward to use threading directly in that case. OpenMP also has similar advantages in portability in that a lot of compilers for different platforms support it now, as with pthreads.

So you're absolutely correct - if you need fine-tuned control over your parallelization, use pthreads. If you want to parallelize with as little work as possible, use OpenMP.

Whichever way you decide to go, good luck!

Solution 2

One other reason: OpenMP is task-based, while Pthreads is thread-based. This means that OpenMP will allocate as many threads as there are cores, so you get a scalable solution out of the box. Achieving that with raw threads is not such an easy task.

A second point: OpenMP provides reduction features, for when you need to compute partial results in threads and combine them. You can implement this with a single line of code, whereas with raw threads you have to do much more of the work yourself.

Just think about your requirements and ask whether OpenMP is enough for you. If it is, you will save a lot of time.

Solution 3

OpenMP requires a compiler that supports it, and works via pragmas. The advantage of this is that when compiling without OpenMP support (e.g. with PCC or Clang/LLVM as of now), the code will still compile. Also, have a look at what Charles Leiserson wrote about DIY multithreading.

Pthreads is a POSIX standard (IEEE POSIX 1003.1c) for libraries, while OpenMP specifications are to be implemented on compilers; that being said, there are a variety of pthread implementations (e.g. OpenBSD rthreads, NPTL), and a number of compilers that support OpenMP (e.g. GCC with the -fopenmp flag, MSVC++ 2008).

Pthreads are only effective for parallelization when multiple processors are available, and only when the code is tuned to the number of processors available. Code for OpenMP is more easily scalable as a result. You can also mix code that compiles with OpenMP with code that uses pthreads.

Solution 4

Your question is similar to asking "Should I program in C or assembly?", with C being OpenMP and assembly being pthreads.

With pthreads you can achieve much better parallelisation, "better" meaning very tightly adjusted to your algorithm and hardware. This will be a lot of work, though.

With pthreads it is also much easier to produce poorly parallelised code.

Solution 5

Is there any reason (other than readability) to use OpenMP over pthreads?

Mike kind of touched upon this:

OpenMP also has similar advantages in portability in that a lot of compilers for different platforms support it now, as with pthreads

Crypto++ is cross-platform, meaning it runs on Windows, Linux, OS X and the BSDs. It uses OpenMP for threading support in places where the operation can be expensive, like modular exponentiation and modular multiplication (and where concurrent operation can be performed).

Windows does not support pthreads natively, but modern Windows compilers do support OpenMP. So if you want portability to non-*nix platforms, then OpenMP is often a good choice.


And as Mike also pointed out:

OpenMP is great if all you want to do is add a few #pragma statements and have a parallel version of your code quite quickly.

Below is an example of Crypto++ precomputing some values used in Rabin-Williams signatures using Tweaked Roots as described by Bernstein in RSA signatures and Rabin-Williams signatures...:

void InvertibleRWFunction::Precompute(unsigned int /*unused*/)
{
    ModularArithmetic modp(m_p), modq(m_q);

    #pragma omp parallel sections
    {
        #pragma omp section
            m_pre_2_9p = modp.Exponentiate(2, (9 * m_p - 11)/8);
        #pragma omp section
            m_pre_2_3q = modq.Exponentiate(2, (3 * m_q - 5)/8);
        #pragma omp section
            m_pre_q_p = modp.Exponentiate(m_q, m_p - 2);
    }
}

It fits with Mike's observation: fine-grained control and synchronization were not really needed. Parallelization was used to speed up execution, and the synchronization came at no cost in the source code.

And if OpenMP is not available, the code reduces to:

m_pre_2_9p = modp.Exponentiate(2, (9 * m_p - 11)/8);
m_pre_2_3q = modq.Exponentiate(2, (3 * m_q - 5)/8);
m_pre_q_p = modp.Exponentiate(m_q, m_p - 2);
Author: hanno (updated on July 24, 2020)

Comments

  • hanno, almost 4 years ago:

    Most people in scientific computing use OpenMP as a quasi-standard when it comes to shared memory parallelization.

    Is there any reason (other than readability) to use OpenMP over pthreads? The latter seems more basic and I suspect it could be faster and easier to optimize.

  • awiebe, over 12 years ago:
    Please elaborate on solution scalability. Does the scalability only apply at compile time or is it determined at runtime? Or can runtime scalability only be done with threads?
  • P O'Conbhui, about 12 years ago:
    You can set the number of threads created at either compile time or runtime. If you choose to have the number set at runtime, you can set the number of threads through an environment variable (OMP_NUM_THREADS), so that it can easily be set to an appropriate number on whatever architecture you're running on.
  • Jeff Hammond, almost 9 years ago:
    This answer makes no sense. OpenMP is a threading model just like POSIX threads. OpenMP didn't even have tasks for the first few versions.
  • Jeff Hammond, almost 9 years ago:
    The last paragraph of this answer is all kinds of wrong.
  • Jeff Hammond, almost 9 years ago:
    OpenMP has always supported more than data parallelism. Do you actually understand OpenMP?
  • Jeff Hammond, almost 9 years ago:
    This assumes OpenMP is implemented using Pthreads. That is not required, although generally true. If OpenMP were implemented to bare metal on a specialized architecture, it could be faster than Pthreads.
  • steffen, over 8 years ago:
    @Jeff I am not assuming that, and my answer is independent of the implementation details. OpenMP and C are more "high-level" than pthreads and assembly. That's why I believe that both my statements remain true, no matter how C and OpenMP are implemented.
  • Jeff Hammond, over 8 years ago:
    It seems you are conflating the syntactic simplicity of OpenMP with semantic burdens on the runtime. Have you compared the POSIX thread specification with the OpenMP 4 specification? In particular, have you considered what is required for pthread_create() vs pragma omp parallel {}?