Set CPU affinity when creating a thread

29,984

Solution 1

I am sorry to be the "myth buster" here, but setting thread affinity matters a great deal, and it matters more over time as the systems we all use become more and more NUMA (Non-Uniform Memory Access) by nature. Even a trivial dual-socket server these days has RAM connected separately to each socket, and the difference between a socket accessing its own RAM and accessing the RAM of the neighboring socket (remote RAM) is substantial. Processors are also reaching the market in which the internal set of cores is itself NUMA (separate memory controllers for separate groups of cores, etc.). There is no need for me to repeat the work of others here; just search for "NUMA and thread affinity" online and you can learn from years of experience of other engineers.

Not setting thread affinity is effectively equal to "hoping" that the OS scheduler will handle thread placement correctly. Let me explain: you have a system with several NUMA nodes (processing and memory domains). You start a thread, and the thread does some work with memory, e.g. it mallocs some memory and then processes it. A modern OS (at least Linux, others probably too) does a good job up to this point: by default, the memory is allocated (if available) from the same domain as the CPU where the thread is running. Sooner or later, the time-sharing OS (every modern OS) will put the thread to sleep. When the thread becomes runnable again, it may be scheduled on any of the cores in the system (as you did not set an affinity mask for it), and the larger your system is, the higher the chance that it will be "woken up" on a CPU that is remote from the memory it previously allocated or used. From then on, all of its memory accesses are remote. (Not sure what this means for your application's performance? Read more about remote memory access on NUMA systems online.)
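To see what "no affinity mask" means in practice, here is a minimal sketch that lists the CPUs the calling thread is currently allowed to run on. It assumes Linux with glibc; the helper name allowed_cpus is made up for illustration, and the fixed-size cpu_set_t used here only covers CPU numbers below CPU_SETSIZE.

```cpp
#include <sched.h>  // sched_getaffinity, cpu_set_t, CPU_* macros (Linux/glibc)
#include <vector>

// Hypothetical helper: returns the CPUs the calling thread may be scheduled on.
// An empty result means the query failed.
std::vector<int> allowed_cpus() {
  cpu_set_t set;
  CPU_ZERO(&set);
  // A pid of 0 means "the calling thread".
  if (sched_getaffinity(0, sizeof(set), &set) != 0) return {};
  std::vector<int> cpus;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &set)) cpus.push_back(cpu);
  }
  return cpus;
}
```

On a thread that was never pinned, and in a process that was not restricted from outside (for example by taskset or a container cpuset), the result would typically contain every online CPU, which is exactly the freedom the scheduler has when it wakes the thread up.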

So, to summarize, affinity-setting interfaces are VERY important when running code on systems that have a more-than-trivial architecture, which is rapidly becoming "any system" these days. Some threading runtimes and libraries allow this to be controlled at run time without any specific programming (see OpenMP, for example the KMP_AFFINITY environment variable in Intel's implementation), and it would be the right thing for C++11 implementers to include similar mechanisms in their runtime libraries and language options. Until then, if your code is meant to run on servers, I strongly recommend that you implement affinity control in your code.

Solution 2

Yes, there is a way to do it. I came across this method in a blog post by Eli Bendersky.

I adapted the code from that post. You can save the code below to test.cpp, then compile and run it:

// g++ -std=c++11 -pthread ./test.cpp && ./a.out
//
#include <chrono>
#include <thread>
#include <vector>
#include <iostream>
#include <mutex>
#include <sched.h>
#include <pthread.h>
int main(int argc, const char** argv) {
  constexpr unsigned num_threads = 4;
  // A mutex ensures orderly access to std::cout from multiple threads.
  std::mutex iomutex;
  std::vector<std::thread> threads(num_threads);
  for (unsigned i = 0; i < num_threads; ++i) {
    threads[i] = std::thread([&iomutex, i] {
      // Create a cpu_set_t object representing a set of CPUs. Clear it and mark
      // only CPU i as set.
      cpu_set_t cpuset;
      CPU_ZERO(&cpuset);
      CPU_SET(i, &cpuset);
      // Pin the calling thread itself. pthread_self() avoids reading
      // threads[i] while main() may still be assigning to it (a data race).
      int rc = pthread_setaffinity_np(pthread_self(),
                                      sizeof(cpu_set_t), &cpuset);
      if (rc != 0) {
        std::cerr << "Error calling pthread_setaffinity_np: " << rc << "\n";
      }
      std::this_thread::sleep_for(std::chrono::milliseconds(20));
      while (1) {
        {
          // Use a lexical scope and lock_guard to safely lock the mutex only
          // for the duration of std::cout usage.
          std::lock_guard<std::mutex> iolock(iomutex);
          std::cout << "Thread #" << i << ": on CPU " << sched_getcpu() << "\n";
        }

        // Simulate important work done by the thread by sleeping for a bit...
        std::this_thread::sleep_for(std::chrono::milliseconds(900));
      }
    });


  }

  for (auto& t : threads) {
    t.join();
  }
  return 0;
}

Solution 3

In C++11 you cannot set the thread affinity when the thread is created (unless the function being run in the thread does it on its own), but once the thread is created, you can set the affinity through whatever native interface your platform provides by getting the native handle for the thread (thread.native_handle()). On Linux, you can get the pthread id via:

pthread_t my_thread_native = my_thread.native_handle();

Then you can use any of the pthread calls passing in my_thread_native where it wants the pthread thread id.

Note that most thread facilities are implementation specific: pthreads, Windows threads, and the native threads of other OSes all have their own interfaces and types, so this portion of your code will not be very portable.
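If the affinity has to be in place before the thread runs its first instruction, one non-portable option is to drop below std::thread and create the thread with pthread_create, passing attributes prepared with the GNU extension pthread_attr_setaffinity_np. The following is only a sketch under that assumption (Linux with glibc); the name create_pinned_thread is made up for illustration.

```cpp
#include <pthread.h>  // pthread_create, pthread_attr_setaffinity_np (GNU extension)
#include <sched.h>    // cpu_set_t, CPU_ZERO, CPU_SET

// Hypothetical helper: start fn(arg) on a new thread whose affinity mask
// contains only `cpu`. Returns 0 on success, otherwise a pthread error number.
int create_pinned_thread(pthread_t* out, int cpu,
                         void* (*fn)(void*), void* arg) {
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(cpu, &set);

  pthread_attr_t attr;
  int rc = pthread_attr_init(&attr);
  if (rc != 0) return rc;

  // The mask is stored in the attributes and applied when the thread is created.
  rc = pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
  if (rc == 0) rc = pthread_create(out, &attr, fn, arg);

  pthread_attr_destroy(&attr);
  return rc;
}
```

The returned pthread_t is joined with pthread_join as usual. The trade-off is that you give up the std::thread wrapper for that thread; having the thread function pin itself as its first action is the alternative that stays within std::thread.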

Author: Peng Zhang

Junior Developer, C/C++ and Algorithm. Master program working on probabilistic methods and graph theory.

Updated on June 01, 2021

Comments

  • Peng Zhang almost 3 years

    I want to create a C++11 thread and have it run on my first core. I found that pthread_setaffinity_np and sched_setaffinity can change the CPU affinity of a thread and migrate it to the specified CPU. However, they only change the affinity after the thread has already started running.

    How can I create a C++11 thread with specific CPU affinity (a cpu_set_t object)?

    If it is impossible to specify the affinity when initializing a C++11 thread, how can I do it with pthread_t in C?

    My environment is G++ on Ubuntu. A piece of code is appreciated.