Threads vs Processes in Linux

Solution 1

Linux uses a 1-1 threading model, with (to the kernel) no distinction between processes and threads -- everything is simply a runnable task. *

On Linux, the system call clone clones a task with a configurable level of sharing; the flags that control this sharing include:

  • CLONE_FILES: share the same file descriptor table (instead of creating a copy)
  • CLONE_PARENT: don't set up a parent-child relationship between the new task and the old (otherwise, child's getppid() = parent's getpid())
  • CLONE_VM: share the same memory space (instead of creating a COW copy)
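A minimal sketch of calling clone() directly through the glibc wrapper, with thread-like sharing of the address space and file descriptor table (the flag choice and the 1 MiB stack size are illustrative, not prescriptive):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int shared_value = 0;

    /* Entry point of the cloned task. Because CLONE_VM is set below,
       this write happens in the parent's address space too. */
    static int child_fn(void *arg) {
        (void)arg;
        shared_value = 42;
        return 0;
    }

    int main(void) {
        const size_t stack_size = 1024 * 1024;
        char *stack = malloc(stack_size);
        if (!stack) { perror("malloc"); return 1; }

        /* Thread-like sharing: same address space and file descriptor table.
           SIGCHLD as the termination signal makes the child reapable with
           waitpid(), like a fork() child. Stacks grow downward on common
           architectures, so pass the top of the allocation. */
        int flags = CLONE_VM | CLONE_FILES | SIGCHLD;
        pid_t pid = clone(child_fn, stack + stack_size, flags, NULL);
        if (pid == -1) { perror("clone"); free(stack); return 1; }

        waitpid(pid, NULL, 0);
        printf("child set shared_value to %d\n", shared_value);
        free(stack);
        return 0;
    }

Dropping CLONE_VM from flags would give the child a COW copy of the address space instead, and the parent would print 0.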

fork() calls clone(least sharing) and pthread_create() calls clone(most sharing). **

fork() costs a tiny bit more than pthread_create() because of copying tables and creating COW mappings for memory, but the Linux kernel developers have tried (and succeeded) in minimizing those costs.
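As a rough way to see those costs for yourself, the sketch below times N trivial fork()/waitpid() cycles against N pthread_create()/pthread_join() cycles. This is only an illustrative micro-benchmark, not the test quoted in the comments further down; compile with gcc -O2 -pthread.

    #include <pthread.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <time.h>
    #include <unistd.h>

    static void *thread_fn(void *arg) { (void)arg; return NULL; }

    static double seconds_between(struct timespec a, struct timespec b) {
        return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
    }

    int main(void) {
        enum { N = 10000 };
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < N; i++) {
            pid_t pid = fork();
            if (pid == 0) _exit(0);            /* child does nothing */
            waitpid(pid, NULL, 0);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("fork:           %.3f s for %d tasks\n", seconds_between(t0, t1), N);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < N; i++) {
            pthread_t tid;
            pthread_create(&tid, NULL, thread_fn, NULL);
            pthread_join(tid, NULL);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("pthread_create: %.3f s for %d tasks\n", seconds_between(t0, t1), N);
        return 0;
    }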

Switching between tasks that share the same memory space and various tables is a tiny bit cheaper than switching between tasks that don't, because the data may already be loaded in cache. However, switching tasks is still very fast even if nothing is shared -- this is something else that the Linux kernel developers try to ensure (and succeed at ensuring).

In fact, if you are on a multi-processor system, not sharing may actually be beneficial to performance: if each task is running on a different processor, synchronizing shared memory is expensive.


* Simplified. CLONE_THREAD causes signal delivery to be shared (which needs CLONE_SIGHAND, which shares the signal handler table).

** Simplified. There exist both SYS_fork and SYS_clone syscalls, but in the kernel, the sys_fork and sys_clone are both very thin wrappers around the same do_fork function, which itself is a thin wrapper around copy_process. Yes, the terms process, thread, and task are used rather interchangeably in the Linux kernel...

Solution 2

Linux (and indeed Unix) gives you a third option.

Option 1 - processes

Create a standalone executable which handles some part (or all parts) of your application, and invoke it separately for each process, e.g. the program runs copies of itself to delegate tasks to.

Option 2 - threads

Create a standalone executable which starts up with a single thread and creates additional threads to do some tasks.

Option 3 - fork

Only available under Linux/Unix, this is a bit different. A forked process really is its own process with its own address space - there is nothing that the child can do (normally) to affect its parent's or siblings' address space (unlike a thread) - so you get added robustness.

However, the memory pages are not copied, they are copy-on-write, so less memory is usually used than you might imagine.

Consider a web server program which consists of two steps:

  1. Read configuration and runtime data
  2. Serve page requests

If you used threads, step 1 would be done once, and step 2 done in multiple threads. If you used "traditional" processes, steps 1 and 2 would need to be repeated for each process, and the memory to store the configuration and runtime data duplicated. If you used fork(), then you can do step 1 once, and then fork(), leaving the runtime data and configuration in memory, untouched, not copied.
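A bare-bones sketch of that fork()-after-setup pattern (load_config() and serve_requests() are hypothetical placeholders, not functions from the original question):

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    struct config { int nworkers; /* ...parsed settings, runtime data... */ };

    /* Step 1: read configuration and runtime data (hypothetical). */
    static struct config load_config(void) {
        struct config c = { .nworkers = 4 };
        return c;
    }

    /* Step 2: serve page requests (hypothetical accept/handle loop). */
    static void serve_requests(const struct config *c) {
        (void)c;
    }

    int main(void) {
        struct config cfg = load_config();     /* done exactly once */

        for (int i = 0; i < cfg.nworkers; i++) {
            pid_t pid = fork();
            if (pid == 0) {                    /* worker: sees cfg via COW, nothing copied up front */
                serve_requests(&cfg);
                _exit(0);
            }
            if (pid == -1) perror("fork");
        }
        while (wait(NULL) > 0)                 /* parent just reaps the workers */
            ;
        return 0;
    }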

So there are really three choices.

Solution 3

That depends on a lot of factors. Processes are more heavy-weight than threads, and have a higher startup and shutdown cost. Interprocess communication (IPC) is also harder and slower than interthread communication.

Conversely, processes are safer and more secure than threads, because each process runs in its own virtual address space. If one process crashes or has a buffer overrun, it does not affect any other process at all, whereas if a thread crashes, it takes down all of the other threads in the process, and if a thread has a buffer overrun, it opens up a security hole in all of the threads.

So, if your application's modules can run mostly independently with little communication, you should probably use processes if you can afford the startup and shutdown costs. The performance hit of IPC will be minimal, and you'll be slightly safer against bugs and security holes. If you need every bit of performance you can get or have a lot of shared data (such as complex data structures), go with threads.
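To make the "IPC is more explicit" point concrete, here is a minimal parent/child pipe example; with threads the consumer could simply read a shared variable, whereas here the data has to be written to and read from a kernel object:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) == -1) { perror("pipe"); return 1; }

        pid_t pid = fork();
        if (pid == -1) { perror("fork"); return 1; }

        if (pid == 0) {                           /* child: producer */
            close(fds[0]);
            const char msg[] = "result: 42";
            write(fds[1], msg, sizeof msg);       /* data is explicitly sent */
            close(fds[1]);
            _exit(0);
        }

        close(fds[1]);                            /* parent: consumer */
        char buf[64] = { 0 };
        read(fds[0], buf, sizeof buf - 1);
        printf("parent received \"%s\"\n", buf);
        close(fds[0]);
        waitpid(pid, NULL, 0);
        return 0;
    }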

Solution 4

Others have discussed the considerations.

Perhaps the important difference is that in Windows processes are heavy and expensive compared to threads, and in Linux the difference is much smaller, so the equation balances at a different point.

Solution 5

Once upon a time there was Unix, and in this good old Unix there was a lot of overhead for processes. So some clever people created threads, which share the address space of their parent process and need only a reduced context switch, making context switches more efficient.

In contemporary Linux (2.6.x) there is not much difference in performance between a context switch to another process and a context switch to another thread (only the MMU work is additional for the process switch). There is the issue with the shared address space, which means that a faulty pointer in a thread can corrupt memory of the parent process or of another thread within the same address space.

A process is protected by the MMU, so a faulty pointer will just cause a signal 11 (SIGSEGV) and no corruption.

I would in general use processes (not much context switch overhead in Linux, but memory protection due to the MMU), but pthreads if I needed a real-time scheduler class, which is a different cup of tea altogether.
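If you do go the pthread route for real-time scheduling, a minimal sketch looks like the following. SCHED_FIFO needs root or CAP_SYS_NICE / a suitable RLIMIT_RTPRIO, otherwise pthread_create() returns EPERM; note also that a plain process can be put in a real-time class with sched_setscheduler(). The priority value 10 is arbitrary.

    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    static void *rt_work(void *arg) {
        (void)arg;
        /* latency-sensitive work would go here */
        return NULL;
    }

    int main(void) {
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        /* Without PTHREAD_EXPLICIT_SCHED the new thread inherits the
           creator's policy and the attributes below are ignored. */
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
        struct sched_param sp = { .sched_priority = 10 };
        pthread_attr_setschedparam(&attr, &sp);

        pthread_t tid;
        int rc = pthread_create(&tid, &attr, rt_work, NULL);
        if (rc != 0) {
            fprintf(stderr, "pthread_create: %s\n", strerror(rc));
            return 1;
        }
        pthread_join(tid, NULL);
        pthread_attr_destroy(&attr);
        return 0;
    }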

Why do you think threads have such a big performance gain on Linux? Do you have any data for this, or is it just a myth?

Comments

  • user17918
    user17918 about 2 years

    I've recently heard a few people say that in Linux, it is almost always better to use processes instead of threads, since Linux is very efficient in handling processes, and because there are so many problems (such as locking) associated with threads. However, I am suspicious, because it seems like threads could give a pretty big performance gain in some situations.

    So my question is, when faced with a situation that threads and processes could both handle pretty well, should I use processes or threads? For example, if I were writing a web server, should I use processes or threads (or a combination)?

    • mouviciel
      mouviciel about 15 years
      Is there a difference with Linux 2.4?
    • MarkR
      MarkR about 15 years
      The difference between processes and threads under Linux 2.4 is that threads share more parts of their state (address space, file handles etc) than processes, which usually don't. The NPTL under Linux 2.6 makes this a bit clearer by giving them "thread groups" which are a bit like "processes" in win32 and Solaris.
    • ephemient
      ephemient about 15 years
      Yes, NPTL is nice: it makes things like kill, exec, etc. work as you would expect in a threaded program (the old LinuxThreads behaviors made sense given the implementation, but were icky). OTOH a "thread group" is just a collection of "threads", and doesn't really take up resources itself, so it's a ton lighter-weight than an NT or Solaris process.
    • neal aise
      neal aise almost 14 years
      httpd.apache.org/docs/2.0/mod/worker.html is the default for the Apache web server. It's a multi-process, multi-threaded configuration.
    • Lutz Prechelt
      Lutz Prechelt over 8 years
      Concurrent programming is difficult. Unless you need very high performance, the most important aspect in your tradeoff will often be the difficulty of debugging. Processes make for the much easier solution in this respect, because all communication is explicit (easy to check, to log etc.). In contrast, the shared memory of threads creates gazillions of places where one thread can erroneously impact another.
    • iankit
      iankit over 8 years
      @LutzPrechelt - Concurrent programming can be multi-threaded as well as multi-process. I don't see why you are assuming concurrent programming is multi-threaded only. It might be because of some particular language limitations, but in general it can be both.
    • user2692263
      user2692263 about 7 years
      I think Lutz merely stated that concurrent programming is difficult whichever is chosen - processes or threads - but that concurrent programming using processes makes for easier debugging in many cases.
  • ephemient
    ephemient about 15 years
    I do have the rep to edit, but I don't quite agree. Context switches between processes on Linux are almost as cheap as context switches between threads.
  • ephemient
    ephemient about 15 years
    I used thread-local storage for some statistics gathering, the last time I was writing a threaded network program: each thread wrote to its own counters, no locks needed, and only when messaged would each thread combine its stats into the global totals. But yeah, TLS is not very commonly used or necessary. Shared memory, on the other hand... in addition to efficiently sending data, you can also share POSIX semaphores between processes by placing them in shared memory. It's pretty amazing.
  • user17918
    user17918 almost 15 years
    Yes, I do have some data. I ran a test that creates 100,000 processes and a test that creates 100,000 threads. The thread version ran about 9x faster (17.38 seconds for processes, 1.93 for threads). Now this does only test creation time, but for short-lived tasks, creation time can be key.
  • Rick Ellison
    Rick Ellison over 14 years
    Adam's answer would serve well as an executive briefing. For more detail, MarkR and ephemient provide good explanations. A very detailed explanation with examples may be found at cs.cf.ac.uk/Dave/C/node29.html but it does appear to be a bit dated in parts.
  • codingfreak
    codingfreak about 13 years
    @user17918 - Is it possible for you to share the code you used to calculate the above-mentioned timings?
  • MarkR
    MarkR over 12 years
    @Qwertie forking is not that cool, it breaks lots of libraries in subtle ways (if you use them in the parent process). It creates unexpected behaviour which confuses even experienced programmers.
  • Saurabh
    Saurabh about 12 years
    I think we are missing one point. If you make multiple processes for your web server, then you have to write another process to open the socket and pass 'work' to the different processes. Threading offers a single process with multiple threads - a clean design. In many situations a thread is just natural, and in other situations a new process is just natural. When the problem falls in a gray area, the other trade-offs explained by ephemient become important.
  • ephemient
    ephemient about 12 years
    @Saurabh Not really. You can easily socket, bind, listen, fork, and then have multiple processes accept connections on the same listening socket. A process can stop accepting if it's busy, and the kernel will route incoming connections to another process (if nobody is listening, kernel will queue or drop, depending on listen backlog). You don't have much more control over work distribution than that, but usually that's good enough!
  • Ehtesh Choudhury
    Ehtesh Choudhury over 11 years
    @MarkR could you give some examples or a link of how forking breaks library and creates unexpected behavior?
  • MarkR
    MarkR over 11 years
    If a process forks with an open mysql connection, bad things happen, as the socket is shared between two processes. Even if only one process uses the connection, the other stops it from being closed.
  • n611x007
    n611x007 almost 11 years
    "the data may already be loaded in cache" - what cache exactly?
  • c4f4t0r
    c4f4t0r over 10 years
    One big difference: with processes the kernel creates a page table for every process, while threads use only one page table, so I think it is normal that threads are faster than processes.
  • Russell Stuart
    Russell Stuart about 10 years
    CyberFonic's point is true for Windows. As ephemient says, under Linux processes aren't heavier. And under Linux all the mechanisms available for communication between threads (futexes, shared memory, pipes, IPC) are also available for processes and run at the same speed.
  • Lawrence Jones
    Lawrence Jones about 10 years
    Naxa, the cache that's being referred to is the page table cache. COW ensures that initially the two threads will share the same memory - i.e., each thread will point to the same physical place in memory for its program data. This means the kernel hasn't had to perform any swapping/paging as the data is already there, probably already loaded into main memory.
  • Stanimirovv
    Stanimirovv almost 10 years
    There is one thing which I do not understand from this answer: if threads and processes are the same to Linux, how and when do we achieve shared resources for the threads?
  • ephemient
    ephemient almost 10 years
    @Bloodcount All processes/threads on Linux are created by the same mechanism, which clones an existing process/thread. Flags passed to clone() determine which resources are shared. A task can also unshare() resources at any later point in time.
  • Karthik Balaguru
    Karthik Balaguru over 9 years
    A single process can contain multiple threads, so how is it true that the terms process, thread, and task are used rather interchangeably in the Linux kernel? Can you please point out where exactly this is claimed in Linux?
  • Karthik Balaguru
    Karthik Balaguru over 9 years
    Another simple way to look at it: the TCB is quite a bit smaller than the PCB, so it is obvious that a process context switch, which involves the PCB, will consume a bit more time than switching threads.
  • ephemient
    ephemient over 9 years
    @KarthikBalaguru Within the kernel itself, there is a task_struct for each task. This is often called a "process" throughout the kernel code, but it corresponds to each runnable thread. There is no process_struct; if a bunch of task_structs are linked together by their thread_group list, then they're the same "process" to userspace. There's a little bit of special handling of "thread"s, e.g. all sibling threads are stopped on fork and exec, and only the "main" thread shows up in ls /proc. Every thread is accessible via /proc/pid though, whether it's listed in /proc or not.
  • ephemient
    ephemient over 9 years
    @KarthikBalaguru The kernel supports a continuum of behavior between threads and processes; for example, clone(CLONE_THREAD | CLONE_VM | CLONE_SIGHAND) would give you a new "thread" that doesn't share working directory, files or locks, while clone(CLONE_FILES | CLONE_FS | CLONE_IO) would give you a "process" that does. The underlying system creates tasks by cloning; fork() and pthread_create() are just library functions that invoke clone() differently (as I wrote in this answer).
  • olegst
    olegst over 8 years
    What do you mean, no benefit? How about performing heavy calculations in the GUI thread? Moving them to a parallel thread will be much better from the point of view of user experience, no matter how loaded the CPU is.
  • Lie Ryan
    Lie Ryan over 8 years
    The fork() system call is specified by POSIX (which means it's available on any Unix system); if you use the underlying Linux API, which is the clone() system call, then you actually have even more choices in Linux than just the three.
  • Lelanthran
    Lelanthran almost 7 years
    @MarkR The sharing of the socket is by design. Besides, either of the processes can close the socket using linux.die.net/man/2/shutdown before calling close() on the socket.
  • batbrat
    batbrat over 5 years
    You mention that not sharing may be good on multiprocessor systems. However, just using multiple processes doesn't guarantee that we won't synchronize, especially if we use shared memory and not messaging.
  • abhiarora
    abhiarora over 4 years
    Does this answer require modification considering the latest version of the Linux kernel?
  • abhiarora
    abhiarora over 4 years
    IPC is harder to use but what if someone uses "shared memory"?
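Picking up the shared-memory question in the last comment: processes can share memory explicitly, and even POSIX semaphores work across processes if placed in that memory. A minimal sketch, assuming an anonymous MAP_SHARED mapping plus an unnamed process-shared semaphore; compile with -pthread.

    #define _GNU_SOURCE
    #include <semaphore.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    struct shared {
        sem_t lock;       /* process-shared semaphore */
        long  counter;    /* data both processes update */
    };

    int main(void) {
        /* MAP_SHARED | MAP_ANONYMOUS pages stay genuinely shared across fork(). */
        struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (s == MAP_FAILED) { perror("mmap"); return 1; }

        sem_init(&s->lock, /* pshared = */ 1, 1);
        s->counter = 0;

        pid_t pid = fork();
        if (pid == -1) { perror("fork"); return 1; }

        for (int i = 0; i < 100000; i++) {
            sem_wait(&s->lock);
            s->counter++;              /* protected across both processes */
            sem_post(&s->lock);
        }
        if (pid == 0) _exit(0);

        waitpid(pid, NULL, 0);
        printf("counter = %ld (expected 200000)\n", s->counter);
        sem_destroy(&s->lock);
        munmap(s, sizeof *s);
        return 0;
    }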