How does fork() return for the child process?

Solution 1

% man fork

RETURN VALUES

Upon successful completion, fork() returns a value of 0 to the child
process and returns the process ID of the child process to the parent
process.  Otherwise, a value of -1 is returned to the parent process, no
child process is created, and the global variable errno is set to
indicate the error.

What happens is that inside the fork system call, the entire process is duplicated. Then the fork call returns in each of the two processes. These are now separate contexts, so each one can be given a different return value.
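To make the three cases from the man page concrete, here is a minimal user-space sketch. The helper name fork_once() and the exit status 7 are my own choices for illustration, not anything from the man page or the kernel source.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once. In the parent it returns the child's
 * exit status, or -1 if fork() or waitpid() failed. The child never
 * returns from this function; it exits with an arbitrary status of 7. */
int fork_once(void)
{
    fflush(stdout);                 /* keep buffered output from being duplicated */
    pid_t pid = fork();

    if (pid == -1) {                /* failure: only the parent gets a return */
        perror("fork");
        return -1;
    }
    if (pid == 0) {                 /* child: fork() returned 0 */
        printf("child:  fork() returned 0, my pid is %ld\n", (long)getpid());
        fflush(stdout);             /* _exit() does not flush stdio buffers */
        _exit(7);
    }

    /* parent: fork() returned the child's pid */
    printf("parent: fork() returned %ld\n", (long)pid);
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called from main(), fork_once() should print one line from each process (in whatever order the scheduler picks) and return 7 in the parent.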

If you really want to know how it works at a low level, you can always check the source! The code is a bit confusing if you're not used to reading kernel code, but the inline comments give a pretty good hint as to what's going on.

The most interesting part of the source (FreeBSD's sys/kern/kern_fork.c), with an explicit answer to your question, is at the very end of the fork() definition itself:

if (error == 0) {
    td->td_retval[0] = p2->p_pid;
    td->td_retval[1] = 0;
}

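"td" is the kernel's thread structure for the thread that called fork(), that is, the parent; td_retval holds that thread's system-call return values, not a list of values for different threads. If error (returned from fork1, the "real" forking function) is 0 (no error), the parent's return value, td_retval[0], is set to the PID of p2 (the new process), and the secondary slot, td_retval[1], is set to 0. The child's return value of 0 is not set in this snippet: the child gets its own thread structure and saved registers inside fork1, and the machine-dependent code that builds them (cpu_fork() in FreeBSD, as far as I can tell) puts 0 in the child's return register.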

Solution 2

The fork() system call returns twice (unless it fails).

  • One of the returns is in the child process, and there the return value is 0.

  • The other return is in the parent process, and there the return value is positive: the PID of the child. (If the fork fails, there is only one return, in the parent, and the value is -1.)

The main differences between the parent and the child are:

  • They are separate processes
  • The value of PID is different
  • The value of PPID (parent PID) is different
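The PID and PPID differences can be checked from user space. This is a small sketch; the helper name child_ppid_matches_parent() is made up for the example, and it assumes the parent stays alive (it does, because it waits for the child).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child observed a new PID and a
 * PPID equal to the parent's PID, 0 if it did not, -1 on error. */
int child_ppid_matches_parent(void)
{
    pid_t parent_pid = getpid();    /* recorded before the fork */
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* child: its own PID is new, its PPID is the forking process */
        int ok = (getpid() != parent_pid) && (getppid() == parent_pid);
        _exit(ok ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```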

Other more obscure differences are listed in the POSIX standard.

In one sense, the how really isn't your problem: the operating system is required to achieve the result. However, the o/s clones the parent process, creating a child process which is an almost exact replica of the parent, sets the attributes that must be different to the correct new values, and usually marks the data pages as CoW (copy on write) or equivalent, so that when one process modifies a value it gets a separate copy of the page and does not interfere with the other. This is not like the deprecated (by me at least - non-standard for POSIX) vfork() system call, which you would be wise to eschew even if it is available on your system. Each process continues after the fork() as if the function had returned - so (as I said up top), the fork() system call returns twice, once in each of two processes which are near identical clones of each other.
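The visible effect of the separate copies (whether the o/s uses CoW or an eager copy underneath) can be sketched like this. The helper name and the values 1 and 2 are only for illustration.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child overwrites a variable and reports its
 * value through its exit status. Returns the parent's value of the same
 * variable afterwards, or -1 on error. */
int parent_value_after_child_write(int *child_value)
{
    int value = 1;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        value = 2;                  /* modifies only the child's copy */
        _exit(value);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (child_value != NULL && WIFEXITED(status))
        *child_value = WEXITSTATUS(status);
    return value;                   /* the parent's copy was never touched */
}
```

The parent should get 1 back while the child reports 2, because the write in the child never reaches the parent's memory.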

Solution 3

Parent and child return different values because the kernel manipulates the CPU registers in the child's context.

Each process in the Linux kernel is represented by a task_struct. A pointer to the task_struct is held in the thread_info structure, which lies at the end of the kernel-mode stack. The whole CPU context (registers) is stored in this thread_info structure.

struct thread_info {
    struct task_struct  *task;      /* main task structure */
    struct cpu_context_save cpu_context;    /* cpu context */
    /* (other fields omitted) */
};

The fork() and clone() system calls all call the same kernel function, do_fork():

long do_fork(unsigned long clone_flags,
          unsigned long stack_start,
          struct pt_regs *regs,
          unsigned long stack_size,
          int __user *parent_tidptr,
          int __user *child_tidptr)

Here is the sequence of execution:

do_fork() -> copy_process() -> copy_thread() (copy_thread() is an architecture-specific function)

copy_thread() copies the register values from the parent and changes the return value to 0 (shown here for ARM):

struct pt_regs *childregs = task_pt_regs(p);
*childregs = *regs; /* copy the register values from the parent process */
childregs->ARM_r0 = 0; /* change the child's return value to 0 */
thread->cpu_context.sp = (unsigned long)childregs; /* write the values back to thread_info */
thread->cpu_context.pc = (unsigned long)ret_from_fork; /* the child resumes in ret_from_fork */

When the child gets scheduled, it executes the assembly routine ret_from_fork(), which returns to user space with 0 in r0, so the child sees fork() return zero. The parent gets its return value from do_fork(), which is the PID of the new process:

nr = task_pid_vnr(p);
return nr;
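The idea can be modelled in plain user-space C. This is only a toy: struct toy_regs, toy_copy_thread() and toy_fork() are hypothetical names of mine, not kernel types or functions, and real register sets are much larger.

```c
/* Toy stand-in for the saved user registers; NOT the real struct pt_regs. */
struct toy_regs {
    long r0;    /* return-value register (r0 on ARM) */
    long pc;    /* where user code resumes */
    long sp;    /* user stack pointer */
};

/* Models the copy_thread() step: the child starts with a copy of the
 * parent's registers, except that the return-value register is 0. */
struct toy_regs toy_copy_thread(const struct toy_regs *parent)
{
    struct toy_regs child = *parent;
    child.r0 = 0;
    return child;
}

/* Models the whole fork: the child's registers are built first, then the
 * parent's return-value register receives the new process's PID. */
void toy_fork(struct toy_regs *parent, struct toy_regs *child, long child_pid)
{
    *child = toy_copy_thread(parent);
    parent->r0 = child_pid;
}
```

After toy_fork(), both register sets have the same pc and sp, so both "processes" resume at the same instruction, and only r0 differs.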

Solution 4

Steven Schlansker's answer is quite good, but just to add some more detail:

Every executing process has an associated context (hence "context switching") - this context includes, among other things, the process's code segment (containing the machine instructions), its heap memory, its stack, and its register contents. When a context switch occurs, the context from the old process is saved, and the context from the new process is loaded.

The location for a return value is defined by the ABI, to allow code interoperability. If I am writing ASM code for my x86-64 processor, and I call into the C runtime, I know that the return value is going to show up in the RAX register.

Putting these two things together, the logical conclusion is that the call to int pid = fork() results in two contexts where the next instruction to execute in each one is the one that moves the value of RAX (the return value from the fork call) into the local variable pid. Of course, only one process can execute at a time on a single CPU, so the order in which these "returns" happen will be determined by the scheduler.
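One way to observe this from user space is to have the child send the value it found in pid back to the parent over a pipe. A rough sketch, with a made-up helper name:

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the value the child saw in `pid`, and stores
 * the value the parent saw in *seen_by_parent. Returns -1 on error. */
long fork_value_seen_by_child(long *seen_by_parent)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();             /* both contexts resume by storing into pid */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: report what fork() returned here */
        long seen = (long)pid;
        close(fds[0]);
        ssize_t n = write(fds[1], &seen, sizeof seen);
        _exit(n == (ssize_t)sizeof seen ? 0 : 1);
    }

    close(fds[1]);                  /* parent: read the child's report */
    long from_child = -1;
    ssize_t n = read(fds[0], &from_child, sizeof from_child);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n != (ssize_t)sizeof from_child)
        return -1;
    if (seen_by_parent != NULL)
        *seen_by_parent = (long)pid;
    return from_child;
}
```

The same line of source assigns 0 to pid in the child and a positive PID in the parent, which is exactly the "two returns" described above.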

Solution 5

I will try to answer from the process memory layout point of view. Guys, please correct me if anything is wrong or inaccurate.

In traditional Unix, fork() is the only system call for process creation (the very first process, process 0, is the exception), so the question is really what happens in the kernel when a process is created. Two kernel data structures are related to a process: struct proc, an entry in the proc array (aka process table), and struct user (aka u area).

To create a new process, these two data structures have to be created and filled in properly. The straightforward way is to copy them from the creator's (parent's) proc entry and u area. Most data is duplicated between parent and child (e.g., the code segment), except for the value in the return register (e.g. EAX on 80x86), which holds the child's pid for the parent and 0 for the child. From then on, you have two processes (the existing one and the new one) run by the scheduler, and when each is scheduled it returns its own value.

Author: EpsilonVector

Updated on July 05, 2022

Comments

  • EpsilonVector
    EpsilonVector almost 2 years

    I know that fork() returns differently for the child and parent processes, but I'm unable to find information on how this happens. How does the child process receive the return value 0 from fork? And what is the difference in regards to the call stack? As I understand it, for the parent it goes something like this:

    parent process--invokes fork-->system_call--calls fork-->fork executes--returns to-->system_call--returns to-->parent process.

    What happens in the child process?

  • EpsilonVector
    EpsilonVector about 14 years
    Yes, but how does the different return values thing work? How can a function return two different values?
  • Jonathan Leffler
    Jonathan Leffler about 14 years
    I wish the vfork() fan boy would explain why they gave a down vote.
  • Amber
    Amber about 14 years
    Because the core kernel code that supports fork takes care of it - there are two separate returns going on, and thus each one can have a different value.
  • Siu Ching Pong -Asuka Kenji-
    Siu Ching Pong -Asuka Kenji- about 14 years
    Nice explanation and nice link to the latest POSIX standard (Issue 7)!
  • osgx
    osgx almost 10 years
    Is this sys/kern/kern_fork.c - Linux source?
  • jxh
    jxh over 9 years
    @osgx: No, the quoted source is from FreeBSD.
  • programmersn
    programmersn almost 2 years
    Finally an answer that actually addresses OP's question. Thanks mate