What happens internally in the Linux kernel when a process or thread is created

In our previous post, we learned that Linux implements threads as standard processes. In this post, let's see how they are implemented internally.

In C, we use fork() to create a new process and pthread_create() to create a new thread. On Linux, both are implemented internally with the clone system call provided by the kernel.

Process Creation example:

#include <stdio.h>
#include <unistd.h>   /* fork() */

int main(void)
{
    fork();   /* create a child process */
    return 0;
}

To find the system call, run: strace ./fork

clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fec70651a10) = 4269

Thread Creation Example:

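#include <stdio.h>
#include <pthread.h>

/* Thread entry point: does nothing and returns. */
void *func(void *arg)
{
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t thread;

    pthread_create(&thread, NULL, func, NULL);
    pthread_join(thread, NULL);   /* wait for the thread before exiting */
    return 0;
}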

To find the system call: strace ./thread

clone(child_stack=0x7f74bf7a2fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f74bf7a39d0, tls=0x7f74bf7a3700, child_tidptr=0x7f74bf7a39d0) = 4361

You can see from the strace output that both process and thread creation use the clone system call; only the arguments differ.

To understand what is shared and what is not, look at the flags passed to clone when the thread was created.

Flags passed to clone while creating Thread:

1. CLONE_VM: Virtual memory is shared between the calling process and the child process. That is, if either the calling process or the child modifies memory, the change is visible in the other.


You can see in kernel/fork.c that if CLONE_VM is set, the new process is assigned the same struct mm_struct as its parent; otherwise the mm_struct is duplicated and the two processes have separate virtual memory. A process address space is represented by struct mm_struct in the Linux kernel.

Example Code to demonstrate the behavior of clone with and without CLONE_VM:
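A minimal sketch of such a comparison, assuming glibc's clone() wrapper on Linux. The helper name value_seen_by_parent, the global shared_value and the 1 MiB stack size are hypothetical choices made for illustration; they are not taken from the original example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* assumed child stack size */

/* Written by the child, read by the parent after the child exits. */
static volatile int shared_value;

/* Child entry point: overwrite the global. */
static int child_fn(void *arg)
{
    (void)arg;
    shared_value = 42;
    return 0;
}

/*
 * Hypothetical helper: clone a child with extra_flags, wait for it and
 * return the value the parent then sees in shared_value (-1 on error).
 */
int value_seen_by_parent(int extra_flags)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    shared_value = 0;
    /* The stack grows downwards on most architectures, so pass its top. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    return shared_value;
}
```

Calling value_seen_by_parent(CLONE_VM) from main() should return 42, because the child wrote into the parent's own memory. Calling value_seen_by_parent(0) should return 0, because the child only modified its private copy.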


2. CLONE_FS: File system attributes are shared, e.g. the root directory, the current working directory and the file mode creation mask (umask).


You can see that with the flag set, only the user count in struct fs_struct is incremented.

Example code to demonstrate the behavior of CLONE_FS when set or cleared in flags field of clone system call.
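One way to observe this is to let the child change its working directory and then check the parent's. The sketch below assumes glibc's clone() wrapper and that /tmp exists; the helper name parent_cwd_follows_child is hypothetical.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)   /* assumed child stack size */

/* Child entry point: move to the root directory. */
static int child_fn(void *arg)
{
    (void)arg;
    return chdir("/") == 0 ? 0 : 1;
}

/*
 * Hypothetical helper: returns 1 if the child's chdir("/") also changed
 * the parent's working directory, 0 if it did not, -1 on error.
 */
int parent_cwd_follows_child(int extra_flags)
{
    char before[PATH_MAX], after[PATH_MAX];
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Start somewhere other than "/" so that a change is observable.
     * Assumption: /tmp exists on this system. */
    if (chdir("/tmp") == -1 || getcwd(before, sizeof before) == NULL) {
        free(stack);
        return -1;
    }

    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);

    if (getcwd(after, sizeof after) == NULL)
        return -1;
    return strcmp(before, after) != 0;
}
```

With CLONE_FS the helper should return 1, since parent and child share one fs_struct. Without it the helper should return 0, since the child changed only its own copy.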


3. CLONE_FILES: If set, parent and child share the table of open file descriptors. File descriptors are the values returned by open, socket, pipe, etc. A descriptor opened or closed in the child is also opened or closed in the parent.


In the kernel implementation, the reference count of the descriptor table (struct files_struct) is incremented when CLONE_FILES is set; otherwise the table is duplicated and modifications in either process do not affect the other.


Example code to demonstrate the behavior of CLONE_FILES when set or cleared in flags field of clone system call.
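A small sketch of this, assuming glibc's clone() wrapper: the parent opens /dev/null, the child closes that descriptor, and the parent then checks whether its own descriptor is still valid. The helper name fd_closed_in_parent is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)   /* assumed child stack size */

/* Child entry point: close the descriptor whose number is passed in. */
static int child_fn(void *arg)
{
    int fd = *(int *)arg;
    return close(fd) == 0 ? 0 : 1;
}

/*
 * Hypothetical helper: returns 1 if a close() in the child also closed
 * the descriptor in the parent, 0 if it did not, -1 on error.
 */
int fd_closed_in_parent(int extra_flags)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    int fd = open("/dev/null", O_RDONLY);
    if (fd == -1) {
        free(stack);
        return -1;
    }

    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, &fd);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        close(fd);
        free(stack);
        return -1;
    }
    free(stack);

    /* F_GETFD fails with EBADF when fd is no longer open in this process. */
    int closed = (fcntl(fd, F_GETFD) == -1 && errno == EBADF);
    if (!closed)
        close(fd);
    return closed;
}
```

With CLONE_FILES the helper should return 1, because both processes use the same descriptor table. Without the flag it should return 0, because the child closed only its duplicate.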


4. CLONE_SIGHAND: If set, parent and child share the same signal handler table. A handler changed by one process is changed for the other as well. Since Linux 2.6.0, CLONE_SIGHAND must be used together with CLONE_VM.



In the kernel implementation, when the CLONE_SIGHAND flag is set, the reference count of the struct sighand_struct (signal handler table) is incremented; otherwise memory is allocated for a new signal handler table and the parent's table is copied into it.

Example Code to demonstrate the behavior of CLONE_SIGHAND when set or cleared in flags field of clone system call.

https://embeddedguruji.blogspot.com/2018/12/clonesighand-example.html
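As a self-contained alternative to the linked example, the sketch below lets the child set SIGUSR1 to SIG_IGN and then checks the parent's disposition. It assumes glibc's clone() wrapper. CLONE_VM is passed in both cases because the kernel rejects CLONE_SIGHAND without it; the helper name parent_sees_child_handler is hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* assumed child stack size */

/* Child entry point: tell the kernel to ignore SIGUSR1. */
static int child_fn(void *arg)
{
    struct sigaction sa;

    (void)arg;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    return sigaction(SIGUSR1, &sa, NULL) == 0 ? 0 : 1;
}

/*
 * Hypothetical helper: returns 1 if the child's sigaction() call changed
 * the parent's SIGUSR1 disposition, 0 if it did not, -1 on error.
 */
int parent_sees_child_handler(int extra_flags)
{
    struct sigaction sa;
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Reset to the default disposition so that a change is observable. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_DFL;
    if (sigaction(SIGUSR1, &sa, NULL) == -1) {
        free(stack);
        return -1;
    }

    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      CLONE_VM | extra_flags | SIGCHLD, NULL);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);

    /* Query the parent's current disposition without modifying it. */
    if (sigaction(SIGUSR1, NULL, &sa) == -1)
        return -1;
    return sa.sa_handler == SIG_IGN;
}
```

parent_sees_child_handler(CLONE_SIGHAND) should return 1 and parent_sees_child_handler(0) should return 0, matching the shared versus copied sighand_struct described above.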


CLONE_THREAD: The child is placed in the same thread group as the parent. This means getpid() returns the same value (the thread group ID) in parent and child, while each has a different thread ID. You can get the thread ID by calling gettid().
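The effect can be checked from a thread created with pthread_create, which passes CLONE_THREAD as the strace output shows. This is a sketch only: struct ids and thread_ids are hypothetical names, and the raw SYS_gettid system call is used on the assumption that older glibc versions do not provide a gettid() wrapper. Build with -pthread.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the IDs observed inside the new thread. */
struct ids {
    pid_t pid;   /* getpid(): the thread group ID */
    pid_t tid;   /* gettid(): the ID of this particular thread */
};

/* Thread entry point: record both IDs as seen from the new thread. */
static void *record_ids(void *arg)
{
    struct ids *out = arg;

    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Hypothetical helper: run a thread and fill *out; returns 0 on success. */
int thread_ids(struct ids *out)
{
    pthread_t thread;

    if (pthread_create(&thread, NULL, record_ids, out) != 0)
        return -1;
    return pthread_join(thread, NULL) == 0 ? 0 : -1;
}
```

After a successful call, out->pid should equal the caller's getpid(), while out->tid should differ from the caller's own thread ID.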


CLONE_SYSVSEM: Parent and child share a single list of System V semaphore undo (semadj) values.

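CLONE_PARENT_SETTID, CLONE_CHILD_CLEARTID, CLONE_CHILD_SETTID: These flags control where the kernel stores the child's thread ID. CLONE_CHILD_SETTID and CLONE_CHILD_CLEARTID operate on the set_child_tid and clear_child_tid fields of the child's task_struct.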


CLONE_PARENT_SETTID -> If set, the kernel writes the thread ID of the child into the location pointed to by ptid (parent_tidptr in the strace output).

CLONE_CHILD_CLEARTID -> If set, the kernel zeroes the memory location pointed to by ctid (child_tidptr in the strace output) when the child terminates, and wakes any futex waiter on that address.
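Both behaviors can be observed with a short sketch, assuming glibc's clone() wrapper, which takes ptid, tls and ctid as optional trailing arguments. CLONE_VM is added so that the parent can see the kernel's write to ctid. The names check_tid_slots, ptid_slot and ctid_slot are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* assumed child stack size */

/* Slots written by the kernel, hence volatile. */
static volatile pid_t ptid_slot;
static volatile pid_t ctid_slot;

/* Child entry point: exit immediately. */
static int child_fn(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Hypothetical helper: returns 1 if ptid_slot received the child's ID and
 * ctid_slot was zeroed when the child exited, 0 if not, -1 on error.
 */
int check_tid_slots(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    ptid_slot = 0;
    ctid_slot = -1;   /* sentinel: stays nonzero until the child exits */

    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      CLONE_VM | CLONE_PARENT_SETTID |
                      CLONE_CHILD_CLEARTID | SIGCHLD,
                      NULL, (pid_t *)&ptid_slot, NULL, (pid_t *)&ctid_slot);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    /* The kernel stores the child's ID before clone() returns. */
    int ptid_ok = (ptid_slot == pid);

    if (waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);

    /* The kernel cleared the slot while the child was exiting. */
    int ctid_ok = (ctid_slot == 0);
    return ptid_ok && ctid_ok;
}
```

This mirrors what the thread library relies on: in the strace output above, parent_tidptr and child_tidptr point at the same address, so one location is filled in at creation and cleared again when the thread exits.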

SIGCHLD -> The signal sent to the parent when the child process terminates. It is passed in the low byte of the flags argument.
