The concept of a process is fundamental to any multiprogramming operating system. A process
is usually defined as an instance of a program in execution; thus, if 16 users are running vi at once,
there are 16 separate processes (although they can share the same executable code). Processes
are often called tasks or threads in the Linux source code.
In this chapter, we discuss static properties of processes and then describe how process switching
is performed by the kernel. The last two sections describe how processes can be created and destroyed.
We also describe how Linux supports multithreaded applications: it relies on so-called lightweight
processes (LWP).
Processes, Lightweight Processes, and Threads
The term "process" is often used with several different meanings. In this book, we stick to the usual
OS textbook definition: a process is an instance of a program in execution. You might think of it as the
collection of data structures that fully describes how far the execution of the program has progressed.
Processes are like human beings: they are generated, they have a more or less significant life, they
optionally generate one or more child processes, and eventually they die. A small difference is that
sex is not really common among processes --- each process has just one parent.
From the kernel's point of view, the purpose of a process is to act as an entity to which system resources
(CPU time, memory, etc.) are allocated.
When a process is created, it is almost identical to its parent. It receives a logical copy of the parent's
address space and executes the same code as the parent, beginning at the next instruction following
the process creation system call. Although the parent and child may share the pages containing the program code
(text), they have separate copies of the data (stack and heap), so that changes by the child to a memory
location are invisible to the parent (and vice versa).
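The copy semantics described above can be observed from User Mode. The following is a minimal sketch, assuming the process creation system call is reached through the standard fork() library function; the helper name parent_value_after_child_write() is hypothetical and exists only for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, lets the child overwrite a variable, and
 * returns the value the parent still sees afterwards (-1 on error). */
int parent_value_after_child_write(void)
{
    int value = 1;          /* lives in the parent's stack */
    pid_t pid = fork();     /* the child resumes right after this call */

    if (pid < 0)
        return -1;          /* fork() failed */

    if (pid == 0) {
        value = 42;         /* modifies only the child's private copy */
        _exit(0);           /* terminate the child immediately */
    }

    /* Parent: wait until the child has finished its write and exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return value;           /* still 1: the child's write is invisible here */
}
```

Calling the helper from the parent should yield 1, since the assignment of 42 touches only the child's copy of the stack.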
While earlier Unix kernels employed this simple model, modern Unix systems do not. They support
multithreaded applications --- user programs having many relatively independent execution flows sharing
a large portion of the application data structures. In such systems, a process is composed of several user
threads (or simply threads), each of which represents an execution flow of the process. Nowadays, most
multithreaded applications are written using standard sets of library functions called pthread (POSIX thread) libraries.
Old versions of the Linux kernel offered no support for multithreaded applications. From the kernel's point of
view, a multithreaded application was just a normal process. The multiple execution flows of a multithreaded
application were created, handled, and scheduled entirely in User Mode, usually by means of a POSIX-compliant
pthread library.
However, such an implementation of multithreaded applications is not very satisfactory. For instance, suppose
a chess program uses two threads: one of them controls the graphical chessboard, waiting for the moves of the
human player and showing the moves of the computer, while the other thread ponders the next move of the
game. While the first thread waits for the human move, the second thread should run continuously, thus exploiting
the thinking time of the human player. However, if the chess program is just a single process, the first thread cannot
simply issue a blocking system call waiting for a user action; otherwise, the second thread is blocked as well.
Instead, the first thread must employ sophisticated nonblocking techniques to ensure that the process remains
runnable.
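To give a feel for what such nonblocking techniques involve, here is a rough sketch of a single execution flow that alternates between checking for input and doing one unit of work. It assumes the user's moves arrive on a file descriptor that can be switched to O_NONBLOCK mode; the function names try_read_move() and think_until_move() and the idea of counting "thinking steps" are hypothetical and chosen only for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to fetch one move without sleeping.
 * Returns 1 if a move was read, 0 if none is available yet, -1 on error or EOF. */
static int try_read_move(int fd, char *move)
{
    ssize_t n = read(fd, move, 1);

    if (n == 1)
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;           /* nothing to read; the process stays runnable */
    return -1;
}

/* Interleave polling and "thinking" in one execution flow.
 * Returns the number of thinking steps performed before a move arrived
 * (or max_steps if none arrived), or -1 on error. */
long think_until_move(int fd, long max_steps, char *move)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    long steps = 0;
    for (;;) {
        int got = try_read_move(fd, move);

        if (got < 0)
            return -1;
        if (got == 1 || steps == max_steps)
            return steps;
        steps++;            /* stands in for one unit of move pondering */
    }
}
```

Even in this toy form, the input check has to be woven into the computation loop by hand, which is the burden the paragraph above refers to.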
Linux uses lightweight processes to offer better support for multithreaded applications. Basically, two lightweight
processes may share some resources, like the address space, the open files, and so on. Whenever one of them
modifies a shared resource, the other immediately sees the change. Of course, the two processes must synchronize
themselves when accessing the shared resource.
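As a rough illustration of this sharing, the sketch below assumes the glibc clone() wrapper with the CLONE_VM flag as one way to create a lightweight process that shares its parent's address space; the text above does not name a specific call, so treat this choice as an assumption. The names shared_counter, lwp_body(), and value_seen_after_lwp_write() are hypothetical. Here waitpid() serves as the (very coarse) synchronization between the two processes.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

static int shared_counter;      /* lives in the shared address space */

/* Body of the lightweight process: writes to the shared variable. */
static int lwp_body(void *arg)
{
    (void)arg;
    shared_counter = 42;
    return 0;
}

/* Returns the value the parent sees after the lightweight process has
 * written to shared_counter, or -1 on error. */
int value_seen_after_lwp_write(void)
{
    const size_t stack_size = 64 * 1024;
    char *stack = malloc(stack_size);   /* the new process needs its own stack */

    if (stack == NULL)
        return -1;

    shared_counter = 0;

    /* Pass the top of the buffer, assuming a stack that grows downward.
     * CLONE_VM shares the address space; SIGCHLD makes the new process
     * waitable with waitpid(). */
    pid_t pid = clone(lwp_body, stack + stack_size, CLONE_VM | SIGCHLD, NULL);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status;
    int rc = waitpid(pid, &status, 0);  /* wait for the write to complete */
    free(stack);

    return rc < 0 ? -1 : shared_counter;
}
```

Unlike a child created with a plain process creation call, the lightweight process here changes the very variable the parent reads, so the helper should return 42.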
A straightforward way to implement multithreaded applications is to associate a lightweight process with each
thread. In this way, the threads can access the same set of application data structures by simply sharing the same
memory address space, the same set of open files, and so on; at the same time, each thread can be scheduled
independently by the kernel so that one may sleep while another remains runnable.
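The chess scenario discussed earlier can be sketched with kernel-scheduled threads. This is a minimal illustration, assuming a POSIX pthread library whose threads are each backed by a kernel-schedulable entity, and using a pipe as a stand-in for the human player's input; all names (board_thread(), thinker_thread(), ponder_while_board_waits()) are hypothetical.

```c
#include <pthread.h>
#include <unistd.h>

#define POSITIONS_TO_EVALUATE 1000000L

static int move_pipe[2];            /* stand-in for the user's input device */
static long positions_evaluated;    /* written by the thinker thread only */
static char received_move;          /* written by the board thread only */

/* Board thread: sleeps in a blocking read() until a move arrives. */
static void *board_thread(void *arg)
{
    (void)arg;
    if (read(move_pipe[0], &received_move, 1) != 1)
        received_move = '?';
    return NULL;
}

/* Thinker thread: keeps computing while the board thread is asleep. */
static void *thinker_thread(void *arg)
{
    (void)arg;
    for (long i = 0; i < POSITIONS_TO_EVALUATE; i++)
        positions_evaluated++;
    return NULL;
}

/* Returns the number of positions evaluated before the move was
 * delivered, or -1 on error. */
long ponder_while_board_waits(void)
{
    pthread_t board, thinker;
    long result = -1;

    positions_evaluated = 0;
    received_move = 0;
    if (pipe(move_pipe) != 0)
        return -1;

    if (pthread_create(&board, NULL, board_thread, NULL) != 0) {
        close(move_pipe[0]);
        close(move_pipe[1]);
        return -1;
    }

    /* The thinker runs to completion although the board thread cannot
     * return from read() until a move is written below. */
    if (pthread_create(&thinker, NULL, thinker_thread, NULL) == 0) {
        pthread_join(thinker, NULL);
        result = positions_evaluated;
    }

    /* The "human" finally moves: wake the board thread. */
    if (write(move_pipe[1], "e", 1) != 1)
        result = -1;
    close(move_pipe[1]);            /* EOF also wakes the reader on failure */
    pthread_join(board, NULL);
    close(move_pipe[0]);

    return received_move == 'e' ? result : -1;
}
```

The board thread uses an ordinary blocking read(), with no nonblocking tricks, and the thinker still finishes its work. Depending on the toolchain, the program may need to be built with the -pthread option.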