5 Operating System A program that controls the execution of application programs An interface between applications and hardware Main objectives of an OS: Convenience Efficiency Ability to evolve
6 Services Provided by the Operating System Program development Editors and debuggers Program execution OS handles scheduling of numerous tasks required to execute a program Access I/O devices Each device will have unique interface OS presents standard interface to users
7 Controlled access to files Accessing different media but presenting a common interface to users Provides protection in multi-access systems System access Controls access to the system and its resources Services Provided by the Operating System
8 Role of an OS A computer is a set of resources for the movement, storage, and processing of data. The OS is responsible for managing these resources.
9 Modes of Operation User Mode Less-privileged mode User program executes in user mode Certain areas of memory protected from user access Certain instructions may not be executed System Mode More-privileged mode Kernel of the operating system Privileged instructions may be executed, all memory accessible.
10 Uniprogramming Processor must wait for I/O instruction to complete before proceeding
11 Multiprogramming When one job needs to wait for I/O, the processor can switch to the other job
13 Time Sharing Systems Using multiprogramming to handle multiple interactive jobs Processor s time is shared among multiple users Multiple users simultaneously access the system through terminals
14 Problems and Issues Multiple jobs in memory must be protected from each other s data. File system must be protected so that only authorised users can access. Contention for resources must be handled Printers, storage etc
15 OS Manages Execution of Applications Resources are made available to multiple applications The processor is switched among multiple application The processor and I/O devices can be used efficiently
17 What is a process ? Fundamental to the structure of OS s A process is: a program in execution an instance of a program running on a computer the entity that can be assigned to and executed on a processor a unit of activity characterized by the execution of a sequence of instructions (a single sequential thread of execution), a current state, and an associated set of system instructions
18 Components of a Process A process consists of An executable program Associated data needed by the program Execution context of the program (or process state ) The execution context contains all information the operating system needs to manage the process
20 Process Elements While the process is running it has a number of elements stored in a data structure, typically called Process Control Block which includes : Identifier (a unique identifier associated with this process to distinguish it from all other processes) State (if the process is currently executing, it is in the running state) Priority (priority level relative to other processes) Program counter (address of the next instruction in the program to be executed) Memory pointers (include pointers to the program code and data associated with this process, plus any memory blocks shared with other processes) Context data (these are data that are present in registers in the processor while the process is executing) I/O status information (includes outstanding I/O requests, I/O devices assigned to this process, a list of files in use by the process, and so on) Accounting information (may include the amount of processor time and clock time used, time limits, account numbers, and so on)
21 Process Control Block Contains the process elements Created and managed by the operating system Allows support for multiple processes
22 Trace of the Process The behavior of an individual process is shown by listing the sequence of instructions that are executed. This list is called a Trace Dispatcher is a small program which switches the processor from one process to another
23 Process Execution Consider three processes being executed All are in memory (plus the dispatcher)
24 Trace from the processes point of view Each process runs to completion
25 Traces from Processor point of view Timeout I/O Timeout
26 Process Creation Once the OS decides to create a new process it: Assigns a unique process identifier Allocates space for the process Initializes process control block Sets up appropriate linkages Creates or expand other data structures
27 Events which cause process creation: System initialization When an operating system is booted, typically several processes are created. Execution of a process creation system call by a running process. a running process will issue system calls to create one or more new processes to help it to do its job. A user request to create a new process In interactive systems, user can start a program by typing a command or double clicking an icon Initiation of a batch job In a batch system, users can submit batch jobs to the system (possibly remotely). When the operating system decides that it has the resources to run another job, it creates a new process and runs the next job from the input queue in it. Process Creation
28 Process Creation The OS builds a data structure to manage the process Traditionally, the OS created all processes But it can be useful to let a running process create another This action is called process spawning Parent Process is the original, creating, process Child Process is the new process
29 Process creation In UNIX, there is only one system call to create a new process: fork. This call creates an exact clone of the calling process. After the fork, the two processes, the parent and the child, have the same memory image, the same environment strings, and the same open files. Usually, the child process then executes execve or a similar system call to change its memory image and run a new program. For example, when a user types a command, say sort, to the shell, the shell forks off a child process and the child executes sort. The reason for this two-step process is to allow the child to manipulate its file descriptors after the fork but before the execve in order to accomplish redirection of standard input, standard output, and standard error.
30 Process Creation In windows, in contrast, a single Win32 function call, CreateProcess, handles both process creation and loading the correct program into the new process. In addition to create process, Win 32 has about 100 other functions for managing and synchronizing processes. In both UNIX and Windows, after a process is created, the parent and child have their own distinct address spaces. If either process changes a word in its address space, the change is not visible to the other process. In UNIX, the childs initial address space is a copy of the parents, but there are definitely two distinct address spaces involved; no writable memory is shared (some UNIX implementations share the program text between the two since that cannot be modified). It is, however, possible for a newly created process to share some of its creators other resources, such as open files. In Windows, the parents and childs address spaces are different from the start.
31 Events which cause process termination Normal exit (voluntary) Most processes terminate because they have done their work Error exit (voluntary) The second reason for termination is that the process discovers a fatal error Fatal error (involuntary) The third reason for termination is an error caused by the process, often due to a program bug Killed by another process (involuntary) The fourth reason a process might terminate is that the process executes a system call telling the operating system to kill some other process. Process Termination
32 Process Termination There must be some way that a process can indicate completion. This indication may be: A HALT instruction generating an interrupt alert to the OS. A user action (e.g. log off, quitting an application) A fault or error Parent process terminating
33 Process termination Most processes terminate because they have done their work. When a compiler has compiled the program given to it, the compiler executes a system call to tell the operating system that it is finished. This call is exit in UNIX and ExitProcess in Windows. Screen-oriented programs also support voluntary termination. Word processors, Internet browsers and similar programs always have an icon or menu item that the user can click to tell the process to remove any temporary files it has open and then terminate.
34 Process termination The second reason for termination is that the process discovers a fatal error. For example, if a user types the command cc foo.c To compile the program foo.c and no such file exists, the compiler simply exits. Screen oriented interactive processes generally do not exit when given bad parameters. Instead they pop up a dialog box and ask the user to try again.
35 Process termination The third reason for termination is an error caused by the process, often due to a program bug. Examples include executing an illegal instruction, referencing non-existent memory, or dividing by zero. In some systems (e.g., UNIX), a process can tell the operating system that it wishes to handle certain errors itself, in which case the process is signaled (interrupted) instead when one of the errors occurs.
36 Process termination The fourth reason a process might terminate is that the process executes a system call telling the operating system to kill some other process. In UNIX this call is kill. The corresponding Win 32 function is TerminateProcess. In both cases the killer must have the necessary authorization to do in the killee.
37 Process Birth and Death CreationTermination New batch jobNormal Completion Interactive LoginMemory unavailable Created by OS to provide a service Protection error Spawned by existing process Operator or OS Intervention
38 Process States A process may be in one of the following three states : 1. Running (actually using the CPU at that instant) 2. Ready (Runnable; temporarily stopped to let another process run) 3. Blocked (unable to run until some external event happens) Logically the first two states are similar. In both cases the process is willing to run, only in the second one, there is temporarily no CPU available for it. The third state is different from the first two in that the process cannot run, even if the CPU has nothing else to do.
39 A process can be in running, blocked, or ready state. Transitions between these states are as shown. Process States
40 Process States Four transitions are possible among these three states Transition 1 occurs when the operating system discovers that a process cannot continue right now. In some systems the process can execute a system call, such as pause, to get into blocked state. In other systems, including UNIX, when a process reads from a pipe or special file (e.g. a terminal) and there is no input available, the process is automatically blocked.
41 Process states Transitions 2 and 3 are caused by the process scheduler, a part of the operating system, without the process even knowing about them. Transition 2 occurs when the scheduler decides that the running process has run long enough, and it is time to let another process have some CPU time. Transition 3 occurs when all the other processes have had their fair share and it is time for the first process to get the CPU to run again. Transition 4 occurs when the external event for which a process was waiting (such as the arrival of some input) happens. If no other process is running at that instant, transition 3 will be triggered and the process will start running. Otherwise it may have to wait in ready sate until the CPU is available and its turn comes.
42 Implementation of Processes To implement the process model, the operating system maintains a table (an array of structures), called the process table, with one entry per process, called process control blocks.) This entry contains important information about the process state, including its program counter, stack pointer, memory allocation, the status of its open files, its accounting and scheduling information, and everything else about the process that must be saved when the process is switched from running to ready or blocked state so that it can be restarted later as if it had never been stopped.
43 Some of the fields of a typical process table entry Implementation of Processes
44 Implementation of Processes Associated with each I/O class is a location (typically at a fixed location near the bottom of memory) called the interrupt vector. It contains the address of the interrupt service procedure. Suppose that user 3 is running when a disk interrupt happens. User process 3s program counter, program status word, and sometimes one or more registers are pushed onto the (current) stack by the interrupt hardware. The computer then jumps to the address specified in the interrupt vector. That is all the hardware does. From hereon it is up to the software, in particular, the interrupt service procedure.
45 Implementation of Processes All interrupts start by saving the registers, often in the process table entry for the current process. Then the information pushed onto the stack by the interrupt is removed and the stack pointer is set to point to a temporary stack used by the process handler. Actions such as saving the registers and setting the stack pointer cannot even be expressed in high-level languages such as C, so they are performed by a small assembly language routine, usually the same one for all interrupts since the work of saving the registers is identical, no matter what the cause of the interrupt is. When this routine is finished, it calls a C-procedure to do the rest of the work for this specific interrupt type. (we assume the operating system is written in C, the usual choice for all real operating systems.)
46 Implementation of Processes When it has done its job, possibly making some process now ready, the scheduler is called to see who to run next. After that control is passed back to the assembly language code to load up the registers and memory map for the now- current process and start it running. When the process finishes, the operating system displays a prompt character and waits for a new command. When it receives the command, it loads a new program into memory, overwriting the first one.
47 Skeleton of what the lowest level of the operating system does when an interrupt occurs. Implementation of Processes
48 Operating System Control Structures For the OS to manage processes and resources, it must have information about the current status of each process and resource. Tables are constructed for each entity the operating system manages
50 Memory Tables Memory tables are used to keep track of both main and secondary memory. Must include this information: Allocation of main memory to processes Allocation of secondary memory to processes Protection attributes for access to shared memory regions Information needed to manage virtual memory
51 I/O Tables Used by the OS to manage the I/O devices and channels of the computer. The OS needs to know Whether the I/O device is available or assigned The status of I/O operation The location in main memory being used as the source or destination of the I/O transfer
52 File Tables These tables provide information about: Existence of files Location on secondary memory Current Status other attributes. Sometimes this information is maintained by a file management system
53 Process Tables To manage processes the OS needs to know the details of the processes Current state Process ID Location in memory Etc Process control block Process image is the collection of program, Data, stack, and attributes
54 Switching Processes Several design issues are raised regarding process switching What events trigger a process switch? We must distinguish between mode switching and process switching. What must the OS do to the various data structures under its control to achieve a process switch?
55 When to switch processes MechanismCauseUse InterruptExternal to the execution of the current instruction Reaction to an asynchronous external event TrapAssociated with the execution of the current instruction Handling of an error or an exception condition Supervisor callExplicit requestCall to an operating system function Mechanisms for Interrupting the Execution of a Process A process switch may occur any time that the OS has gained control from the currently running process. Possible events giving OS control are:
56 Change of Process State The steps in a process switch are: 1. Save context of processor including program counter and other registers 2. Update the process control block of the process that is currently in the Running state 3. Move process control block to appropriate queue – ready; blocked
57 4. Select another process for execution 5. Update the process control block of the process selected 6. Update memory-management data structures 7. Restore context of the selected process Change of Process State
58 Threads Process is divided into threads that can run concurrently The unit of dispatching is referred to as a thread or lightweight process Thread Dispatchable unit of work executes sequentially and is interruptible Process is a collection of one or more threads
59 THREADS The main reasons for having threads is that in many applications, multiple activities are going on at once. Some of these may block from time to time. By decomposing such an application into multiple sequential threads that run in quasi parallel, the programming model becomes simpler. Only now with threads we add a new element: the ability for the parallel entities to share an address space and all of its data among themselves. This ability is essential for certain applications, which is why having multiple processes (with their separate address spaces) will not work.
60 THREADS A second argument for having threads is that since they are lighter weight than processes, they are easier (i.e., faster) to create and destroy than processes. In many systems, creating a thread goes times faster than creating a process. When the number of threads needed changes dynamically and rapidly, this property is useful to have.
61 THREADS A third reason for having threads is also a performance argument. Threads yield no performance gain when all of them are CPU bound, but when there is substantial computing I/O, having threads allows these activities to overlap, thus speeding up the application. Finally threads are useful on systems with multiple CPUs, where real parallelism is possible.
62 Multithreading The ability of an OS to support multiple, concurrent paths of execution within a single process.
63 Multithreading Java run-time environment is a single process with multiple threads Multiple processes and threads are found in Windows, Solaris, and many modern versions of UNIX
64 One or More Threads in Process Each thread has An execution state (running, ready, etc.) Saved thread context when not running An execution stack Some per-thread static storage for local variables Access to the memory and resources of its process (all threads of a process share this)
65 Thread One view … One way to view a thread is as an independent program counter operating within a process.
67 Benefits of Threads Takes less time to create a new thread than a process Less time to terminate a thread than a process Switching between two threads takes less time than switching processes Threads can communicate with each other without invoking the kernel
68 Thread use in a Single-User System Foreground and background work Asynchronous processing Speed of execution Modular program structure
69 Threads Several actions that affect all of the threads in a process The OS must manage these at the process level. Examples: Suspending a process involves suspending all threads of the process Termination of a process, terminates all threads within the process
70 Thread Execution States States associated with a change in thread state Spawn (another thread) Block Unblock Finish (thread) Deallocate register context and stacks
71 A word processor with three threads Thread Usage
72 Thread Usage A multithreaded Web server (a) Dispatcher thread(b) Worker thread
73 The classical Thread Model The thread has a program counter that keeps track of which instruction to execute next. It has registers, which hold its current working variables. It has a stack, which contains the execution history, with one frame for each procedure called but not yet returned from.
74 The classical Thread Model Processes are used to group resources together; Threads are the entities scheduled for execution on the CPU. What threads add to the process model is to allow multiple executions to take place in the same process environment, to a large degree independent of one another.
75 The classical Thread Model Having multiple threads running in parallel in one process is analogous to having multiple processes running in parallel in one computer. In the former case the threads share physical memory, disks, printers, and other resources.
76 The classical Thread Model Because threads have some of the properties of processes, they are sometimes called lightweight processes. The term multithreading is also used to describe the situation of allowing multiple threads in the same process.
77 (a) Three processes each with one thread (b) One process with three threads. The Classical Thread Model
78 The classical thread model Different threads in a process are not as independent as different processes. All threads have exactly the same address space, which means that they also share the same global variables. Since every thread can access every memory address within the process address space, one thread can read, write, or even wipe out another threads stack. In addition to sharing an address space, all the threads can share the same set of open files, child processes, alarms, and signals.
79 Each thread has its own stack. The Classical Thread Model
80 The classical thread model Like a traditional process (i.e., a process with only one thread), a thread can be in any one of several states: running, blocked, ready, or terminated. A running thread currently has the CPU and is active. A blocked thread is waiting for some event to unblock it.
81 The classical thread model For example when a thread performs a system call to read from the keyboard, it is blocked until input is typed. A thread can block waiting for some external event to happen or for some other thread to unblock it. A ready thread is scheduled to run and will as soon as its turn comes up. The transitions between thread states are the same as the transitions between process states.
82 The classical thread Model When multithreading is present, processes normally start with a single thread present. This thread has the ability to create new threads by calling a library procedure, for example, thread_create. A parameter to thread_create typically specifies the name of a procedure for the new thread to run. It is not necessary or even possible to specify anything about the new threads address space, since it automatically runs in the address space of the creating thread.
83 The classical thread Model Sometimes threads are hierarchical, with a parent-child relationship, but often no such relationship, exists, with all threads being equal returned a thread identifier that names the new thread When a thread has finished its work, it can exit by calling a library procedure, say, thread-exit. It then vanishes and is no longer schedulable. In some thread systems, one thread can wait for a (specific) thread to exit by calling a procedure, for example, thread_join. This procedure blocks the calling thread until a (specific) thread has exited.
84 The classical thread method In this regard, thread creation and termination is very much like process creation and termination, with approximately the same options as well. Another common thread call is thread_yield, which allows a thread to voluntarily give up the CPU to let another thread run. Such a call is important because there is no clock interrupt to actually enforce multiprogramming as there is with processes. Thus it is important for threads to be polite and voluntarily surrender the CPU from time to time to give other threads a chance to run.
85 IEEE has defined a standard for threads in IEEE standard c. The Thread package it defines is called Pthreads POSIX - Portable Operating System Interface for Unix Some of the Pthreads function calls. POSIX Threads
86 Categories of Thread Implementation User Level Thread (ULT) Kernel level Thread (KLT) also called: kernel-supported threads lightweight processes.
87 (a) A user-level threads package. (b) A threads package managed by the kernel. Implementing Threads in User Space
88 User-Level Threads All thread management is done by the application The kernel is not aware of the existence of threads
89 Kernel-Level Threads Kernel maintains context information for the process and the threads No thread management done by application Scheduling is done on a thread basis Windows is an example of this approach
90 Advantages of KLT The kernel can simultaneously schedule multiple threads from the same process on multiple processors. If one thread in a process is blocked, the kernel can schedule another thread of the same process. Kernel routines themselves can be multithreaded.
91 Disadvantage of KLT The transfer of control from one thread to another within the same process requires a mode switch to the kernel.
92 Combined Approaches Thread creation done in the user space Bulk of scheduling and synchronization of threads by the application Example is Solaris
93 Creation of a new thread when a message arrives. (a) Before the message arrives. (b) After the message arrives. Pop-Up Threads Arrival of a message causes the system to create a new thread to handle the message. Such a thread is called a Pop-up thread. The result of using pop-up threads is that the latency between message arrival and the start of processing can be made very short
95 Processes frequently need to communicate with other processes Three issues : How one process can pass information to other Ensure two or more processes do not get in each others way - Example : two processes in an airline reservation system each trying to grab the last seat on a plane for a different customer Need proper sequencing when dependencies are present - if process A produces data and process B prints them, B has to wait until A has produced some data before starting to print. Interprocess Communication (IPC)
96 Two processes want to access shared memory at the same time Race Conditions
97 The part of the program where the shared memory is accessed is called the critical region or critical section Conditions required to avoid race condition: Condition 1 : No two processes may be simultaneously inside their critical regions (mutual exclusion). Condition 2 : No assumptions may be made about speeds or the number of CPUs. Condition 3 : No process running outside its critical region may block other processes. Condition 4 : No process should have to wait forever to enter its critical region. Critical Regions
98 Mutual exclusion using critical regions Critical Regions Process A enters its critical region at Time T1. A little later, at Time T2 process B attempts to enter its critical region but fails because another process is already in its critical region and only one process is allowed to enter at a time. Consequently, B is temporarily suspended until time T3 when A leaves its critical region, allowing B to enter immediately. Eventually B leaves at T4
99 Proposals for achieving mutual exclusion: Disabling interrupts Lock variables Strict alternation Peterson's solution The TSL instruction Mutual Exclusion with Busy Waiting
100 Disabling interrupts – A hardware solution On a single-processor systems, the simplest solution is to have each process disable all interrupts just after entering its critical region and re-enable them just before leaving it. With interrupts disabled, no clock interrupts can occur. The CPU is only switched from process to process as a result of clock or other interrupts With interrupts turned off the CPU will not be switched to another process. Once a process has disabled interrupts, it can examine and update the shared memory without fear that any other process will intervene. Mutual Exclusion with Busy Waiting
101 Disabling interrupts - Drawbacks Generally unattractive because it is unwise to give user processes the power to turn off interrupts. Suppose one of them did it and never turned them on again that could be the end of the system If the system is a multi-processor (with two or possibly more CPUs or multi-core) disabling interrupts affects only the CPU that executed the disable instruction and does not prevent other CPUs from interfering with operations the first CPU is performing The other CPUs will continue running and can access the shared memory. Mutual Exclusion with Busy Waiting
102 Lock variables – a software solution Consider having a single shared (lock) variable, initially 0. When a process wants to enter its critical region, it first tests the lock. If the lock is 0, the process sets it to 1 and enters the critical region. If the lock is already 1, the process just waits until it becomes 0. Thus a 0 means that no process is in its critical region, and a 1 means that some process is in its critical region. Mutual Exclusion with Busy Waiting
103 Lock variables – Drawback Fatal flaw Suppose that one process reads the lock and sees that it is 0. Before it can set the lock to 1, another process is scheduled, runs and sets the lock to 1. When the first process runs again, it will also set the lock to 1, and two processes will be in their critical regions at the same time. Mutual Exclusion with Busy Waiting
104 Strict alternation Mutual Exclusion with Busy Waiting A proposed solution to the critical region problem (a) Process 0 (b) Process 1. In both cases, be sure to note the semicolons terminating the while statements.
105 Strict alternation In the figure, integer variable turn, initially 0, keeps track of whose turn it is to enter the critical region and examine or update the shared memory. Initially, process 0 inspects turn, finds it to be 0, and enters its critical region. Process 1 also finds it to be 0 and therefore sits in a tight loop continually testing turn to see when it becomes 1. Continuously testing a variable until some value appears is called busy waiting. It should usually be avoided, since it wastes CPU time. Only when there is a reasonable expectation that the wait will be short is busy waiting used. A lock that uses busy waiting is called a spin lock. Mutual Exclusion with Busy Waiting
106 Strict alternation When process 0 leaves the critical region, it sets turn to 1, to allow process 1 to enter its critical region. Suppose that process 1 finishes its critical region quickly, so that both processes are in their noncritical regions, with turn set to 0. Now process 0 executes its whole loop quickly, exiting its critical region and setting turn to 1. At this point turn is 1 and both processes are executing in their noncritical regions. Suddenly, process 0 finishes its noncritical region and goes back to the top of its loop. Unfortunately, it is not permitted to enter its critical region now, because turn is 1 and process 1 is busy with its noncritical region. Mutual Exclusion with Busy Waiting
107 Strict alternation It hangs in its while loop until process 1 sets turn to 0. Put differently, taking turns is not a good idea when one of the processes is much slower than the other. This situation violates condition 3 : process 0 is being blocked by a process not in its critical region. Mutual Exclusion with Busy Waiting
108 Strict alternation Going back to the spooler directory, if we now associate the critical region with reading and writing the spooler directory, process 0 would not be allowed to print another file because process 1 was doing something else. In fact, this solution requires that the processes strictly alternate in entering their critical regions, for example, in spooling files. Neither one would be permitted to spool two in a row. While this algorithm does avoid all races, it is not really candidate as a solution because it violates condition 3. Mutual Exclusion with Busy Waiting
109 Peterson's solution By combining the idea of taking turns with the idea of lock variables and warning variables, a Dutch Mathematician, T. Dekker, was the first one to device a software solution to the mutual exclusion problem that does not require strict alternation. In 1985, G.L. Peterson discovered a much simpler way to achieve mutual exclusion, thus rendering Dekkers solution obsolete. Petersons algorithm consists of two procedures written in ANSI C, which means that function prototypes should be supplied for all the functions defined and used. Mutual Exclusion with Busy Waiting
110 Petersons solution for achieving mutual exclusion Peterson's Solution
111 Peterson's solution Before using the shared variables (i.e.,before entering its critical region), each process calls enter_region with its own process number, 0 or 1, as parameter. This call will cause it to wait, if need be, until it is safe to enter. After it has finished with the shared variables, the process calls leave_region to indicate that it is done and to allow the other process to enter, if it so desires. Mutual Exclusion with Busy Waiting
112 Peterson's solution Let us see how this solution works. Initially neither process is in its critical region. Now process 0 calls enter_region. It indicates its interest by setting its array element and sets turn to 0. Since process 1 is not interested, enter_region returns immediately. If process 1 now makes a call to enter_region, it will hang there until interested goes to FALSE, an event that only happens when process 0 calls leave_region to exit the critical region. Mutual Exclusion with Busy Waiting
113 Peterson's solution Now consider the case that both processes call enter_region almost simultaneously. Both will store their process number in turn. Whichever store is done last is the one that counts; the first one is overwritten and lost. Suppose that process 1 stores last, so turn is 1. When both processes come to the while statement, process 0 executes it zero times and enters its critical region. Process 1 loops and does not enter its critical region until process 0 exits its critical region. Mutual Exclusion with Busy Waiting
114 The TSL instruction Now let us look at a proposal that requires a little help from the hardware. Some computers, especially those designed with multiple processors in mind, have an instruction like TSL REGISTER,LOCK (Test and Set lock) that works as follows. It reads the contents of the memory word lock into register RX and then stores a nonzero value at the memory address lock. Mutual Exclusion with Busy Waiting
115 The TSL instruction The operations of reading the word and storing into it are guaranteed to be indivisible- no other processor can access the memory word until the instruction is finished. The CPU executing the TSL instruction locks the memory bus to prohibit other CPUs from accessing memory until it is done. Mutual Exclusion with Busy Waiting
116 The TSL instruction It is important to note that locking the memory bus is very different from disabling interrupts. Disabling interrupts then performing a read on a memory word followed by a write does not prevent a second processor on the bus from accessing the word between the read and the write. In fact, disabling interrupts on processor 1 has no effect at all on processor 2. The only way to keep processor 2 out of the memory until processor 1 is finished is to lock the bus, which requires a special hardware facility (basically, a bus line asserting that the bus is locked and not available to processors other than the one that locked it). Mutual Exclusion with Busy Waiting
117 The TSL instruction To use the TSL instruction, we will use a shared variable, lock, to coordinate access to shared memory. When lock is 0, any process may set to 1 using the TSL instruction and then read or write the shared memory. When it is done, the process sets lock back to 0 using an ordinary move instruction. Mutual Exclusion with Busy Waiting
118 The TSL instruction How this instruction can be used to prevent two processes from simultaneously entering their critical regions? solution - a four instruction subroutine in a fictitious (but typical) assembly language The first instruction copies the old value of lock to the register and then sets lock to 1. Then the old value is compared with 0. If it is nonzero, the lock was already set, so the program just goes back to the beginning and tests it again. Mutual Exclusion with Busy Waiting
119 Entering and leaving a critical region using the TSL instruction. The TSL Instruction
120 The TSL instruction Sooner or later it will become 0 (when the process currently in its critical region is done with its critical region), and the subroutine returns, with the lock set. Clearing the lock is very simple. The program just stores a 0 in lock. No special synchronization instructions are needed. Mutual Exclusion with Busy Waiting
121 The TSL instruction One solution to the critical region problem is now straightforward. Before entering its critical region, a process calls enter_region, which does busy waiting until the lock is free; then it acquires the lock and returns. After the critical region, the process calls leave_region, which stores a 0 in lock. As with all solution based on critical regions, the processes must call enter_region and leave_region at the correct times for the method to work. If a process cheats, the mutual exclusion will fail. Mutual Exclusion with Busy Waiting
122 XCHG instruction An alternative instruction to TSL is XCHG, which exchanges the contents of two locations atomically, for example, a register and a memory word. All intel x86 CPUs use XCHG instruction for low-level synchronization. Mutual Exclusion with Busy Waiting
123 Entering and leaving a critical region using the XCHG instruction. The TSL Instruction
124 Also known as Bounded-Buffer problem Two processes share a common, fixed-size buffer. One of them, the producer, puts information into the buffer, and the other one, the consumer, takes it out. (It is also possible to generalize the problem to have m producers and n consumers, but we will only consider the case of one producer and one consumer because this assumption simplifies the solutions.) Trouble arises when the producer wants to put a new item in the buffer, but it is already full. PRODUCER CONSUMER PROBLEM
125 The producer-consumer problem with a fatal race condition. The Producer-Consumer Problem...
126 The solution is for the producer to go to sleep, to be awakened when the consumer has removed one or more items. Similarly, if the consumer wants to remove an item from the buffer and sees that the buffer is empty, it goes to sleep until the producer puts something in the buffer and wakes it up. This approach sounds simple enough, but it leads to the same kinds of race conditions we saw earlier with the spooler directory. To keep track of the number of items in the buffer, we will need a variable, count. PRODUCER CONSUMER PROBLEM
127 If the maximum number of items the buffer can hold is N, the producers code will first test to see if count is N. If it is, the producer will go sleep; if it is not, the producer will add an item and increment count. The consumers code is similar: first test count to see if it is 0. If it is, go to sleep; if it is nonzero, remove an item and decrement the counter. Each of the processes also tests to see if the other should be awakened, and if so, wakes it up. PRODUCER CONSUMER PROBLEM
128 To express system calls such as sleep and wake up in C, we will show them as calls to library routines. They are not part of the standard C library but presumably would be made available on any system that actually had these system calls. The procedures insert_item and remove_item handle the bookkeeping of putting items into the buffer and taking items out of the buffer. PRODUCER CONSUMER PROBLEM
129 Race condition It can occur because access to count is unconstrained. The following situation could possibly occur. The buffer is empty and the consumer has just read count to see if it is 0. At this instant, the scheduler decides to stop running the consumer temporarily and start running the producer. The producer inserts an item in the buffer, increments count, and notices that it is now 1. Reasoning that count was just 0, and thus the consumer must be sleeping, the producer calls wakeup to wake the consumer up. PRODUCER CONSUMER PROBLEM
130 Unfortunately, the consumer is not yet logically asleep, so the wakeup signal is lost. When the consumer next runs, it will test the value of count it previously read, find it to be 0, and go to sleep. Sooner or later the producer will fill up the buffer and also go to sleep. Both will sleep forever. PRODUCER CONSUMER PROBLEM
131 The essence of the problem here is that a wakeup sent to a process that is not (yet ) sleeping is lost. If it were not lost, everything would work. A quick fix is to modify the rules to add a wakeup waiting bit to the picture. When a wakeup is sent to process that is still awake, this bit is set. Later, when the process tries to go to sleep, if the wakeup waiting bit is on, it will be turned off, but the process will stay awake. The wakeup waiting bit is a piggy bank for storing wakeup signals. PRODUCER CONSUMER PROBLEM
132 While the wakeup waiting bit saves the day in this simple example, it is easy to construct examples with three or more processes in which one wakeup waiting bit is insufficient. We could make another patch and add a second wakeup waiting bit, or may be 8 or 32 of them, but in principle the problem is still there. PRODUCER CONSUMER PROBLEM
133 This was the situation in 1965, when E.W. Dijkstra (1965) suggested using an integer variable to count the number of wakeups saved for future use. In his proposal, a new variable type, which he called a semaphore, was introduced. A semaphore could have the value 0, indicating that no wakeups were saved, or some positive value if one or more wakeups were pending SEMAPHORES
134 Dijkstra proposed having two operations, down and up (generalizations of sleep and wakeup, respectively) The down operation on a semaphore checks to see if the value is greater than 0. If so, it decrements the value (i.e., uses up one stored wakeup) and just continues. If the value is 0, the process is put to sleep without completing the down for the moment. Checking the value, changing it and possibly going to sleep, are all done as a single, indivisible atomic action. SEMAPHORES
135 It is guaranteed that once a semaphore operation has started, no other process can access the semaphore until the operation has completed or blocked. This atomicity is absolutely essential to solving synchronization problems avoiding race conditions. Atomic actions, in which a group of related operations are either all performed without interruption or not performed at all, are extremely important in many other areas of computer science as well. SEMAPHORES
136 The up operation increments the value of the semaphore addressed. If one or more processes were sleeping on that semaphore, unable to complete an earlier down operation, one of them is chosen by the system (e.g., at random) and is allowed to complete its down. Thus after an up on a semaphore with processes sleeping on it, the semaphore will still be 0, but there will be one fewer process sleeping on it. SEMAPHORES
137 No process ever blocks doing an up, just as no process ever blocks doing a wakeup in the earlier model. In Dijkstras original paper, he used the names P and V instead of down and up, respectively. Since these have no mnemonic significance to people who do not speak Dutch and only marginal significance to those who do Proberen (try) and Verhogen (raise, make higher) There were first introduced in the Algol 68 programming language. Semaphores
138 Solving the Producer-Consumer Problem Using Semaphores Semaphores solve the lost-wakeup problem To make them work correctly, it is essential that they be implemented in an indivisible way. The normal way is to implement up and down as system calls, with the operating system briefly disabling all interrupts while it is testing the semaphore, updating it, and putting the process to sleep, if necessary. As all of these actions take only a few instructions, no harm is done in disabling interrupts. Semaphores
139 If multiple CPUs are being used, each semaphore should be protected by a lock variable, with the TSL or XCHG instructions used to make sure that only one CPU at a time examines the semaphore. Be sure you understand that using TSL or XCHG to prevent several CPUs from accessing the semaphore at the same time is quite different from the producer or consumer busy waiting for the other to empty or fill the buffer. The semaphore operation will only take a few microseconds, whereas the producer or consumer might take arbitrarily long. Semaphores
140 This solution uses three semaphores: one called full for counting the number of slots that are full, one called empty for counting the number of slots that are empty, one called mutex to make sure the producer and consumer do not access the buffer at the same time. Full is initially 0, empty is initially equal to the number of slots in the buffer, and mutex is initially 1. Semaphores
141 The producer-consumer problem using semaphores Semaphores
142 Semaphores that are initialized to 1 and used by two or more processes to ensure that only one of them can enter its critical region at the same time are called binary semaphores. If each process does a down just before entering its critical region and an up just after leaving it, mutual exclusion is guaranteed. Semaphores
143 In a system using a semaphores, the natural way to hide interrupts is to have a semaphore, initially set to 0, associated with each I/O device. Just after starting an I/O device, the managing process does a down on the associated semaphore, thus blocking immediately. When the interrupt comes in, the interrupt handler then does an up on the associated semaphore, which makes the relevant process ready to run again. Semaphores
144 One use of semaphore is the mutex semaphore which is used for mutual exclusion. It is designed to guarantee that only one process at a time will be reading or writing the buffer and the associated variables. This mutual exclusion is required to prevent chaos. Other use of semaphores is for synchronization. The full and empty semaphores are needed to guarantee that certain event sequences do or do not occur In this case, they ensure that the producer stops running when the buffer is full, and that the consumer stops running when it is empty. This use is different from mutual exclusion. Use of Semaphores
145 When the semaphores ability to count is not needed, a simplified version of the semaphore, called a mutex, is sometimes used. Mutexes are good only for managing mutual exclusion to some shared resource or piece of code. They are easy and efficient to implement, which makes them especially useful in thread packages that are implemented entirely in user space. MUTEXES
146 A mutex is a variable that can be in one of two states: unlocked or locked Consequently, only 1 bit is required to represent it, but in practice an integer often is used, with 0 meaning unlocked and all other values meaning locked. Two procedures are used with mutexes. When a thread (or process) needs access to a critical region, it calls mutex_lock. If the mutex is currently unlocked (meaning that the critical region is available), the call succeeds and the calling thread is free to enter the critical region. MUTEXES
147 On the other hand, if the mutex is already locked, the calling thread is blocked until the thread in the critical region is finished and calls mutex_unlock. If multiple threads are blocked on the mutex, one of them is chosen at random and allowed to acquire the lock. Because mutexes are so simple, they can easily be implemented in user space provided that a TSL or XCHG instruction is available. MUTEXES
148 Implementation of mutex lock and mutex unlock Mutexes The solution with XCHG is essentially the same
149 The code of mutex_lock is similar to the code of enter_region but with a crucial difference. When enter_region fails to enter the critical region, it keeps testing the lock repeatedly (busy waiting). Eventually, the clock runs out and some other process is scheduled to run. Sooner or later the process holding the lock gets to run and release the lock. MUTEXES
150 With user threads, the situation is different because there is no clock that stops threads that have too long. Consequently, a thread that tries to acquire a lock by busy waiting will loop forever and never acquire the lock because it never allows any other thread to run and release the lock. That is where the difference between enter_region and mutex_lock comes in. When the later fails to acquire a lock, it calls thread_yield to giveup the CPU to another thread. Consequently there is no busy waiting. MUTEXES
151 When the thread runs the next time, it tests the lock again. Since thread_yield is just a call to the thread scheduler in user space, it is very fast. As a consequence, neither mutex_lock nor mutex-unlock requires any kernel calls. Using them, user_level threads can synchronize entirely in user space using procedures that require only a handful of instructions. MUTEXES
152 Two processes that share a common address space still have different open files, alarm timers, and other per-process properties, whereas the threads within a single process share them. And it is always true that multiple processes sharing a common address space never have the efficiency of use-level threads since the kernel is deeply involved in their management. MUTEXES
153 Mutexes in Pthreads Pthreads provides a number of functions that can be used to synchronize threads. The basic mechanism uses a mutex variable, which can be locked or unlocked, to guard each critical region. A thread wishing to enter a critical region first tries to lock the associated mutex. If the mutex is unlocked, the thread can enter immediately and the lock is atomically set, preventing other threads from entering. If the mutex is already locked, the calling thread is blocked until it is unlocked. MUTEXES
154 If multiple threads are waiting on the same mutex, when it is unlocked, only one of them is allowed to continue and relock it. These locks are not mandatory. It is up to the programmer to make sure threads use them correctly. The calls for performing these operations are pthread_mutex_init and pthread_mutex-destroy, respectively. They can also be locked by pthread_mutex_lock which tries to acquire the lock and blocks if is already locked. MUTEXES
155 Some of the Pthreads calls relating to mutexes. Mutexes in Pthreads
156 There is also an option for trying to lock a mutex and failing with an error code instead of blocking if it is already blocked. This call is pthread_mutex_trylock. Finally, Pthread_mutex_unlock unlocks a mutex and releases exactly one thread if one or more are waiting on it. Mutexes can also have attributes, but these are used only for specialized purposes. MUTEXES
157 MONITORS A higher-level synchronization primitive proposed by Brinch Hansen and Hoare(1974) A monitor is a collection of procedures, variables, and data structures that are all grouped together in a special kind of module or package. Processes may call the procedures in a monitor whenever they want to, but they cannot directly access the monitors internal data structures from procedures declared outside the monitor. Monitors have an important property that makes them useful for achieving mutual exclusion: only one process can be active in a monitor at any instant.
158 MONITORS Monitors are a programming language construct, so the compiler knows they are special and can handle calls to monitor procedures differently from other procedure calls. Typically, when a process calls a monitor procedure, the first few instructions of the procedure will check to see if any other process is currently active within the monitor. If so the calling process will be suspended until the other process has left the monitor. If no other process is using the monitor the calling process may enter.
160 MONITORS It is up to the compiler to implement mutual exclusion on monitor entries, but a common way is to use a mutex or a binary semaphore. Because the compiler, not the programmer, is arranging for the mutual exclusion, it is much less likely that something will go wrong. In any event the person writing the monitor does not have to be aware of how the compiler arranges for mutual exclusion. It is sufficient to know that by turning all the critical regions into monitor procedures, no two processes will ever execute their critical regions at the same time.
161 MONITORS Although monitors provide an easy way to achieve mutual exclusion, there is a need to have a way for processes to block when they cannot proceed. In the producer-consumer problem, it is easy enough to put all the tests for buffer-full and buffer-empty in monitor procedures, but how should the producer block when it finds the buffer full?
162 The solution lies in the introduction of condition variables, along with two operations on them, wait and signal. When a monitor procedure discovers that it cannot continue (e.g., the producer finds the buffer full), it does a wait on some condition variable, say, full. This action causes the calling process to block. It also allows another process that had been previously prohibited from entering the monitor to enter now. MONITORS
163 An outline of the producer-consumer problem with monitors Monitors
164 For example the consumer can wake up its sleeping partner by doing a signal on the condition variable that its partner is waiting on. To avoid having two active processes in the monitor at the same time, we need a rule telling what happens after a signal. Hoare proposed letting the newly awakened process run, suspending the other one. Brinch Hansen proposed finessing the problem by requiring that a process doing a signal must exit the monitor immediately. In other words, a signal statement may appear only as the final statement in a monitor procedure. MONITORS
165 The automatic mutual exclusion on monitor procedure discovers that the buffer is full, it will be able to complete the wait operation without having to worry about the possibility that the scheduler may switch to the consumer just before the wait completes. The consumer will not even be let into the monitor at all until the wait is finished and the producer has been marked as no longer runnable. MONITORS
166 MESSAGE PASSING Message passing method of inter-process communication uses two primitives, send and receive, which, like semaphores and unlike monitors, are system calls rather than language constructs. As such they can easily be put into library procedures, such as Send (destination, & message); - this call sends a message to a given destination Receive (source, &message); - this call receives a message from a given source (or from ANY, if the receiver does not care). - If no message is available, the receiver can block until one arrives. Alternatively, it can return immediately with an error code.
167 MESSAGE PASSING Design Issues fro Message-Passing systems Message passing systems have many challenging problems and design issues that do not arise with semaphores or with monitors, especially if the communicating processes are on different machines connected by a network. For example, messages can be lost by the network. To guard against lost messages, sender and receiver can agree that as soon as a message has been received, the receiver will send back a special acknowledgement message.
168 MESSAGE PASSING If the sender has not received the acknowledgement within a certain time interval, it retransmits the message. Now consider what happens if the message is received correctly, but the acknowledgement back to the sender is lost. The sender will retransmit the message, so the receiver will get it twice. It is essential that the receiver be able to distinguish a new message from the retransmission of an old one. Usually this problem is solved by putting consecutive sequence numbers in each original message.
169 MESSAGE PASSING If the receiver gets a message which is a duplicate that can be ignored. Successfully communicating in the face of unreliable message passing is a major part of the study of computer networks. Message systems also have to deal with the question of how processes are named, so that the process specified in a send or receive call is unambiguous. Authentication is also an issue in message systems: How can client tell that it is communicating with the real file server, and not with an imposter?
170 MESSAGE PASSING Producer-Consumer Problem with Message passing Now let us see how the producer-consumer problem can be solved with message passing and no shared memory. We assume that all messages are the same size and that messages sent but not yet received are buffered automatically by the operating system. In this solution, a total of N messages is used, analogous to the N slots in a shared-memory buffer.
171 The producer-consumer problem with N messages. Producer-Consumer Problem with Message Passing
172 The producer-consumer problem with N messages. Producer-Consumer Problem with Message Passing
173 MESSAGE PASSING The consumer starts out by sending N empty messages to the producer, it takes an empty message and sends back a full one. In this way, the total number of messages in the system remains constant in time, so they can be stored in a given amount of memory known in advance. If the producer works faster than the consumer, all the messages will end up full, waiting for the consumer; producer will be blocked, waiting for an empty to come back. If the consumer works faster then the reverse happens: all the messages will be empties waiting for the producer to fill them up; the consumer will be blocked, waiting for a full message.
174 MESSAGE PASSING One way is to assign each process a unique address and have messages be addressed to processes. A different way is to invent a new data structure, called a mailbox. A mailbox is a place to buffer a certain number of messages, typically specified when the mailbox is created. When mailboxes are used, the address parameters in the send and receive calls are mailboxes, not processes.
175 MESSAGE PASSING When a process tries to send to a mailbox that is full, it is suspended until a message is removed from that mailbox, making room for a new one. For the producer-consumer problem, both the producer and consumer would create mailboxes large enough to hold N messages. The producer would send messages containing actual data to the consumers mailbox, and the consumer would send empty messages to the producers mailbox
176 MESSAGE PASSING Message passing is commonly used in parallel programming systems. One well-known message passing system for example, is MPI (message-passing Interface). It is widely used for scientific computing.
177 BARRIERS Our last synchronization mechanism is intended for groups of processes rather than two-process producer-consumer type situations. Some applications are divided into phases and have the rule that no process may proceed into the next phase until all processes are ready to proceed to the next phase. This behaviour may be achieved by placing a barrier at the end of each phase. When a process reaches a barrier, it is blocked until all processes have reached the barrier.
178 Use of a barrier : (a) Processes approaching a barrier. (b) All processes but one blocked at the barrier. (c) When the last process arrives at the barrier, all of them are let through. Barriers
179 BARRIERS In figure, we see four processes approaching a barrier. What this means is that they are just computing and have not reached the end of the current phase yet. After a while, the first process finishes all the computing required of it during the first phase. It then executes the barrier primitively, generally by calling a library procedure. The process is then suspended. A little later, a second and then a third process finish the first phase and also execute the barrier primitive. Finally, when the last process, C, hits the barrier, all the processes are released
Reading Assignments Chapter 4 & 5 Modern Operating Systems - Tanenbaum, 3 rd Edition, PHI, Pearson Education 180