I am asked to describe the steps involved in a context switch (1) between two different processes and (2) between two different threads in the same process.
It's much easier to explain those in reverse order because a process-switch always involves a thread-switch.
A typical thread context switch on a single-core CPU happens like this:
All context switches are initiated by an 'interrupt'. This could be an actual hardware interrupt that runs a driver, (eg. from a network card, keyboard, memory-management or timer hardware), or a software call, (system call), that performs a hardware-interrupt-like call sequence to enter the OS. In the case of a driver interrupt, the OS provides an entry point that the driver can call instead of performing the 'normal' direct interrupt-return & so allows a driver to exit via the OS scheduler if it needs the OS to set a thread ready, (eg. it has signaled a semaphore).
Non-trivial systems will have to initiate a hardware-protection-level change to enter a kernel-state so that the kernel code/data etc. can be accessed.
Core state for the interrupted thread has to be saved. On a simple embedded system, this might just be pushing all registers onto the thread stack and saving the stack pointer in its Thread Control Block (TCB).
Many systems switch to an OS-dedicated stack at this stage so that the bulk of OS-internal stack requirements are not inflicted on the stack of every thread.
It may be necessary to mark the thread stack position where the change to interrupt-state occurred to allow for nested interrupts.
The driver/system call runs and may change the set of ready threads by adding/removing TCBs from internal queues for the different thread priorities, e.g. a network card driver may have set an event or signaled a semaphore that another thread was waiting on, so that thread will be added to the ready set, or a running thread may have called sleep() and so elected to remove itself from the ready set.
The OS scheduler algorithm is run to decide which thread to run next, typically the highest-priority ready thread that is at the front of the queue for that priority. If the next-to-run thread belongs to a different process to the previously-run thread, some extra stuff is needed here, (see later).
The saved stack pointer from the TCB for that thread is retrieved and loaded into the hardware stack pointer.
The core state for the selected thread is restored. On my simple system, the registers would be popped from the stack of the selected thread. More complex systems will have to handle a return to user-level protection.
An interrupt-return is performed, so transferring execution to the selected thread.
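The save/restore part of the steps above can be sketched as a toy Python model. This is only an illustration of the 'simple embedded system' case, not real kernel code (a real kernel does this in assembly/C on actual registers). The names `TCB`, `CPU`, `make_thread`, `switch_to` and the four-register file are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical, tiny register file; a real CPU has many more registers.
REGISTER_NAMES = ("r0", "r1", "r2", "pc")


@dataclass
class TCB:
    """Thread Control Block: in this toy model it only keeps the saved stack pointer."""
    name: str
    stack: list = field(default_factory=list)  # the thread's own stack
    saved_sp: int = 0                          # stack pointer saved at switch time


@dataclass
class CPU:
    regs: dict = field(default_factory=lambda: dict.fromkeys(REGISTER_NAMES, 0))
    current: Optional[TCB] = None


def make_thread(name, entry_pc):
    """Create a thread whose stack already holds an initial register frame."""
    tcb = TCB(name)
    frame = dict.fromkeys(REGISTER_NAMES, 0)
    frame["pc"] = entry_pc
    for reg in REGISTER_NAMES:
        tcb.stack.append(frame[reg])
    tcb.saved_sp = len(tcb.stack)
    return tcb


def switch_to(cpu, nxt):
    """Save the running thread's core state, then restore the state of `nxt`."""
    prev = cpu.current
    if prev is not None:
        for reg in REGISTER_NAMES:             # push all registers onto the thread stack
            prev.stack.append(cpu.regs[reg])
        prev.saved_sp = len(prev.stack)        # save the stack pointer in the TCB
    sp = nxt.saved_sp                          # load the saved stack pointer
    for reg in reversed(REGISTER_NAMES):       # pop registers, last pushed first
        sp -= 1
        cpu.regs[reg] = nxt.stack[sp]
    del nxt.stack[sp:]                         # the frame is no longer on the stack
    cpu.current = nxt
```

With two threads made by `make_thread("A", 100)` and `make_thread("B", 200)`, switching A, then B, then A again brings back whatever register values A had when it was switched out, which is the whole point of the exercise.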
In the case of a multicore CPU, things are more complex. The scheduler may decide that a thread that is currently running on another core may need to be stopped and replaced by a thread that has just become ready. It can do this by using its interprocessor driver to hardware-interrupt the core running the thread that has to be stopped. The complexities of this operation, on top of all the other stuff, are a good reason to avoid writing OS kernels :)
A typical process context switch happens like this:
Process context switches are initiated by a thread context switch, so all of the steps above need to happen.
When the scheduler runs (see above), it decides to run a thread belonging to a different process from the one that owned the previously-running thread.
The memory-management hardware has to be loaded with the address-space for the new process, ie whatever selectors/segments/flags/whatever that allow the thread/s of the new process to access its memory.
The context of any FPU hardware needs to be saved/restored from the PCB.
There may be other process-dedicated hardware that needs to be saved/restored.
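As a rough sketch of the 'extra stuff', here is a toy Python model where the address space is only reloaded when the incoming thread belongs to a different process. The classes `Process`, `Thread`, `MMU` and the function `switch_thread` are hypothetical names for this example, and a Python dict stands in for whatever page tables/segments the real hardware uses.

```python
from dataclasses import dataclass


@dataclass
class Process:
    pid: int
    page_table: dict  # virtual page number -> physical frame number (toy model)


@dataclass
class Thread:
    tid: int
    process: Process


class MMU:
    """Stand-in for the memory-management hardware."""

    def __init__(self):
        self.page_table = None
        self.reloads = 0  # how many times the address space was switched

    def load_address_space(self, page_table):
        self.page_table = page_table
        self.reloads += 1

    def translate(self, vpn):
        return self.page_table[vpn]


def switch_thread(mmu, prev, nxt):
    """Make `nxt` the running thread; touch the MMU only on a process change."""
    if prev is None or prev.process is not nxt.process:
        mmu.load_address_space(nxt.process.page_table)
    return nxt
```

Switching between two threads of the same process leaves `mmu.reloads` unchanged; only a switch to a thread of another process increments it, and from then on the same virtual page number translates to that process's frame.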
On any real system, the mechanisms are architecture-dependent and the above is a rough and incomplete guide to the implications of either context switch. There are other overheads generated by a process-switch that are not strictly part of the switch - there may be extra cache-flushes and page-faults after a process-switch since some of its memory may have been paged out in favour of pages belonging to the process owning the thread that was running before.
1. Save the context of the process that is currently running on the CPU. Update the process control block and other important fields.
2. Move the process control block of the above process into the relevant queue such as the ready queue, I/O queue etc.
3. Select a new process for execution.
4. Update the process control block of the selected process. This includes updating the process state to running.
5. Update the memory management data structures as required.
6. Restore the context of the process that was previously running when it is loaded again on the processor. This is done by loading the previous values of the process control block and registers.
(Source: Context switch)
I hope that I can provide a more detailed/clear picture.
First of all, the OS schedules threads, not processes, because threads are the only executable units in the system. A process switch is just a thread switch where the threads belong to different processes, and therefore the procedure is basically the same.
The scheduler is invoked. There are three basic scenarios in which this may happen:
In all cases, to be able to perform a context switch, control should be passed to the kernel. In the case of involuntary switches, this is performed by an interrupt. In the case of voluntary (and semi-voluntary) context switches, control is passed to the kernel via a system call.
In both cases, kernel entry is CPU-assisted. The processor performs a permissions check, saves the instruction pointer (so that execution can be continued from the right instruction later), switches from user mode to kernel mode, activates the kernel stack (specific to the current thread) and jumps to a predefined and well-known point in the kernel code.
The first action performed by the kernel is saving the content of CPU registers, which it needs to use for its own purposes. Usually the kernel uses only general purpose CPU registers and saves them by pushing them onto the stack.
The kernel then handles a primary request if needed. It may handle an interrupt, prepare a file read request, reload a timer etc.
At some point during request handling, the kernel performs an action that affects the state of either the current thread (for example, it decides that there is currently nothing to be done in this thread because it is waiting for something) or that of another thread or threads (for example, a thread becomes ready to run because an event it was waiting for has occurred, such as a mutex being released).
The kernel invokes the scheduler. The scheduler has to make two decisions.
Once both decisions have been made, the scheduler performs the context switch using the TCB of the current thread as well as that of the thread that is to be run next.
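Choosing the thread to run next depends entirely on the scheduling policy. As a minimal sketch, assuming one common policy (pick the highest-priority ready thread, first-in-first-out within a priority level), the selection could look like the toy Python below. `ReadyQueues`, `make_ready` and `pick_next` are hypothetical names, not any real kernel's API.

```python
from collections import deque


class ReadyQueues:
    """One FIFO queue of ready threads per priority level (0 = highest)."""

    def __init__(self, levels):
        self.queues = [deque() for _ in range(levels)]

    def make_ready(self, thread, priority):
        # A thread that becomes ready goes to the back of its priority queue.
        self.queues[priority].append(thread)

    def pick_next(self):
        # Scan from the highest priority down; take the thread at the front.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing is ready; a real kernel would run an idle thread
```

So if `"low"` is made ready at priority 2 and then `"hi-a"` and `"hi-b"` at priority 0, successive `pick_next()` calls return `"hi-a"`, `"hi-b"`, `"low"` and finally `None`.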
A context switch itself consists of three main steps.
At this point the kernel checks if the scheduled and unscheduled threads belong to the same process. If not ("process" rather than "thread" switch), the kernel resets the current address space by pointing the MMU (Memory Management Unit) to the page table of the scheduled process. The TLB (Translation Lookaside Buffer), which is a cache containing recent virtual to physical address translations, is also flushed to prevent erroneous address translation. Note that this is the only step in the entire set of context switch actions that cares about processes!
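To see why a stale TLB would give erroneous translations, here is a toy Python model in which the TLB is just a dict caching page-table lookups. `ToyMMU`, `translate`, `switch_address_space` and the `flush_tlb` flag are invented for this sketch; real hardware does not look like this, and the flag exists only so the wrong behaviour can be demonstrated.

```python
class ToyMMU:
    """Page table plus a TLB that caches recent translations."""

    def __init__(self):
        self.page_table = {}  # virtual page number -> physical frame number
        self.tlb = {}         # cached subset of the current page table
        self.walks = 0        # number of (slow) page-table lookups

    def translate(self, vpn):
        if vpn not in self.tlb:
            self.walks += 1                     # TLB miss: consult the page table
            self.tlb[vpn] = self.page_table[vpn]
        return self.tlb[vpn]                    # TLB hit: cached answer is trusted

    def switch_address_space(self, page_table, flush_tlb=True):
        self.page_table = page_table            # point the MMU at the new page table
        if flush_tlb:
            self.tlb.clear()                    # drop translations of the old process
```

If process A maps page 0 to frame 10 and process B maps page 0 to frame 20, switching to B with `flush_tlb=False` still translates page 0 to frame 10, i.e. B would touch A's memory. With the flush, the first access misses, the page table is consulted again and the result is frame 20.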
The kernel prepares Thread Local Storage for the scheduled thread. For example, it maps the respective memory pages to the specified addresses. As another example, on the IA-32 platform a common approach is to load a new segment which points to the TLS data of the incoming thread.
The kernel loads the current thread's kernel stack address into the CPU. After this, every kernel invocation will use this kernel stack instead of the kernel stack of the unscheduled thread.
Another step which may be performed by the kernel is reprogramming the system timer. When the timer fires, control is returned to the kernel. The time period between the context switch and the timer firing is called a time quantum and indicates how much execution time the current thread is given at that time. This is known as pre-emptive scheduling.
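The effect of a time quantum can be illustrated with a small Python simulation. It assumes a plain round-robin policy and measures time in abstract 'ticks'; both are assumptions for the example, as is the function name `run_round_robin`.

```python
from collections import deque


def run_round_robin(work, quantum):
    """Simulate pre-emption by a timer.

    work:    dict mapping thread name -> ticks of CPU time it still needs
    quantum: ticks a thread may run before the (simulated) timer fires
    Returns the list of (thread name, ticks run) slices in execution order.
    """
    ready = deque(work.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(quantum, remaining)  # runs until it finishes or the timer fires
        timeline.append((name, ran))
        if remaining > ran:
            # Quantum expired: the thread is pre-empted and re-queued at the back.
            ready.append((name, remaining - ran))
    return timeline
```

For example, `run_round_robin({"A": 5, "B": 2}, quantum=2)` returns `[("A", 2), ("B", 2), ("A", 2), ("A", 1)]`: A is pre-empted twice, B finishes within its first quantum.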
Kernels usually collect statistics during context switches to improve scheduling as well as to show system administrators and users what is going on in the system. These statistics may include such information as how much CPU time the thread has consumed, how many times it has been scheduled, how many times its time quantum has expired, how frequently context switches are occurring in the system etc.
The context switch can be considered complete at this point, and the kernel continues previously interrupted system actions. For example, if the thread had tried to acquire a mutex during a system call, and the mutex is now free, the kernel may finish the interrupted operation.
At some point the thread finishes its system activities and wants to return to user mode to execute non-system code. The kernel pops the contents of the general-purpose registers, which were saved upon kernel entry, from the kernel stack and makes the CPU execute a special instruction to return to user mode.
The CPU retrieves the values of the instruction pointer and stack pointer, which were saved when kernel mode was entered, and restores them. At this point the thread's user mode stack is also activated and kernel mode is exited (this prohibits the use of special system instructions).
Finally, the CPU continues execution from the point where the thread was when it was unscheduled. If this happened during a system call, the thread will proceed from the point where the system call was invoked, by capturing and handling its result. In the case of pre-emption by interrupt, the thread will continue its execution as if nothing had happened.
Some summary notes:
The kernel only schedules and executes threads, not processes - context switches take place between threads.
The procedure of switching to the context of a thread from another process is essentially the same as a context switch between threads belonging to the same process. Only one additional step is required: changing page tables (and flushing the TLB).
Thread context is stored either on the kernel stack or in the TCB (not the PCB!).
Context switching is an expensive operation - it has a significant direct cost in performance, and the indirect cost caused by cache pollution (and TLB flush if the switch occurred between processes) is even greater.