“Switch to another task” sounds light, as if the CPU simply moves from one piece of code to another.
The real operation is more concrete. While the CPU is running an execution flow, registers contain intermediate state, the stack contains the call chain, the program counter points to the next instruction, and the scheduler knows whether the flow is running, ready, or blocked. To run another thread or task, the system must save the current state and restore another one.
The safest first model is this: a context switch does not switch functions. It switches the minimal state required for the CPU to continue an execution flow. Thread or task switches mainly switch registers, stack, and scheduling state. Process switches may also switch address spaces. Interrupt entry and return are a more limited form of saving and restoring execution state.
Task A is running
-> save A's CPU state
-> choose Task B
-> restore B's CPU state
-> CPU continues from where B stopped
This explains why switching is not free, and why more tasks, more locks, and finer waiting paths do not automatically make a system faster.
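The save, choose, restore sequence above can be modeled in a few lines. This is only a cooperative sketch, not how a kernel does it: Python generators stand in for tasks, `yield` stands in for the switch point, and the names `task` and `run_round_robin` are made up for illustration.

```python
def task(name, steps, log):
    # The loop variable `i` plays the role of register/stack state: it
    # survives each switch because the generator keeps its frame.
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # switch point: locals and resume position are preserved


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = list(tasks)
    while ready:
        current = ready.pop(0)       # choose the next task
        try:
            next(current)            # restore its state and continue
            ready.append(current)    # still runnable: back of the queue
        except StopIteration:
            pass                     # task finished, drop it


log = []
run_round_robin([task("A", 2, log), task("B", 2, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1']
```

Each task continues from where it stopped, not from the top of its function. A real kernel saves registers and a stack pointer instead of a Python frame, but the shape of the operation is the same.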
What Is Inside CPU Context
Context is not a mysterious object. It contains at least the state needed for the CPU to continue later.
Common parts include:
- program counter: where the next instruction is
- stack pointer: where the current stack top is
- general registers: intermediate values, parameters, return values
- status register: interrupt enable, privilege level, condition flags
- floating-point or SIMD registers: if the execution flow uses them
- scheduling state: priority, ready/blocked state, time slice
Different architectures and operating systems save different details, but the goal is the same: after restoring the state, the execution flow should continue as if it had only been paused.
That is why a thread can block in the middle of a function and later resume right after the blocking call. The function did not keep the CPU. Its state was saved.
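As a rough illustration of the list above, the context can be written down as a plain record. The layout below is hypothetical (`CpuContext`, `ToyCpu`, four registers) and does not match any real architecture or ABI. It only shows that a context is data that can be copied out and copied back.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CpuContext:
    """Hypothetical saved state for one execution flow."""
    pc: int = 0                                   # program counter
    sp: int = 0                                   # stack pointer
    regs: List[int] = field(default_factory=lambda: [0] * 4)  # general registers
    status: int = 0                               # flags, interrupt enable, privilege
    fp_regs: Optional[List[float]] = None         # only present if FP/SIMD is used


class ToyCpu:
    """A pretend CPU whose live state is just one CpuContext."""

    def __init__(self):
        self.live = CpuContext()

    def save(self) -> CpuContext:
        # Deep copy, so later register changes do not alter the snapshot.
        return copy.deepcopy(self.live)

    def restore(self, ctx: CpuContext) -> None:
        self.live = copy.deepcopy(ctx)


cpu = ToyCpu()
cpu.live.pc, cpu.live.sp, cpu.live.regs[0] = 0x1000, 0x2000_0400, 42
saved_a = cpu.save()                               # flow A is paused here
cpu.live = CpuContext(pc=0x3000, sp=0x2000_0800)   # another flow runs
cpu.restore(saved_a)                               # flow A continues
print(cpu.live == saved_a)  # True
```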
What a Thread Switch Usually Switches
Threads inside the same process share an address space, so a thread switch usually does not replace the whole virtual memory mapping.
But each thread has its own execution state:
- its own stack
- its own register state
- its own program counter
- its own thread-local state
- its own scheduling state
When Thread A blocks and the scheduler switches to Thread B, the operation is essentially:
save A's registers and stack position
mark A as blocked or ready
choose B
restore B's registers and stack position
mark B as running
Because the address space does not change, B sees the same process memory after the switch: the threads share global variables and heap memory. That makes communication cheap and shared state dangerous.
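A small sketch with Python's standard `threading` module shows both sides: each thread has its own locals and thread-local storage, while a heap object is visible to all of them and needs a lock. The names `worker`, `shared_counter`, and `per_thread` are illustrative only.

```python
import threading

shared_counter = {"value": 0}      # heap object, visible to every thread
lock = threading.Lock()
per_thread = threading.local()     # each thread sees its own attributes


def worker(name, iterations, results):
    per_thread.name = name         # thread-local: not visible to other threads
    local_sum = 0                  # local variable: lives in this thread's own frame
    for _ in range(iterations):
        local_sum += 1
        with lock:                 # shared state needs synchronization
            shared_counter["value"] += 1
    results[per_thread.name] = local_sum


results = {}
threads = [threading.Thread(target=worker, args=(n, 1000, results))
           for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_counter["value"])  # 2000
print(results)                  # each thread counted its own 1000
```

The per-thread sums never interfere with each other. The shared counter stays correct only because every update goes through the lock.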
Why a Process Switch Is Usually Heavier
A process switch often changes the address space in addition to execution state.
Two normal processes have different virtual address spaces. The same virtual address in Process A and Process B may map to completely different physical pages. When switching to another process, the system must make the CPU use another address-space description.
That adds cost:
- page-table or address-space registers may change
- TLB entries may be flushed or tagged by address-space identifiers
- CPU cache locality may degrade
- the kernel must maintain different resource and permission boundaries
So process switches are usually heavier than thread switches inside one process. The benefit is isolation: a memory bug in Process A usually does not directly corrupt Process B’s address space.
This is a common design tradeoff. Process boundaries are clearer, but communication and switching cost more. Threads share more and are lighter, but synchronization risk is higher.
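The address-space part can be sketched with a toy model. Everything here is hypothetical (`AddressSpace`, `ToyMmu`, a one-entry page table). It only illustrates two points from above: the same virtual page maps to different physical frames per process, and a TLB without address-space tags loses its entries on every switch.

```python
class AddressSpace:
    """Hypothetical per-process page table: virtual page -> physical frame."""

    def __init__(self, asid, page_table):
        self.asid = asid
        self.page_table = dict(page_table)


class ToyMmu:
    def __init__(self, tagged_tlb):
        self.tagged_tlb = tagged_tlb
        self.tlb = {}              # (asid, virtual page) -> frame
        self.current = None
        self.tlb_misses = 0

    def switch_address_space(self, space):
        self.current = space
        if not self.tagged_tlb:
            self.tlb.clear()       # untagged TLB: old translations are dropped

    def translate(self, vpn):
        key = (self.current.asid, vpn)
        if key not in self.tlb:
            self.tlb_misses += 1   # simulated page-table walk
            self.tlb[key] = self.current.page_table[vpn]
        return self.tlb[key]


def run(tagged):
    proc_a = AddressSpace(asid=1, page_table={0x10: 0xA0})
    proc_b = AddressSpace(asid=2, page_table={0x10: 0xB0})
    mmu = ToyMmu(tagged_tlb=tagged)
    frames = []
    for space in (proc_a, proc_b, proc_a, proc_b):
        mmu.switch_address_space(space)
        frames.append(mmu.translate(0x10))
    return frames, mmu.tlb_misses


print(run(tagged=False))  # ([160, 176, 160, 176], 4)
print(run(tagged=True))   # ([160, 176, 160, 176], 2)
```

Both runs return the same frames, but the untagged run walks the page table after every switch. A thread switch inside one process would skip `switch_address_space` entirely.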
Why RTOS Task Switches Look More Direct
In many small RTOSes, a task switch is closer to “save the current task’s registers and stack, then restore another task’s registers and stack.”
Many MCUs do not have a full MMU and do not create a separate virtual address space for each task. Multiple tasks share one system address space, including peripheral registers, globals, static buffers, and heap memory.
That makes task switching short and predictable, which fits real-time control.
The cost is weaker isolation:
- a bad pointer in one task may corrupt another task’s data
- a task stack overflow may corrupt system memory
- light switching does not make shared state safe
- many tasks still increase stack use and scheduling overhead
So a light RTOS task switch does not mean tasks can be split arbitrarily. Task boundaries still need to serve response paths and resource management.
How Interrupts Relate to Context Switching
On interrupt entry, the CPU also saves state. Otherwise it could not return to the interrupted execution.
But interrupt state saving and thread/task switching are not exactly the same.
Interrupt handling usually looks like:
Current execution flow is running
-> hardware event occurs
-> CPU saves required state
-> enter ISR
-> ISR handles event
-> return to original flow, or trigger scheduling
If the ISR only handles the event and returns, the CPU restores the original execution flow. If the ISR wakes a higher-priority task, a real task switch may happen on the return path.
Task A is running
-> interrupt occurs
-> ISR wakes Task B
-> scheduler chooses B
-> save/update A's task context
-> restore B's task context
-> return into B
So “an interrupt switched to another task” does not mean interrupt entry itself is the same as task switching. The interrupt changed the scheduling condition.
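The two return paths can be compared in a toy simulation. This is a sketch under simple assumptions: a fixed-priority scheduler where a larger number means higher priority, and hypothetical names (`Task`, `handle_interrupt`, and the two ISR functions). It is not the API of any particular RTOS.

```python
class Task:
    def __init__(self, name, priority, state="ready"):
        self.name, self.priority, self.state = name, priority, state


def highest_ready(tasks):
    runnable = [t for t in tasks if t.state in ("ready", "running")]
    return max(runnable, key=lambda t: t.priority)


def handle_interrupt(tasks, running, isr):
    """Run an ISR, then decide on the return path who runs next."""
    trace = [f"interrupt while {running.name} runs"]
    isr(tasks)                                   # the ISR may wake a blocked task
    chosen = highest_ready(tasks)
    if chosen is running:
        trace.append(f"return to {running.name}")  # plain interrupt return
    else:
        running.state = "ready"                  # preempted, still runnable
        chosen.state = "running"
        trace.append(f"switch {running.name} -> {chosen.name}")
    return chosen, trace


a = Task("A", priority=1, state="running")
b = Task("B", priority=2, state="blocked")
tasks = [a, b]


def isr_no_wakeup(tasks):
    pass                                         # handles the event, wakes nobody


def isr_wakes_b(tasks):
    b.state = "ready"                            # e.g. data arrived for B


current, trace = handle_interrupt(tasks, a, isr_no_wakeup)
print(trace[-1])  # return to A
current, trace = handle_interrupt(tasks, current, isr_wakes_b)
print(trace[-1])  # switch A -> B
```

The interrupt entry is identical in both calls. Only the second ISR changes which task is the highest-priority ready one, and that is what causes the switch.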
Why Context Switching Has Cost
The direct cost of a context switch is the instruction work needed to save and restore state. But that is not the whole cost.
Hidden costs include:
- worse cache locality
- lower TLB hit rate
- disrupted branch prediction state
- more lock and queue transitions
- scheduler work to maintain ready queues
- less time spent doing actual application work
If a system spends too much time switching, the CPU may look busy while useful throughput stays low.
That is why multithreading does not automatically make code faster. For CPU-bound work, far more threads than cores can make switching and cache disruption eat the benefit. For I/O-heavy work, threads can hide waiting, but task count and shared-state complexity still matter.
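A back-of-the-envelope calculation makes the cost visible. The figure of 5 microseconds per switch is an assumed number for illustration, meant to include indirect effects such as cache refill. Real costs vary widely by hardware and workload, and `switch_overhead_fraction` is a made-up helper.

```python
def switch_overhead_fraction(slice_us, switch_us):
    """Fraction of CPU time spent switching if every slice ends in one switch."""
    return switch_us / (slice_us + switch_us)


# Assumed cost per switch: 5 us (illustrative only).
for slice_us in (10_000, 1_000, 100, 10):
    frac = switch_overhead_fraction(slice_us, switch_us=5)
    print(f"run {slice_us:>6} us between switches -> {frac:.1%} overhead")
```

Under this assumption, work that runs 10 ms between switches barely notices the cost, while work that runs only 10 us between switches loses about a third of the CPU to switching.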
Voluntary and Preemptive Switches Are Different
An execution flow gives up the CPU for two broad reasons.
The first is voluntary switching. The current thread or task enters a wait:
- waits for a lock
- waits for a queue
- waits for I/O
- sleeps
- waits for a timer
- calls yield
This usually means it does not currently have the condition needed to continue.
The second is preemptive switching. The current execution flow could continue, but the system decides someone else should run:
- time slice expired
- higher-priority task became ready
- interrupt woke urgent work
- scheduling policy decides to rotate
Voluntary switches raise the question “what is it waiting for?” Preemptive switches raise the question “who was more eligible to run?” Separating them helps debugging.
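On Unix-like systems, the kernel keeps these two counts separately, and Python's standard `resource` module exposes them as `ru_nvcsw` (voluntary) and `ru_nivcsw` (involuntary). A minimal sketch, assuming a platform where `resource` is available and the counters are populated (some platforms report zeros):

```python
import resource
import time


def switch_counts():
    """Return (voluntary, involuntary) context-switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


vol_before, invol_before = switch_counts()

for _ in range(20):
    time.sleep(0.001)                    # waiting: tends to add voluntary switches
sum(i * i for i in range(200_000))       # computing: may be preempted by the scheduler

vol_after, invol_after = switch_counts()
print("voluntary:  ", vol_after - vol_before)
print("involuntary:", invol_after - invol_before)
```

The exact numbers depend on the machine and its load. The useful signal is the ratio: a mostly voluntary count points at waits, a mostly involuntary count points at CPU competition.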
What Happens If Tasks Are Split Too Fine
Splitting a system into multiple tasks can clarify structure and prevent one wait path from blocking another.
But splitting too finely creates new costs:
- every task needs a stack
- tasks need queues, locks, or events to communicate
- switches increase
- data is copied or synchronized more often
- scheduling relationships become harder to predict
- priority and deadlock analysis becomes harder
For example, a data pipeline split into eight tasks may look clean. But each packet now goes through multiple wakeups, enqueue/dequeue operations, and context switches. If each step is very short, switching cost can become visible.
A better rule is: split off a separate task when the work has independent response requirements, may block for a long time, has a clearly different priority, or needs to isolate a complex path.
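The eight-task pipeline example above can be put into rough numbers. The model is deliberately crude and its inputs are assumptions: one core, every stage boundary costs one handoff (wakeup, enqueue/dequeue, and switch) of 10 us, and the total work per packet is 16 us however it is split. `pipeline_cost_us` is a hypothetical helper.

```python
def pipeline_cost_us(packets, stages, work_per_stage_us, handoff_us):
    """Total single-core time when each stage is a task and each
    stage boundary costs one handoff. Returns (total_us, handoffs)."""
    handoffs = packets * (stages - 1)
    work = packets * stages * work_per_stage_us
    return work + handoffs * handoff_us, handoffs


# Same 16 us of work per packet, split two different ways.
fine = pipeline_cost_us(packets=1000, stages=8, work_per_stage_us=2, handoff_us=10)
coarse = pipeline_cost_us(packets=1000, stages=2, work_per_stage_us=8, handoff_us=10)
print(fine)    # (86000, 7000)
print(coarse)  # (26000, 1000)
```

In this model the fine-grained split spends far more time on handoffs than on the work itself. The numbers are invented, but the pattern appears whenever each step is short compared with the cost of handing off.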
What to Check During Context-Switch Problems
When the CPU is busy but throughput is low, RTOS response jitters, or more threads do not improve performance, ask these questions.
First, is the switch voluntary or preemptive? Is the task waiting for a lock, I/O, or queue, or is it being preempted by a time slice or higher-priority work?
Second, is the switch rate abnormal? Frequent wakeups, chains of short tasks, too many threads, or very short time slices can increase switching.
Third, is the task doing useful work? Is CPU time spent in scheduling, lock contention, and queue movement instead of application work?
Fourth, is the address space changing frequently? High-frequency process switching can add more TLB and cache cost.
Fifth, are RTOS task stacks sized reasonably? Too many tasks with large stacks waste RAM. Stacks that are too small cause hard-to-debug memory corruption.
Sixth, are interrupts frequently preempting tasks? A high interrupt rate can make tasks appear constantly interrupted.
This is more useful than just saying “too many context switches.” The real target is the wait, wakeup, or preemption path causing unnecessary switching.
What to Remember in Practice
A context switch does not switch functions or business concepts. It switches the state needed for the CPU to continue execution.
Thread or task switches usually save and restore registers, stack pointer, program counter, and scheduling state. Process switches may also change the address space. Interrupt entry and return also save required state, but they switch to another task only if scheduling conditions change.
Context switching lets multiple execution flows share the CPU, but it has cost. More tasks, finer waiting paths, more frequent wakeups, and more shared state make switching and scheduling complexity more visible.
When designing tasks and threads, do not only ask “can this be split.” Ask whether the logic really needs independent response, an independent blocking path, or an isolation boundary.