Why the RTOS Path From ISR to Task Must Be Designed

In RTOS-based devices, many peripheral problems are not about whether the interrupt fired. They are about whether a task handled the event correctly, and soon enough, after the ISR returned.

Common symptoms include:

  • ISR logs appear, but the business task receives no data
  • queues fill up when interrupt rate increases
  • ISR does too much work and system response gets worse
  • a high-priority task is woken but runs late
  • task-context APIs are called from an ISR and cause assertions or deadlocks
  • one notification is sent while the hardware FIFO already contains many events

Do not look only at the ISR function. The full path is:

hardware event
-> ISR acknowledges event, saves minimum state, clears source
-> ISR notifies a task or posts a message
-> scheduler switch is requested when needed
-> task handles complex logic in schedulable context

If the ISR-to-task path is wrong, fast interrupt entry still leads to slow response, lost events, or deadlock.

ISR Should Do the Minimum Required Work

An ISR runs in interrupt context, outside the scheduler's control. It usually must not block, wait on mutexes, wait on queues, perform blocking I/O, allocate from a heap that may block, or format complex logs.

Work that fits in an ISR:

  • read interrupt status
  • save necessary registers or small data
  • clear or acknowledge the interrupt source
  • post the event to a task
  • wake the deferred processing path

Work that does not fit in an ISR:

  • protocol parsing
  • large memcpy
  • file writes
  • network requests
  • waiting on locks
  • complex log formatting
  • long loops that process all business logic

The ISR’s goal is not to finish the work. It is to stop the hardware from repeatedly interrupting for the same event and hand the rest to task context.

Save the Event Before Clearing the Source

Many peripheral interrupts come from status bits, FIFO state, DMA completion bits, or error flags. The order of operations inside the ISR matters.

A common shape is:

read status
-> save required information
-> clear interrupt source
-> notify task

If status is cleared before data is captured, evidence may be lost. If the source is not cleared, the ISR may immediately re-enter. If the task is notified without saved state, the hardware state may already have changed by the time the task runs.

The task may run much later than the ISR. The ISR must save enough information for the task to know what happened.

That information may be:

  • status snapshot
  • a small amount of data from FIFO
  • DMA buffer index
  • error counters
  • timestamp or tick
  • event counter

Do not make the task rely only on reading current hardware state later. That is no longer the state at interrupt time.
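
The ordering above can be sketched on a host machine with a simulated peripheral. Everything here is hypothetical: `fake_uart_t`, the status bit names, `uart_clear_status`, and the single `last_event` slot stand in for a real register map and a real ISR-to-task channel. The point is only the sequence: read, save, clear what was seen, hand off.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical peripheral model. Real hardware would be memory-mapped. */
typedef struct {
    volatile uint32_t status;
    volatile uint32_t data;
} fake_uart_t;

#define STATUS_RX_READY (1u << 0)
#define STATUS_OVERRUN  (1u << 1)

/* What the ISR hands to the task: state as it was at interrupt time. */
typedef struct {
    uint32_t status_snapshot;
    uint32_t data;
    uint32_t tick;
} rx_event_t;

/* Keep the record small so copying it inside the ISR stays cheap. */
static_assert(sizeof(rx_event_t) <= 16, "event record should stay small");

static fake_uart_t uart0;
static rx_event_t  last_event;   /* stand-in for a queue slot */
static uint32_t    event_count;
static uint32_t    fake_tick;    /* stand-in for the RTOS tick counter */

/* Stand-in for a write-1-to-clear register write. */
void uart_clear_status(fake_uart_t *u, uint32_t bits)
{
    u->status &= ~bits;
}

void uart_isr(void)
{
    uint32_t status = uart0.status;            /* 1. read status once */
    rx_event_t ev = { status, 0, fake_tick };  /* 2. save what the task needs */
    if (status & STATUS_RX_READY)
        ev.data = uart0.data;
    uart_clear_status(&uart0, status);         /* 3. clear only the bits we saw */
    last_event = ev;                           /* 4. hand off to task context */
    event_count++;
}
```

Clearing only the bits captured in `status` means a flag that is raised between the read and the clear is not silently discarded; it stays set and triggers the next interrupt.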

Notification Method Must Match Event Semantics

Common ISR-to-task mechanisms include:

task notification: lightweight wakeup of one fixed task
semaphore: event occurrence or count
queue: event data or buffer pointer
event group: condition bits
stream/message buffer: continuous data

Choose by semantics.

If the ISR only wakes one fixed handler task, task notification is often the lightest.

If every interrupt must be counted, use a counting semaphore or notification count.

If data, buffer pointers, or error codes must be transferred, a queue is clearer.

If the ISR updates state such as “device ready”, “link up”, or “DMA idle”, event bits fit better.

Do not use one binary semaphore for events that must be handled one by one. Consecutive interrupts can collapse into one wakeup, so the task only knows “something happened”, not how many times.
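
The collapse is easy to reproduce with a host-side model. The sketch below is not a real RTOS semaphore; `sem_model_t` and its functions are hypothetical and only track the count, assuming the usual behavior that a "give" on a full semaphore is absorbed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-side model of a semaphore count. max_count == 1 models a binary semaphore. */
typedef struct {
    uint32_t count;
    uint32_t max_count;
} sem_model_t;

void model_sem_init(sem_model_t *s, uint32_t max_count)
{
    assert(max_count > 0);
    s->count = 0;
    s->max_count = max_count;
}

/* "Give" from the ISR: saturates at max_count. Returns false if absorbed. */
bool model_sem_give(sem_model_t *s)
{
    if (s->count >= s->max_count)
        return false;
    s->count++;
    return true;
}

/* Non-blocking "take" from the task. */
bool model_sem_take(sem_model_t *s)
{
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/* Fire n interrupts before the task gets to run, then count how many
 * times the task is able to take the semaphore. */
uint32_t wakeups_after_burst(uint32_t max_count, uint32_t n_interrupts)
{
    sem_model_t s;
    model_sem_init(&s, max_count);
    for (uint32_t i = 0; i < n_interrupts; i++)
        (void)model_sem_give(&s);

    uint32_t handled = 0;
    while (model_sem_take(&s))
        handled++;
    return handled;
}
```

With three interrupts in a burst, `wakeups_after_burst(1, 3)` yields 1 while `wakeups_after_burst(8, 3)` yields 3. A counting semaphore also saturates at its maximum, so the limit still has to be sized for the worst burst.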

Use ISR-Safe APIs From ISR

RTOSes usually provide ISR-safe APIs, often named with FromISR or a similar suffix. They never block, and they modify kernel objects in a way that is valid from interrupt context.

Calling normal task-context APIs from ISR can cause:

  • assertions
  • attempts to block
  • invalid lock semantics
  • scheduler state corruption
  • critical-section nesting errors

ISR-safe APIs often return a flag indicating whether a higher-priority task was woken. Before leaving the ISR, the code should request a context switch based on that flag.

Otherwise the high-priority task may be ready but not run until the next tick or scheduling point, increasing response latency.
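
The flag-then-yield shape can be illustrated with a miniature kernel model that runs on a host. The names `notify_from_isr`, `yield_from_isr`, `task_t`, and the "larger number means higher priority" rule are assumptions made for this sketch, not any particular RTOS's API. FreeRTOS, for example, follows the same shape with a "higher priority task woken" out-parameter on its FromISR calls and `portYIELD_FROM_ISR()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature kernel state. */
typedef struct {
    int  priority;   /* assumption: larger number = higher priority */
    bool ready;
} task_t;

static task_t *running_task;     /* task that the interrupt preempted */
static bool    switch_pending;   /* set when a context switch is requested */

/* Model of an ISR-safe notify: never blocks, and reports whether the
 * task it made ready outranks the interrupted task. */
void notify_from_isr(task_t *target, bool *higher_priority_woken)
{
    assert(target != NULL && higher_priority_woken != NULL);
    if (!target->ready) {
        target->ready = true;
        if (running_task == NULL || target->priority > running_task->priority)
            *higher_priority_woken = true;
    }
}

/* Model of the switch request made just before the ISR returns. */
void yield_from_isr(bool higher_priority_woken)
{
    if (higher_priority_woken)
        switch_pending = true;
}

static task_t handler_task = { .priority = 5, .ready = false };

void sensor_isr(void)
{
    bool woken = false;                      /* must start as false */
    notify_from_isr(&handler_task, &woken);  /* ISR-safe call only */
    yield_from_isr(woken);                   /* last step before returning */
}
```

If `yield_from_isr(woken)` is left out, `handler_task` is still marked ready, but nothing asks the scheduler to run it when the ISR returns.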

Woken Does Not Mean Running

After ISR notification, a task usually changes from blocked to ready. It still needs the scheduler to run it.

When it actually runs depends on:

  • priority of the woken task
  • whether preemption is enabled
  • whether the ISR requested a scheduling switch
  • whether an even higher-priority task is ready
  • long interrupt-disabled or scheduler-locked sections
  • whether the target task immediately blocks on another lock or queue

Separate three delays:

interrupt latency: event to ISR entry
ISR time: how long ISR itself runs
scheduling latency: task wakeup to task execution

Measuring ISR entry time alone does not prove business response time.
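
One way to keep the three delays apart is to timestamp each boundary and subtract. The sketch below assumes a free-running 32-bit cycle counter and hypothetical field names; it also approximates the wakeup instant by ISR exit, since the task cannot start before the ISR returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timestamps in timer cycles from a free-running counter. */
typedef struct {
    uint32_t t_event;       /* hardware event, e.g. from an input capture */
    uint32_t t_isr_entry;
    uint32_t t_isr_exit;    /* the task was made ready before this point */
    uint32_t t_task_start;  /* first instruction of the handler task */
} path_stamps_t;

typedef struct {
    uint32_t interrupt_latency;
    uint32_t isr_time;
    uint32_t scheduling_latency;
    uint32_t total;
} path_delays_t;

path_delays_t path_delays(const path_stamps_t *s)
{
    assert(s != NULL);
    path_delays_t d;
    /* Unsigned subtraction stays correct across one counter wraparound. */
    d.interrupt_latency  = s->t_isr_entry  - s->t_event;
    d.isr_time           = s->t_isr_exit   - s->t_isr_entry;
    d.scheduling_latency = s->t_task_start - s->t_isr_exit;
    d.total              = s->t_task_start - s->t_event;
    return d;
}
```

For stamps of 1000, 1012, 1052, and 1400 cycles, the interrupt latency is 12, the ISR time is 40, and the scheduling latency is 348 out of a total of 400. In that made-up case the ISR entry looks fast, yet most of the response time is spent waiting for the task to run.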

Deferred Work Must Handle Bursts

Moving work from ISR to task is correct, but the deferred task must handle bursts.

Common issues:

  • queue too small
  • handler task priority too low
  • each batch processes too little
  • ISR produces events faster than task consumes
  • no policy when queue is full
  • errors and normal events compete in one channel

For bursty peripherals, decide:

  • what happens when the queue is full
  • whether old data may be dropped
  • whether only the latest state is enough
  • whether errors must never be dropped
  • how much work the task handles per wakeup
  • whether backpressure or temporary interrupt masking is needed

An ISR-to-task path without an overload policy only works at low load.
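
A minimal sketch of one possible policy follows. It assumes that the newest data event is dropped and counted when the queue is full, and that errors travel as sticky bits outside the queue so they cannot be crowded out. `event_path_t` and the `evq_*` functions are hypothetical, and the model is single-context: a real ISR/task pair would still need protection around the shared fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EVQ_DEPTH 4u
static_assert(EVQ_DEPTH > 0, "queue needs at least one slot");

typedef struct {
    uint16_t items[EVQ_DEPTH];
    uint32_t head;         /* next slot to read */
    uint32_t count;        /* items currently queued */
    uint32_t dropped;      /* data events lost to overload */
    uint32_t error_flags;  /* sticky bits: errors bypass the queue */
} event_path_t;

/* Assumed policy: when full, drop the newest data event and count it. */
bool evq_post_data(event_path_t *q, uint16_t sample)
{
    if (q->count == EVQ_DEPTH) {
        q->dropped++;
        return false;
    }
    q->items[(q->head + q->count) % EVQ_DEPTH] = sample;
    q->count++;
    return true;
}

/* Errors are OR-ed into a bitmask, so a full data queue cannot lose them. */
void evq_post_error(event_path_t *q, uint32_t error_bit)
{
    q->error_flags |= error_bit;
}

bool evq_take_data(event_path_t *q, uint16_t *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->head];
    q->head = (q->head + 1u) % EVQ_DEPTH;
    q->count--;
    return true;
}

/* Task side: fetch and clear pending errors. On real hardware this
 * read-and-clear must be atomic with respect to the ISR. */
uint32_t evq_take_errors(event_path_t *q)
{
    uint32_t e = q->error_flags;
    q->error_flags = 0;
    return e;
}
```

Posting six samples into the four-slot queue leaves four queued and `dropped == 2`, which turns overload into a number that can be logged instead of a silent loss.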

High-Frequency Interrupts Need Lower Per-Event Cost

At high interrupt rates, fixed ISR cost is amplified.

Avoid:

  • complex logs on every interrupt
  • dynamic allocation on every interrupt
  • posting large messages every time
  • waking a task for one byte at a time
  • heavy locks or long critical sections

Better patterns include:

  • batch FIFO reads
  • use DMA to reduce interrupt rate
  • store data in ring buffers
  • ISR only updates write index
  • task consumes in batches
  • keep counters for drops and overruns
  • temporarily mask interrupt and let task re-enable it if needed

The key is not only making every ISR fast. It is reducing interrupt count, reducing fixed per-interrupt cost, and letting the task process in batches.
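
The batching idea can be sketched with a simulated hardware FIFO. All names here (`hw_fifo_t`, `rx_isr`, `rx_task_drain`, `hw_receive`, the sizes) are hypothetical, and `task_wakeups` stands in for a single notify-from-ISR call. The indexes are plain integers because this is a host model; a real ISR/task pair needs atomic or otherwise protected indexes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HW_FIFO_DEPTH 16u
#define RING_SIZE     64u   /* power of two, so free-running indexes wrap cleanly */
static_assert(RING_SIZE >= HW_FIFO_DEPTH, "ring must hold one full FIFO drain");

/* Hypothetical hardware FIFO model. */
typedef struct {
    uint8_t  bytes[HW_FIFO_DEPTH];
    uint32_t rd;      /* next byte to read */
    uint32_t level;   /* bytes currently waiting */
} hw_fifo_t;

static hw_fifo_t fifo;
static uint8_t   ring[RING_SIZE];
static uint32_t  ring_wr, ring_rd;   /* free-running write/read indexes */
static uint32_t  overruns;           /* bytes lost because the ring was full */
static uint32_t  task_wakeups;       /* stand-in for notify-from-ISR calls */

/* Test helper: the "hardware" receives one byte. */
void hw_receive(uint8_t b)
{
    if (fifo.level < HW_FIFO_DEPTH) {
        fifo.bytes[(fifo.rd + fifo.level) % HW_FIFO_DEPTH] = b;
        fifo.level++;
    }
}

/* ISR: drain the whole FIFO (bounded by its depth), then wake the task once. */
void rx_isr(void)
{
    uint32_t moved = 0;
    while (fifo.level > 0) {
        uint8_t b = fifo.bytes[fifo.rd];
        fifo.rd = (fifo.rd + 1u) % HW_FIFO_DEPTH;
        fifo.level--;
        if (ring_wr - ring_rd == RING_SIZE) {   /* ring full: count, do not block */
            overruns++;
            continue;
        }
        ring[ring_wr % RING_SIZE] = b;
        ring_wr++;
        moved++;
    }
    if (moved > 0)
        task_wakeups++;                         /* one wakeup per batch */
}

/* Task: consume everything available in a single wakeup. */
size_t rx_task_drain(uint8_t *out, size_t max)
{
    size_t n = 0;
    while (ring_rd != ring_wr && n < max) {
        out[n++] = ring[ring_rd % RING_SIZE];
        ring_rd++;
    }
    return n;
}
```

Five bytes that arrive before the ISR runs cost one ISR pass and one task wakeup here, instead of five of each.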

Shared Data Needs Real Protection

ISR and task often share state: ring buffer indexes, status bits, counters, DMA buffer ownership.

volatile alone is not synchronization. Choose based on access pattern:

  • atomics
  • short critical sections
  • masking the specific interrupt
  • single-producer single-consumer ring rules
  • memory barriers or cache maintenance
  • mutexes only in task context, never taken from an ISR

A common safe model is: ISR writes only the producer position; task writes only the consumer position; both read the other side under clear rules. This reduces locks and interrupt masking.

If shared state needs long protection, the ISR is probably doing too much and more work should move to the task.
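
The single-producer single-consumer model can be sketched with C11 atomics. This is a minimal sketch under stated assumptions: exactly one producer (the ISR) and one consumer (the task), a power-of-two size, and a platform where `<stdatomic.h>` is available. The type and function names are hypothetical, and cache maintenance for DMA or multi-core parts is not covered.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SPSC_SIZE 8u
static_assert((SPSC_SIZE & (SPSC_SIZE - 1u)) == 0, "size must be a power of two");

typedef struct {
    uint8_t          slots[SPSC_SIZE];
    _Atomic uint32_t head;   /* written only by the producer (ISR)  */
    _Atomic uint32_t tail;   /* written only by the consumer (task) */
} spsc_ring_t;

/* Producer side, intended for the ISR. Never blocks. */
bool spsc_push(spsc_ring_t *r, uint8_t value)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == SPSC_SIZE)
        return false;                      /* full: caller counts the drop */
    r->slots[head & (SPSC_SIZE - 1u)] = value;
    /* Release: the slot write is visible before the new head is. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side, intended for the task. */
bool spsc_pop(spsc_ring_t *r, uint8_t *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;                      /* empty */
    *out = r->slots[tail & (SPSC_SIZE - 1u)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}
```

Each index has exactly one writer, so neither side needs a lock or an interrupt-disabled section. The indexes run freely and are masked only when addressing a slot, which lets all `SPSC_SIZE` slots be used.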

Debugging Order

When “the interrupt happens but business logic does not react”, inspect:

hardware event exists
-> ISR entered
-> interrupt source cleared correctly
-> ISR saved required state
-> notification API is ISR-safe
-> context switch was requested when needed
-> task moved from blocked to ready
-> task actually ran
-> no queue/notification overflow or unexpected coalescing occurred
-> deferred task consumed fast enough

If the issue appears only at higher interrupt rate, focus on queue depth, drop counters, ISR execution time, task priority, and scheduling latency.
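
One way to make this inspection order measurable is a counter per stage, then a search for the first stage whose count falls behind the one before it. The stage names and the `path_trace_t` helper below are hypothetical. A gap is a place to look rather than proof of a bug, since a coalescing primitive legitimately wakes the task fewer times than the ISR notified it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stages along the ISR-to-task path, in order. */
enum {
    STAGE_ISR_ENTERED,
    STAGE_STATE_SAVED,
    STAGE_NOTIFY_OK,     /* ISR-safe post succeeded (queue not full) */
    STAGE_TASK_WOKE,
    STAGE_TASK_DONE,
    STAGE_COUNT
};

typedef struct {
    uint32_t hits[STAGE_COUNT];
} path_trace_t;

/* Cheap enough to call from an ISR: one bounds check and one increment. */
void path_mark(path_trace_t *t, int stage)
{
    assert(t != NULL && stage >= 0 && stage < STAGE_COUNT);
    t->hits[stage]++;
}

/* Returns the first stage that was reached fewer times than the stage
 * before it, or STAGE_COUNT if every stage kept up. */
int path_first_gap(const path_trace_t *t)
{
    for (int s = 1; s < STAGE_COUNT; s++) {
        if (t->hits[s] < t->hits[s - 1])
            return s;
    }
    return STAGE_COUNT;
}
```

If ten interrupts are recorded but only seven posts succeed, `path_first_gap` points at `STAGE_NOTIFY_OK`, which narrows the search to queue depth and the full-queue policy.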

ISR to Task Is a Complete Data Path

RTOS interrupt design does not end with a short ISR. The ISR is the top half, task handling is the bottom half, and between them sit notification objects, scheduler behavior, queue capacity, shared data, and overload policy.

A stable design says what the ISR saves, what it clears, whom it notifies, which primitive it uses, whether it requests an immediate switch, how the task batches work, and what gets dropped under overload.

Once that path is explicit, “the interrupt entered but the application did nothing” becomes measurable and debuggable.