How to Choose RTOS Queues, Semaphores, Mutexes, and Event Groups

RTOSes provide many synchronization primitives: queues, semaphores, mutexes, event groups, and task notifications. They can all appear to mean “make one task wait for another”, so they are easy to mix up.

Picking whichever one compiles may work briefly, but the resulting field failures are hard to reason about:

  • events are occasionally lost
  • a high-priority task is delayed by a low-priority task
  • a full queue stalls the system
  • ISR code calls an API that can block
  • combined conditions are checked incorrectly
  • data is updated, but the consumer sees stale state

A first-pass selection model is:

queue: transfer data and ownership
semaphore: represent events or resource counts
mutex: protect shared state
event group: represent combinations of conditions
task notification: lightweight notification to one task

Choosing the wrong primitive usually means that two or more of these concerns (data, events, resources, state, and condition combinations) have been mixed into a single object.

A Queue Transfers Data and Ownership

A queue is best when a producer hands a message to a consumer.

It answers two questions:

  • where the data goes
  • who owns it now

Typical uses include:

  • ISR posts a small message to a task
  • communication task sends parsed results to business logic
  • sampling task sends data to storage task
  • multiple producers queue work to one consumer

The advantage is that data and wakeup are connected. When the consumer wakes, it knows which message caused it.

Boundaries:

  • large messages add copy and memory pressure
  • full queues need a drop, overwrite, block, or error policy
  • queues do not protect long-lived shared state
  • queues are not ideal for complex condition combinations

If there is no data to transfer and only a wakeup is needed, a queue may be too heavy.
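The hand-off can be sketched on a host machine. This is a minimal model using Python's standard `queue` module, not an RTOS API; `SampleMsg`, `post`, and `consume_all` are hypothetical names chosen for illustration. It shows two points from this section: the message itself tells the consumer what woke it, and a full queue forces an explicit policy (here, an error return that leaves ownership with the sender).

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleMsg:
    source: str   # which producer created the message
    value: int    # payload copied into the message


def post(q: queue.Queue, msg: SampleMsg) -> bool:
    """Non-blocking send. On False the caller still owns msg and must decide what to do."""
    try:
        q.put_nowait(msg)
        return True
    except queue.Full:
        return False


def consume_all(q: queue.Queue, timeout_s: float = 0.01) -> list:
    """Receive until the queue stays empty for timeout_s."""
    handled = []
    while True:
        try:
            msg = q.get(timeout=timeout_s)
        except queue.Empty:
            return handled
        # The consumer knows which message caused the wakeup.
        handled.append((msg.source, msg.value))


q = queue.Queue(maxsize=2)   # bounded, like a fixed-length RTOS queue
results = [post(q, SampleMsg("adc", v)) for v in (10, 11, 12)]
received = consume_all(q)

print(results)    # [True, True, False]
print(received)   # [('adc', 10), ('adc', 11)]
```

The third post fails because the queue holds only two items. Whether that failure should become a drop, an overwrite, a block, or an error is a system decision, not something the queue decides for you.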

A Semaphore Represents Events or Counts

A semaphore fits “something happened” or “N resources are available”.

A binary semaphore often signals an event:

ISR or task gives
-> target task takes
-> target task handles event

A counting semaphore is useful for resource counts, such as buffers, connection slots, or tokens.

The important limitation is that a semaphore usually carries no data. After wakeup, the task must read state somewhere else. If that shared state is not protected, races appear.

A common misuse is treating a semaphore as a mutex. A plain semaphore usually has neither ownership semantics nor priority inheritance, so a high-priority task can be blocked behind a low-priority holder while unrelated medium-priority tasks keep running. This is priority inversion.

Use a mutex for shared state. Use a semaphore for events or resource counts.
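Both semaphore roles can be modeled with Python's `threading` module. This is only a host-side sketch: `data_ready`, `buffer_slots`, and `worker` are hypothetical names, and Python's `Semaphore(0)` is really a counting semaphore used here in a binary style.

```python
import threading

data_ready = threading.Semaphore(0)           # event signal, starts "not given"
buffer_slots = threading.BoundedSemaphore(2)  # resource count: two buffers free

handled = []


def worker():
    # Block until the event is given, then handle it. The timeout is the
    # fallback path if the signal never arrives.
    if data_ready.acquire(timeout=1.0):
        handled.append("event")


t = threading.Thread(target=worker)
t.start()
data_ready.release()   # the "give" side; carries no data, only the wakeup
t.join()

# Counting use: take slots without blocking until none remain.
got = [buffer_slots.acquire(blocking=False) for _ in range(3)]
buffer_slots.release()                                   # return one slot
got_after_release = buffer_slots.acquire(blocking=False)

print(handled)            # ['event']
print(got)                # [True, True, False]
print(got_after_release)  # True
```

Note that `worker` learns only that something happened. Any detail about what happened would have to be read from state stored elsewhere, which is exactly where unprotected shared data turns into a race.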

A Mutex Protects Shared State

A mutex means that only one execution context may access a piece of shared state at a time.

It fits:

  • shared structures
  • device state machines
  • configuration objects
  • statistics
  • driver or protocol stack internal state

The engineering difference from a semaphore is ownership and, often, priority inheritance.

That matters in RTOS systems. If a high-priority task waits for a lock held by a low-priority task, priority inheritance can temporarily boost the lock holder so it can release the mutex sooner.
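The effect can be illustrated with a toy tick-by-tick model. This is not a real scheduler, and the tick counts and the function name `ticks_until_lock_released` are assumptions made up for the example. A low-priority task holds the lock and needs 3 more ticks of CPU, an unrelated medium-priority task is ready with 5 ticks of work, and a high-priority task is blocked on the lock.

```python
LOW, MEDIUM, HIGH = 1, 2, 3


def ticks_until_lock_released(priority_inheritance: bool) -> int:
    """Count ticks until the low-priority holder finishes and releases the lock."""
    low_remaining = 3      # CPU ticks the lock holder still needs
    medium_remaining = 5   # CPU ticks of unrelated medium-priority work
    tick = 0
    while low_remaining > 0:
        # The high-priority task is blocked, so only the holder and the
        # medium task compete for the CPU. With inheritance, the holder
        # temporarily runs at the waiter's priority.
        holder_priority = HIGH if priority_inheritance else LOW
        if medium_remaining > 0 and MEDIUM > holder_priority:
            medium_remaining -= 1   # medium task preempts the holder
        else:
            low_remaining -= 1      # holder makes progress toward release
        tick += 1
    return tick


without_pi = ticks_until_lock_released(priority_inheritance=False)
with_pi = ticks_until_lock_released(priority_inheritance=True)

print(without_pi)  # 8
print(with_pi)     # 3
```

Without inheritance, the high-priority task's wait includes all of the medium task's work, which has nothing to do with the lock. With inheritance, the wait is bounded by the holder's remaining critical section.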

Mutex boundaries:

  • do not take a blocking mutex in ISR context
  • keep lock hold time short
  • do not hold a mutex while waiting for queues, I/O, or complex callbacks
  • keep lock ordering fixed to avoid deadlocks
  • do not use a mutex to mean “an event happened”

Mutexes protect state. They do not transfer data or count events.
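A short sketch of the "keep lock hold time short" rule, modeled with Python's `threading.Lock`. The `Stats` class and the `report` and `hammer` helpers are hypothetical, and Python's lock has no priority inheritance, so this shows only the locking pattern, not RTOS scheduling behavior.

```python
import threading


class Stats:
    """Shared statistics protected by one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0
        self._total = 0

    def record(self, value: int) -> None:
        with self._lock:              # hold only for the two updates
            self._count += 1
            self._total += value

    def snapshot(self) -> tuple:
        with self._lock:              # copy out a consistent pair
            return (self._count, self._total)


def report(stats: Stats, sink: list) -> None:
    count, total = stats.snapshot()   # lock is already released here
    # Stand-in for slow I/O: done on the copy, outside the lock.
    sink.append(f"count={count} total={total}")


def hammer(stats: Stats, n: int) -> None:
    for _ in range(n):
        stats.record(1)


stats = Stats()
workers = [threading.Thread(target=hammer, args=(stats, 1000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

sink = []
report(stats, sink)
print(sink)  # ['count=4000 total=4000']
```

Taking a snapshot under the lock and doing the slow work on the copy keeps the hold time bounded regardless of how long the output path takes.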

An Event Group Represents Condition Combinations

An event group is useful when multiple condition bits need to be combined.

For example, a communication task may wait for:

network connected
time synchronized
certificate loaded
configuration ready

These conditions can be represented as bits. A task can wait for one or for all of them.

Event groups are good at:

  • wait for any event
  • wait for all events
  • inspect the current condition set
  • clear consumed bits

They do not carry data and do not count an unbounded number of occurrences. A bit represents state or condition, not the full history of each event.

If every occurrence must be handled, use a queue or counting semaphore. If only current readiness matters, an event group is a better fit.
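The wait-for-any and wait-for-all behavior can be modeled with a small class built on `threading.Condition`. This `EventGroup` class is a hypothetical host-side model, only loosely shaped like the wait-bits calls that RTOS event groups offer, and the bit names are taken from the example conditions above.

```python
import threading

NET_CONNECTED = 1 << 0
TIME_SYNCED = 1 << 1
CERT_LOADED = 1 << 2
CONFIG_READY = 1 << 3
ALL_READY = NET_CONNECTED | TIME_SYNCED | CERT_LOADED | CONFIG_READY


class EventGroup:
    def __init__(self):
        self._cond = threading.Condition()
        self._bits = 0

    def set_bits(self, bits: int) -> None:
        with self._cond:
            self._bits |= bits        # setting an already-set bit changes nothing
            self._cond.notify_all()

    def wait_bits(self, mask, wait_all, clear_on_exit=False, timeout=None) -> int:
        """Return the bits seen when the wait ended (satisfied or timed out)."""
        def satisfied():
            hit = self._bits & mask
            return hit == mask if wait_all else hit != 0

        with self._cond:
            self._cond.wait_for(satisfied, timeout=timeout)
            seen = self._bits
            if clear_on_exit and satisfied():
                self._bits &= ~mask   # consume the bits that were waited on
            return seen


group = EventGroup()
group.set_bits(NET_CONNECTED | TIME_SYNCED)

any_seen = group.wait_bits(ALL_READY, wait_all=False, timeout=0)
early = group.wait_bits(ALL_READY, wait_all=True, timeout=0.01)
all_ready_early = (early & ALL_READY) == ALL_READY

group.set_bits(CERT_LOADED | CONFIG_READY)
final = group.wait_bits(ALL_READY, wait_all=True, clear_on_exit=True, timeout=0)
remaining = group.wait_bits(ALL_READY, wait_all=False, timeout=0)

print(bin(any_seen), all_ready_early, bin(final), remaining)
# 0b11 False 0b1111 0
```

The caller must check the returned bits after a timed wait, because a timeout and a satisfied wait both return a bit pattern.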

A Task Notification Is a Lightweight Direct Signal

Many RTOSes provide task notifications. The notification state is usually embedded in the task control block, so no separate queue or semaphore object is required.

It fits:

  • one ISR waking one fixed task
  • one producer notifying one consumer
  • lightweight event flags
  • simple counts
  • replacement for a simple binary or counting semaphore

The advantage is speed and low memory overhead.

Its boundaries are also clear:

  • usually targets one task
  • not a multi-consumer queue
  • not for complex data transfer
  • not for multiple tasks competing for one resource
  • overwrite, count, or bit semantics must be understood

Task notifications are excellent on hot paths, but do not make system relationships unreadable just to save an object.
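The difference between count and clear semantics is easy to get wrong, so here is a small host-side model. The `Task` class and its `notify_give` and `notify_take` methods are hypothetical; they only imitate the general idea of a notification counter stored inside the task object, and a real RTOS may define the return value and clearing rules differently.

```python
import threading


class Task:
    """A task whose notification counter lives in the task object itself."""

    def __init__(self, name: str):
        self.name = name
        self._cond = threading.Condition()
        self._notify_count = 0

    def notify_give(self) -> None:
        with self._cond:
            self._notify_count += 1
            self._cond.notify()

    def notify_take(self, clear_on_exit: bool, timeout=None) -> int:
        """Return the pending count seen on wakeup (0 means timeout)."""
        with self._cond:
            self._cond.wait_for(lambda: self._notify_count > 0, timeout=timeout)
            pending = self._notify_count
            if pending:
                # Binary style clears everything; counting style consumes one.
                self._notify_count = 0 if clear_on_exit else pending - 1
            return pending


task = Task("rx_handler")

for _ in range(3):
    task.notify_give()
binary_style = [task.notify_take(clear_on_exit=True, timeout=0) for _ in range(2)]

for _ in range(3):
    task.notify_give()
counting_style = [task.notify_take(clear_on_exit=False, timeout=0) for _ in range(4)]

print(binary_style)    # [3, 0]
print(counting_style)  # [3, 2, 1, 0]
```

In the binary style, a burst of three notifications produces one wakeup, so the task must drain all pending work itself. In the counting style, each notification produces its own wakeup.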

ISR-Safe APIs Matter

RTOS APIs usually distinguish task-context APIs from ISR-safe APIs.

An ISR must not block or call APIs that may sleep. It should usually:

  • read minimum hardware state
  • clear the interrupt source
  • post a message, give a semaphore, or notify a task
  • request a context switch when needed

When notifying from an ISR, check:

  • whether the ISR-safe API is used
  • whether a context switch should be requested
  • what happens if a queue is full or a notification is overwritten
  • whether ISR and task share data
  • whether the backend task can keep up with interrupt rate

The ISR-to-task path determines response latency, event loss policy, and system pressure.
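The shape of such a handler can be modeled on a host. Everything here is an assumption for illustration: `FakeUart` stands in for a peripheral, `uart_isr` stands in for the interrupt handler, and the boolean return value plays the role of a "context switch requested" flag. Real code would use the RTOS's own ISR-safe calls.

```python
import queue


class FakeUart:
    """Stand-in for a peripheral: an interrupt flag plus a data register."""

    def __init__(self):
        self.rx_flag = False
        self.rx_byte = 0

    def receive(self, byte: int) -> None:
        self.rx_byte = byte
        self.rx_flag = True


def uart_isr(dev: FakeUart, rx_queue: queue.Queue, stats: dict) -> bool:
    """Return True if a message was posted and a context switch should be requested."""
    byte = dev.rx_byte            # read the minimum hardware state
    dev.rx_flag = False           # clear the interrupt source
    try:
        rx_queue.put_nowait(byte)  # never block in interrupt context
    except queue.Full:
        stats["dropped"] += 1     # explicit loss policy: count and drop
        return False
    return True


dev = FakeUart()
rx_queue = queue.Queue(maxsize=2)
stats = {"dropped": 0}

switch_requests = []
for b in (0x41, 0x42, 0x43):      # three interrupts before the task runs
    dev.receive(b)
    switch_requests.append(uart_isr(dev, rx_queue, stats))

print(switch_requests)   # [True, True, False]
print(stats["dropped"])  # 1
```

The drop counter makes the loss visible. If it grows in the field, the backend task is not keeping up with the interrupt rate, and that is a design problem rather than something to hide by blocking in the handler.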

Common Misuses

First, using a semaphore as a mutex. Without ownership and priority inheritance, real-time latency can become unpredictable.

Second, using an event bit for events that must be processed one by one. Bits store current condition, not full history.
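A minimal host-side demonstration of that loss, using Python's `threading.Event` as a single bit and `threading.Semaphore` as a counter. The button-press scenario and the variable names are made up for the example.

```python
import threading

button_bit = threading.Event()          # one bit: "pressed at least once"
button_count = threading.Semaphore(0)   # counts every press

for _ in range(3):                      # three presses before the consumer runs
    button_bit.set()                    # setting a set bit changes nothing
    button_count.release()              # each release adds one to the count

bit_handled = 0
while button_bit.is_set():
    button_bit.clear()
    bit_handled += 1

count_handled = 0
while button_count.acquire(blocking=False):
    count_handled += 1

print(bit_handled)    # 1
print(count_handled)  # 3
```

Two of the three presses are invisible to the consumer of the bit. That is fine for "is the button currently relevant" and wrong for "handle every press".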

Third, sending large objects through queues. Copy cost, stack pressure, heap pressure, and full-queue policy all matter.

Fourth, holding a mutex while doing blocking I/O. The lock duration becomes unbounded.

Fifth, doing too much synchronization and business logic inside an ISR. Interrupt time grows and scheduling latency suffers.

Sixth, using a global variable plus a semaphore. The semaphore wakes the task, but the shared variable may still be unprotected.

Selection Order

Ask:

Need to transfer data to another task?
-> queue

Only need to signal an event or count resources?
-> semaphore

Need to protect shared state?
-> mutex

Need to wait for a combination of conditions?
-> event group

Only need a lightweight wakeup for one fixed task?
-> task notification

Then check:

  • can it be called from an ISR
  • is priority inheritance needed
  • can events be lost
  • must repeated events be handled individually
  • is data ownership clear
  • what happens on queue full or resource exhaustion
  • is there a timeout and fallback path
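The last check, a timeout with a fallback path, can be sketched as follows. This is a host-side model using `threading.Lock`; the function `update_config`, the `"applied"` and `"deferred"` results, and the timeout values are assumptions chosen for the example.

```python
import threading


def update_config(lock, config: dict, key: str, value, timeout_s: float = 0.05) -> str:
    """Try to apply an update; report a fallback result instead of waiting forever."""
    if not lock.acquire(timeout=timeout_s):
        return "deferred"          # fallback path: caller can retry or log
    try:
        config[key] = value        # short critical section
        return "applied"
    finally:
        lock.release()


config_lock = threading.Lock()
config = {"rate_hz": 10}

first = update_config(config_lock, config, "rate_hz", 20)

config_lock.acquire()              # simulate another task holding the lock
second = update_config(config_lock, config, "rate_hz", 50, timeout_s=0.01)
config_lock.release()

print(first, second, config)  # applied deferred {'rate_hz': 20}
```

A bounded wait turns a potential stall into a visible, countable event, which is much easier to debug in the field than a task that never returns.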

Synchronization primitives are not only about waiting. They express the relationships in the system.

Field Debugging Order

For RTOS stalls, lost events, or slow response, inspect:

which object the task is waiting on
-> whether it represents data, event, resource, state, or condition
-> whether the primitive is mismatched
-> whether ISR notification is correct
-> whether queues fill or signals are lost
-> whether priority inversion exists
-> whether locks are held across blocking operations
-> whether timeout and error handling exist

If a system can only be fixed by adding more semaphores, it usually means data ownership, event semantics, and shared state protection are not separated.

RTOS synchronization is not about API names. Use queues for data, semaphores for events and counts, mutexes for shared state, event groups for condition combinations, and task notifications for lightweight direct wakeups.