RTOS | IoT Worker

Why RTOS Ticks, Software Timers, and Timeouts Are Not Real Time

6 minutes

RTOS code often contains calls like:

vTaskDelay(1);
xQueueReceive(q, &msg, 10);
xTimerStart(timer, 0);

They all look like “wait for some time”. But RTOS time is usually not continuous, not precise, and not acted on immediately. Many failures come from treating ticks, delays, timeouts, and software timers as precise real-time behavior.

A safer model is:

hardware clock generates tick
-> RTOS updates tick count
-> expired tasks or timers become ready/pending
-> scheduler decides when code actually runs
-> low power and interrupt masking change this path

So a 10 ms delay is closer to “do not run before the wake-up tick boundary, and then wait to be scheduled” than to “this code will run exactly 10 ms later”.
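The tick quantization described above can be made concrete with a little arithmetic. The sketch below is plain C, not RTOS code. `ms_to_ticks` and `delay_window` are hypothetical helpers. It assumes a truncating millisecond-to-tick conversion (as FreeRTOS's `pdMS_TO_TICKS` does), a nonzero tick rate, and a delay of N ticks that expires on the Nth tick interrupt after the call. Scheduling latency is not modeled and comes on top of the window.

```c
#include <stdint.h>

/* Hypothetical helper: convert milliseconds to ticks with truncation,
   the way a macro such as pdMS_TO_TICKS does. */
uint32_t ms_to_ticks(uint32_t ms, uint32_t tick_rate_hz)
{
    return (uint32_t)(((uint64_t)ms * tick_rate_hz) / 1000u);
}

typedef struct {
    uint32_t min_us; /* earliest point at which the task can become ready */
    uint32_t max_us; /* latest point, before any scheduling latency */
} delay_window_t;

/* Assumption: the call lands somewhere inside a tick period, so the first
   counted tick may arrive almost immediately or almost a full period later. */
delay_window_t delay_window(uint32_t ticks, uint32_t tick_rate_hz)
{
    uint32_t tick_us = 1000000u / tick_rate_hz;
    delay_window_t w;
    w.min_us = (ticks > 0u) ? (ticks - 1u) * tick_us : 0u;
    w.max_us = ticks * tick_us;
    return w;
}
```

Under these assumptions, a 10-tick delay at a 1000 Hz tick becomes ready somewhere between 9 ms and 10 ms after the call, and at a 100 Hz tick a requested 5 ms converts to 0 ticks.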

How to Choose RTOS Queues, Semaphores, Mutexes, and Event Groups

6 minutes

RTOSes provide many synchronization primitives: queues, semaphores, mutexes, event groups, and task notifications. They can all appear to mean “make one task wait for another”, so they are easy to mix up.

Mixing them up may work briefly, but the resulting field failures are hard to reason about:

events are occasionally lost
a high-priority task is delayed by a low-priority task
a full queue stalls the system
ISR code calls an API that can block
combined conditions are checked incorrectly
data is updated, but the consumer sees stale state
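One possible way the first failure in this list arises is a latching flag used where a count is needed. The sketch below is a single-threaded simulation in plain C, not RTOS code, and every name in it is hypothetical. It assumes the producer signals several times before the consumer gets to run.

```c
#include <stdbool.h>
#include <stdint.h>

/* A flag that only remembers "something happened". */
typedef struct { bool pending; } binary_signal_t;
/* A counter that remembers how many times it happened. */
typedef struct { uint32_t count; } counting_signal_t;

void binary_give(binary_signal_t *s) { s->pending = true; }

bool binary_take(binary_signal_t *s)
{
    bool was_pending = s->pending;
    s->pending = false;
    return was_pending;
}

void counting_give(counting_signal_t *s) { s->count++; }

bool counting_take(counting_signal_t *s)
{
    if (s->count == 0u) {
        return false;
    }
    s->count--;
    return true;
}

/* Signal `events` times before the consumer runs, then let the consumer
   drain the signal and report how many events it actually handled. */
uint32_t binary_handled(uint32_t events)
{
    binary_signal_t s = { false };
    for (uint32_t i = 0; i < events; i++) {
        binary_give(&s);
    }
    uint32_t handled = 0;
    while (binary_take(&s)) {
        handled++;
    }
    return handled;
}

uint32_t counting_handled(uint32_t events)
{
    counting_signal_t s = { 0u };
    for (uint32_t i = 0; i < events; i++) {
        counting_give(&s);
    }
    uint32_t handled = 0;
    while (counting_take(&s)) {
        handled++;
    }
    return handled;
}
```

With three events signaled back to back, the flag version handles one and the counting version handles three. Whether a lost count matters depends on what the consumer does with each event.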

The first selection model is:

Why RTOS Task Stacks Often Cause Field Issues

7 minutes

When an RTOS device resets in the field, hits a random HardFault, jumps into invalid code, or corrupts queue data, engineers often suspect pointers, concurrency, peripherals, or power.

Those are valid suspects, but one common cause is simpler: a task stack is too small.

Unlike Linux processes, RTOS tasks usually have neither a large private address space nor strong isolation. Each task owns a stack region whose size is often fixed at task creation. Make it too large and RAM is wasted. Make it too small and the failure may not happen immediately; the stack may silently overwrite nearby memory.
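A common way to measure how close a task has come to its stack limit is to paint the stack with a known pattern and later count how much of the pattern survives. The sketch below simulates that idea in plain C on an ordinary array. The function names and the 32-bit fill value are assumptions for illustration, and the stack is assumed to grow downward from the highest index. It is similar in spirit to the high-water-mark check FreeRTOS exposes as `uxTaskGetStackHighWaterMark`, but it is not that implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u /* assumed fill pattern */

/* Paint the whole stack region before the task first runs. */
void stack_paint(uint32_t *stack, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Count untouched words from the low end, which is the far end of a
   descending stack. A small result means the task has been close to overflow. */
size_t stack_unused_words(const uint32_t *stack, size_t words)
{
    size_t n = 0;
    while (n < words && stack[n] == STACK_FILL) {
        n++;
    }
    return n;
}

/* Simulation only: pretend the task used `used` words from the top down. */
void stack_simulate_use(uint32_t *stack, size_t words, size_t used)
{
    if (used > words) {
        used = words;
    }
    for (size_t i = 0; i < used; i++) {
        stack[words - 1u - i] = (uint32_t)i;
    }
}
```

A measurement like this reports only the deepest use seen so far, so it is a lower bound on what the task may need on a rarely taken code path.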

What Happens From Power-On to Application Start?

5 minutes

When a device boots slowly, an application does not start, a driver does not load, or a network service fails, people often jump straight to application logs.

But the application is only the last link in the boot chain. After power-on, the CPU does not jump directly to business logic. It starts from a fixed entry point, initializes the minimal hardware environment, finds the next-stage image, loads an OS or RTOS, initializes memory, devices, and scheduling, and only then reaches the application.

RTOS vs Linux Is Not Just About Size

7 minutes

When comparing an RTOS and Linux, people often start with an intuitive difference: an RTOS is small, Linux is large.

That is true, but too coarse. What affects engineering choices and debugging is not binary size alone. It is the different problems they are designed to solve by default.

An RTOS is common in resource-constrained devices with clear response paths and fixed control cycles. Linux is common when resources are richer and the system needs process isolation, network stacks, filesystems, complex drivers, and application ecosystems.

What Does a Context Switch Actually Switch?

8 minutes

“Switch to another task” sounds light, as if the CPU simply moves from one piece of code to another.

The real operation is more concrete. While the CPU is running an execution flow, registers contain intermediate state, the stack contains the call chain, the program counter points to the next instruction, and the scheduler knows whether the flow is running, ready, or blocked. To run another thread or task, the system must save the current state and restore another one.
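The save-and-restore step described above can be sketched as a data-structure exercise. This is a plain C simulation, not a real port layer: the register count, the field names, and `context_switch` itself are hypothetical, and a real switch is done in architecture-specific assembly that also deals with details omitted here.

```c
#include <stdint.h>

typedef enum { TASK_READY, TASK_RUNNING, TASK_BLOCKED } task_state_t;

/* Simplified CPU state: a few general registers, stack pointer,
   program counter. */
typedef struct {
    uint32_t regs[4];
    uint32_t sp;
    uint32_t pc;
} cpu_context_t;

/* What the scheduler keeps per task: the saved context and its state. */
typedef struct {
    cpu_context_t saved;
    task_state_t state;
} task_t;

/* Save the live CPU state into `from`, then load the state saved in `to`. */
void context_switch(cpu_context_t *cpu, task_t *from, task_t *to)
{
    from->saved = *cpu;
    if (from->state == TASK_RUNNING) {
        from->state = TASK_READY; /* a blocked task stays blocked */
    }
    *cpu = to->saved;
    to->state = TASK_RUNNING;
}
```

After the call, the simulated CPU resumes at the program counter and stack pointer that `to` had when it was last switched out, and `from` can later continue from exactly where it stopped.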

Why the Scheduler Decides System Responsiveness

9 minutes

Many performance problems first look like “the CPU is too slow”: a button responds late, UART handling misses data, network packets pile up, the UI stalls, or sensor data reaches the application thread late.

But CPU load is only one layer. What often decides responsiveness is when the code gets to run.

Code existing in memory does not mean it can run at any moment. It may not be ready yet. It may be waiting for a lock, queue, or I/O. It may be blocked behind a higher-priority task. It may have just been woken by an interrupt but not yet selected by the scheduler.
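A minimal sketch of that selection step, assuming a fixed-priority policy in which a larger number means higher priority (the convention FreeRTOS uses) and ties go to the earlier table entry. The types and `pick_next` are hypothetical. A real scheduler also handles time slicing, wake-up bookkeeping, and interrupt-driven rescheduling.

```c
#include <stddef.h>

typedef enum { SCHED_READY, SCHED_BLOCKED } sched_state_t;

typedef struct {
    const char *name;
    unsigned priority;   /* assumption: larger value = higher priority */
    sched_state_t state;
} sched_task_t;

/* Return the index of the highest-priority ready task, or -1 if none is
   ready. Blocked tasks are never chosen, whatever their priority. */
int pick_next(const sched_task_t *tasks, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].state != SCHED_READY) {
            continue;
        }
        if (best < 0 || tasks[i].priority > tasks[(size_t)best].priority) {
            best = (int)i;
        }
    }
    return best;
}
```

In this model, a high-priority task that is blocked on a queue does not run no matter how idle the CPU is, and a lower-priority ready task runs only while nothing more urgent is ready.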

What Is the Difference Between a Process, a Thread, and an RTOS Task?

9 minutes

Many concurrency bugs start with one overloaded word: task.

On Linux, people say “start a process.” In application frameworks, they say “create a thread.” In an RTOS, they often say “create a task.” All three sound like “make some code run at the same time,” but their resource boundaries are very different.

If they are understood only as “units of code execution,” engineering judgment quickly goes wrong. Why can one thread crash an entire process? Why can two processes not directly access each other’s variables? Why is sharing global variables between RTOS tasks so common? Why do Linux process switches and thread switches not cost exactly the same?