Why Embedded Linux Images Split boot, rootfs, and data

An embedded Linux device that can boot is not necessarily ready for product deployment.

During development, putting the bootloader, kernel, dtb, rootfs, application, and data together in one image may work. The problems appear later, during updates, factory reset, abnormal power loss, partition damage, and field repair:

  • a rootfs update overwrites user data
  • the device tree changes but the bootloader still loads the old dtb
  • a full data partition breaks system services
  • rootfs is damaged and there is no recovery path
  • power loss during OTA leaves no bootable system
  • factory reset removes too much or too little

The point of partition layout is not simply “more partitions”. It separates the boot chain from data lifetimes:

bootloader: earliest startup and image selection
boot: kernel, dtb, initramfs, and boot artifacts
rootfs: system baseline
data: device runtime state and user data
factory: calibration, identity, and data that must not be casually overwritten
recovery: rescue path when the main system is broken
A/B: rollback and anti-bricking during updates

A good layout tells the system what can be replaced, what must be preserved, what can be cleared, and what is used for rescue.

The boot Partition Holds Startup Artifacts

The boot partition often stores the direct artifacts needed to start Linux:

  • kernel image
  • dtb / dtbo
  • initramfs
  • boot script
  • boot configuration
  • signatures or hashes

Some platforms place kernel and dtb in separate partitions, some use a FAT or ext4 boot partition, and some have the bootloader read raw flash offsets. The form differs, but the boundary is the same: boot holds the handoff from bootloader to the Linux kernel.

Most common boot partition issues are version mismatches between artifacts that were built or updated separately:

  • kernel updated but dtb is old
  • rootfs needs new kernel modules but boot still has the old kernel
  • bootloader root= points to the old rootfs
  • dtb partition descriptions do not match the real partition table
  • secure boot signs only the kernel but not the dtb or configuration

So boot is not just a place for one kernel file. It is a set of artifacts that must agree.
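
One way to make "a set of artifacts that must agree" checkable is to ship a manifest with the boot set and compare it against what is actually on the partition. The sketch below assumes a hypothetical manifest that maps artifact file names to SHA-256 digests; the file names and the manifest format are illustrative and not taken from any particular bootloader or build system.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_boot_set(boot_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Compare every artifact named in the manifest with the file on disk.

    `manifest` is a hypothetical {file name: expected sha256 hex} mapping.
    Returns a list of problems; an empty list means the boot set matches.
    """
    problems = []
    for name, expected in manifest.items():
        artifact = boot_dir / name
        if not artifact.is_file():
            problems.append(f"{name}: missing")
        elif sha256_of(artifact) != expected:
            problems.append(f"{name}: hash mismatch")
    return problems
```

A check like this catches the "kernel updated but dtb is old" case only if the kernel, dtb, initramfs, and boot configuration are all listed in the same manifest and released as one unit. It is not a substitute for signature verification in a secure boot chain.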

rootfs Is the System Baseline

rootfs contains the user-space system:

  • init or systemd
  • dynamic linker and shared libraries
  • system tools
  • service units or init scripts
  • udev rules
  • default configuration
  • applications

Product devices often make rootfs read-only because it represents the system baseline for a firmware version. Updating rootfs means replacing that baseline.

rootfs should not contain field state that cannot be lost, such as user configuration, business databases, device keys, or long-term logs. Otherwise a system update becomes a data migration problem, and factory reset has no clear boundary.

A practical rule is:

rootfs can be replaced
data should be preserved
factory should be written carefully
tmpfs can be discarded

The data Partition Stores Device State

The data partition usually stores information created after deployment that must survive reboot and update:

  • user configuration
  • application databases
  • collected data
  • logs
  • downloaded update packages
  • runtime statistics
  • user certificates
  • network configuration

data is where many field problems happen because it is writable, must tolerate power loss, and is consumed over time by logs and databases.

Design questions include:

  • what happens when the partition is full
  • whether logs are rotated and limited
  • whether databases have transactions and recovery
  • whether configuration writes are atomic
  • whether failed OTA downloads are cleaned up
  • which directories factory reset removes
  • which data migrates across versions and which can be rebuilt

Without a data boundary, applications write state into rootfs, and update, reset, and power-loss issues become entangled.
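
The "atomic configuration write" question above has a common answer on POSIX filesystems: write a temporary file in the same directory, flush it to storage, then rename it over the original. The following is a minimal sketch of that pattern; the function name is illustrative, and how much the final directory fsync guarantees depends on the filesystem and the storage hardware.

```python
import os
import tempfile


def write_atomic(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a reader sees the old or the new file,
    never a half-written one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push file contents to storage first
        os.replace(tmp_path, path)   # atomic rename on POSIX
    except BaseException:
        try:
            os.unlink(tmp_path)      # do not leave partial files behind
        except FileNotFoundError:
            pass
        raise
    # Sync the directory so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that `mkstemp` creates the file with owner-only permissions, so a real implementation may need to restore the intended mode before the rename. Leftover `.tmp-` files from a power loss before the rename can be removed at boot.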

factory Is Not Ordinary data

The factory partition usually stores information that should not be casually rewritten after manufacturing:

  • device serial number
  • MAC addresses
  • calibration parameters
  • production test results
  • device certificates or key indexes
  • hardware revision data

The difference is simple: data is runtime state; factory is device identity and manufacturing baseline.

factory writes should be tightly controlled. Applications should not store ordinary configuration, logs, or caches there. OTA should not overwrite it casually. A software bug that damages identity, calibration, or credentials can make a device hard to recover.

Many systems mount factory read-only, or allow writes only through production, repair, or dedicated tools under controlled procedures.
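
A service can verify at startup that the factory partition really is mounted read-only by inspecting the mount table. This sketch parses text in the `/proc/mounts` format; the `/factory` mount point used in the usage note is an assumption, not a convention.

```python
from typing import Optional


def mount_options(mounts_text: str, mount_point: str) -> Optional[set[str]]:
    """Return the mount options for `mount_point`, or None if it is not mounted.

    Each /proc/mounts line is: device mount_point fstype options dump pass.
    If a mount point appears more than once, the last entry wins because a
    later mount shadows an earlier one.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = set(fields[3].split(","))
    return found


def is_read_only(mounts_text: str, mount_point: str) -> bool:
    """True only if the mount point exists and carries the `ro` option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options
```

On a device this could be called as `is_read_only(open("/proc/mounts").read(), "/factory")`. A read-only mount only guards against accidental writes through the filesystem; it does not stop a tool that writes to the block device directly.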

recovery Is the Entry Point When the Main System Fails

The value of recovery is that the device still has a minimal environment when the main system cannot boot.

It can be used to:

  • rewrite rootfs
  • repair data
  • factory reset
  • accept USB, network, or serial rescue commands
  • verify and roll back damaged images
  • report failure state

Recovery does not have to be a full Linux system. It can be a bootloader recovery mode, initramfs, or a small independent rootfs.

The key is that recovery must not depend on the damaged main rootfs. Otherwise the rescue path fails together with the system it is meant to rescue.

A/B Slots Handle Interrupted Updates

The biggest OTA risk is power loss or write failure during an update. If there is only one rootfs and the update overwrites it in place, failure can leave the device without a bootable system.

A/B layout keeps two bootable systems:

slot A: boot_a + rootfs_a
slot B: boot_b + rootfs_b
data: shared or migrated by policy

If the device currently boots from A, the update writes B. After the write completes and is verified, the next boot target becomes B. If B boots and reports healthy, B is marked good. If B fails to boot, or never reports healthy within the allowed number of attempts, the bootloader rolls back to A.

The key is not simply having two rootfs copies. It is the state machine:

  • current slot
  • target slot
  • remaining boot attempts
  • boot success confirmation
  • rollback condition
  • data schema migration strategy

A/B without a state machine is only duplicated files, not real anti-bricking.
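
The state items listed above can be sketched as a small model. This is not the state machine of any specific bootloader or OTA framework; the field names, the default of three attempts, and the split into updater, bootloader, and user-space steps are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SlotState:
    active: str = "A"       # slot the bootloader will try next
    fallback: str = "A"     # last slot confirmed good
    tries_left: int = 0     # remaining attempts for an unconfirmed slot
    confirmed: bool = True  # has the active slot reported healthy?


def other(slot: str) -> str:
    return "B" if slot == "A" else "A"


def stage_update(state: SlotState, max_tries: int = 3) -> SlotState:
    """Updater step: call only after the inactive slot is written and verified."""
    return SlotState(other(state.fallback), state.fallback, max_tries, False)


def select_boot_slot(state: SlotState) -> tuple[str, SlotState]:
    """Bootloader step on every boot: returns (slot to boot, state to persist)."""
    if state.confirmed:
        return state.active, state
    if state.tries_left > 0:
        # Spend one attempt before jumping into the unconfirmed slot.
        spent = SlotState(state.active, state.fallback, state.tries_left - 1, False)
        return state.active, spent
    # Attempts exhausted without confirmation: roll back to the known-good slot.
    rolled_back = SlotState(state.fallback, state.fallback, 0, True)
    return rolled_back.active, rolled_back


def confirm_boot(state: SlotState, booted: str) -> SlotState:
    """User-space step: call once health checks pass on the running slot."""
    if booted != state.active:
        return state  # running the fallback slot, nothing to confirm
    return SlotState(booted, booted, 0, True)
```

The important property is that the attempt counter is decremented and persisted before the new slot runs, so a slot that hangs early still consumes an attempt. The persisted state itself must be written in a power-loss-safe way, and data schema migration is outside this sketch.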

Layout Must Match Bootloader, Kernel, and Device Tree

Image layout is not a standalone table. It is referenced by many places:

  • bootloader environment
  • kernel command line
  • device-tree fixed-partitions
  • initramfs scripts
  • rootfs /etc/fstab
  • systemd mount units
  • OTA tool configuration
  • production flashing scripts

If one of these disagrees, failures are hard to debug:

  • the bootloader selects or writes a new partition but the kernel mounts the old one
  • dtb offset does not match the actual image
  • /etc/fstab mounts a missing label
  • OTA writes slot B but the bootloader still boots A
  • recovery repairs a new image using an old partition table

Product images should have one trusted layout description that generates bootloader parameters, device tree, fstab, OTA configuration, and flashing scripts. Handwriting each copy invites drift.
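
As a rough illustration of "one trusted layout description", the sketch below renders both an fstab and a kernel `root=` argument from a single table. The partition names, mount points, and options are hypothetical, and it assumes the same string is used as the filesystem label and as the GPT partition name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Partition:
    label: str                  # filesystem label and partition name (assumed equal)
    fstype: str
    mount_point: Optional[str]  # None: not mounted by the running system
    options: str = "defaults"


# Hypothetical layout; every value here is illustrative only.
LAYOUT = [
    Partition("rootfs_a", "ext4", "/", "ro"),
    Partition("data", "ext4", "/data", "rw,noatime"),
    Partition("factory", "ext4", "/factory", "ro"),
    Partition("recovery", "ext4", None),
]


def render_fstab(layout: list[Partition]) -> str:
    """Emit one fstab line per mounted partition."""
    lines = []
    for p in layout:
        if p.mount_point is None:
            continue
        passno = 1 if p.mount_point == "/" else 2  # fsck order: root first
        lines.append(
            f"LABEL={p.label} {p.mount_point} {p.fstype} {p.options} 0 {passno}"
        )
    return "\n".join(lines) + "\n"


def render_root_arg(layout: list[Partition]) -> str:
    """Emit the kernel command-line root= argument for the root partition."""
    root = next(p for p in layout if p.mount_point == "/")
    return f"root=PARTLABEL={root.label}"
```

The same table could also feed device-tree partition nodes, OTA tool configuration, and flashing scripts, so that a renamed or resized partition changes in one place. An A/B system would render the root argument per slot rather than hardcoding `rootfs_a`.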

Factory Reset Is Not Formatting the Whole Disk

Factory reset should return the device to a usable default state. It should not blindly wipe every partition.

It commonly removes:

  • user configuration
  • application databases
  • caches
  • logs
  • downloaded update packages
  • runtime state

It usually should not remove:

  • bootloader
  • current bootable system
  • factory partition
  • device identity
  • calibration parameters
  • required certificates and keys

If configuration, logs, databases, certificates, and device identity are mixed in one writable root directory, factory reset becomes dangerous. Clear partition and directory boundaries make the reset list explicit.
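
With clear directory boundaries, the reset list can be written down as data instead of being implied by a format command. The sketch below empties only an explicit list of subdirectories under the data mount point; the directory names are hypothetical.

```python
import shutil
from pathlib import Path

# Hypothetical directory names under the data mount point.
RESET_DIRS = ("config", "db", "cache", "log", "ota")


def factory_reset(data_root: Path, reset_dirs=RESET_DIRS) -> list[str]:
    """Empty only the listed subdirectories of `data_root`.

    Anything not listed (for example a certificates directory) is left
    untouched. Returns the names of the directories that were cleared.
    """
    cleared = []
    for name in reset_dirs:
        target = data_root / name
        if target.is_symlink() or not target.is_dir():
            continue  # never follow a symlink out of the data partition
        for child in target.iterdir():
            if child.is_dir() and not child.is_symlink():
                shutil.rmtree(child)
            else:
                child.unlink()
        cleared.append(name)
    return cleared
```

Using an allowlist means a newly added directory survives reset until someone decides it should not, which is usually the safer default for identity and credential data. A production reset would also need to be safe against power loss partway through, for example by recording a "reset pending" marker first.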

A Practical Layout Model

Platforms differ, but this model is useful:

bootloader / bootloader-env
boot_a
rootfs_a
boot_b
rootfs_b
data
factory
recovery

Small devices can simplify it: boot, rootfs, data, and factory may be enough without A/B. High-reliability devices may add recovery, logs, cache, metadata, verity hashes, or key partitions.

The number of partitions is not the goal. The goal is to make every data lifetime clear:

  • replaced with firmware
  • preserved with the device
  • clearable
  • rebuildable
  • used for rescue
  • used for identity and calibration

Debugging Layout Problems

For startup, update, or recovery failures, inspect this chain:

what the bootloader actually loads
-> where kernel cmdline root= points
-> whether dtb partition descriptions match
-> whether rootfs and kernel/dtb are the same version
-> whether data is mounted in the expected place
-> whether factory was accidentally written
-> whether recovery can boot independently
-> what slot the OTA state machine selects

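Useful sources on the running device include the following (fw_printenv applies to U-Boot environments, journalctl to systemd-based systems):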

cat /proc/cmdline
cat /proc/mounts
lsblk
blkid
findmnt
fw_printenv
journalctl -b

On MTD/NAND systems, also inspect /proc/mtd, UBI volumes, bootloader offsets, and device-tree fixed-partitions.
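
When these checks are scripted, for example in a diagnostics tool, the raw text needs light parsing. The sketch below handles the kernel command line and the `/proc/mtd` table. It is deliberately simple: command-line values containing quoted spaces are not handled, and the function names are illustrative.

```python
def parse_cmdline(text: str) -> dict[str, str]:
    """Split a kernel command line into key/value pairs.

    Flags without '=' map to an empty string; later duplicates win.
    Only the first '=' separates key and value, so root=PARTUUID=... works.
    """
    params = {}
    for token in text.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def parse_proc_mtd(text: str) -> list[dict]:
    """Parse /proc/mtd lines such as: mtd0: 00100000 00020000 "boot"."""
    partitions = []
    for line in text.splitlines()[1:]:  # skip the header line
        if not line.strip():
            continue
        dev, size, erasesize, name = line.split(None, 3)
        partitions.append({
            "dev": dev.rstrip(":"),
            "size": int(size, 16),            # sizes are printed in hex
            "erasesize": int(erasesize, 16),
            "name": name.strip().strip('"'),
        })
    return partitions
```

For instance, `parse_cmdline(open("/proc/cmdline").read())["root"]` gives the root device the kernel was actually told to use, which can then be compared with the slot the OTA state expects.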

Image Layout Is the Skeleton of Product Reliability

Names like boot, rootfs, data, factory, recovery, and A/B look like storage details, but they define product behavior.

How the system boots, how a failed update rolls back, how user data is preserved, how factory identity is protected, how a broken main system is rescued, and what factory reset removes are all decided by these boundaries.

The earlier an embedded Linux image layout is designed, the less OTA, power-loss recovery, repair, and field debugging depend on guesswork.