U-Boot Standard Boot

Introduction

Standard boot provides a built-in way for U-Boot to automatically boot an Operating System without custom scripting and other customisation. It introduces the following concepts:

  • bootdev - a device which can hold or access a distro (e.g. MMC, Ethernet)

  • bootmeth - a method to scan a bootdev to find bootflows (e.g. distro boot)

  • bootflow - a description of how to boot (provided by the distro)

For Linux, the distro (Linux distribution, e.g. Debian, Fedora) is responsible for creating a bootflow for each kernel combination that it wants to offer. These bootflows are stored on media so they can be discovered by U-Boot. This feature is typically called distro boot (see Generic Distro Configuration Concept) because it is a way for distributions to boot on any hardware.

Traditionally U-Boot has relied on scripts to implement this feature. See distro_bootcmd for details. This is done because U-Boot has no native support for scanning devices. While the scripts work remarkably well, they can be hard to understand and extend, and the feature does not include tests. They also make it difficult to move away from ad-hoc CONFIGs, since they are implemented using the environment and a lot of #defines.

Standard boot is a generalisation of distro boot. It provides a more built-in way to boot with U-Boot. The feature is extensible to different Operating Systems (such as Chromium OS) and devices (beyond just block and network devices). It supports EFI boot and EFI bootmgr too.

Finally, standard boot supports the operation of Verified Boot for Embedded (VBE).

Bootflow

A bootflow is a file that describes how to boot a distro. Conceptually there can be different formats for that file but at present U-Boot only supports the BootLoaderSpec format, which looks something like this:

menu autoboot Welcome to Fedora-Workstation-armhfp-31-1.9. Automatic boot in # second{,s}. Press a key for options.
menu title Fedora-Workstation-armhfp-31-1.9 Boot Options.
menu hidden

label Fedora-Workstation-armhfp-31-1.9 (5.3.7-301.fc31.armv7hl)
    kernel /vmlinuz-5.3.7-301.fc31.armv7hl
    append ro root=UUID=9732b35b-4cd5-458b-9b91-80f7047e0b8a rhgb quiet LANG=en_US.UTF-8 cma=192MB cma=256MB
    fdtdir /dtb-5.3.7-301.fc31.armv7hl/
    initrd /initramfs-5.3.7-301.fc31.armv7hl.img

As you can see it specifies a kernel, a ramdisk (initrd) and a directory from which to load devicetree files. The details are described in distro_bootcmd.

The bootflow is provided by the distro. It is not part of U-Boot. U-Boot’s job is simply to interpret the file and carry out the instructions. This allows distros to boot on essentially any device supported by U-Boot.

Typically the first available bootflow is selected and booted. If that fails, then the next one is tried.

Bootdev

Where does U-Boot find the media that holds the operating systems? That is the job of bootdev. A bootdev is simply a layer on top of a media device (such as MMC, NVMe). The bootdev accesses the device, including partitions and filesystems that might contain things related to an operating system.

For example, an MMC bootdev provides access to the individual partitions on the MMC device. It scans through these to find filesystems, then provides a list of these for consideration.

Bootmeth

Once the list of filesystems is provided, how does U-Boot find the bootflow files in these filesystems? That is the job of bootmeth. Each boot method has its own way of doing this.

For example, the distro bootmeth simply looks through the provided filesystem for a file called extlinux/extlinux.conf. This file constitutes a bootflow. If the distro bootmeth is used on multiple partitions it may produce multiple bootflows.

Note: it is possible to have a bootmeth that uses a partition or a whole device directly, but it is more common to use a filesystem.

Note that some bootmeths are ‘global’, meaning that they select the bootdev themselves. Examples include VBE and EFI boot manager. In this case, they provide a read_bootflow() method which checks whatever bootdevs it likes, then returns the bootflow, if found. Some of these bootmeths may be very slow, if they scan a lot of devices.

Boot process

U-Boot tries to use the ‘lazy init’ approach wherever possible and distro boot is no exception. The algorithm is:

while (get next bootdev)
   while (get next bootmeth)
       while (get next bootflow)
           try to boot it

So U-Boot works its way through the bootdevs, trying each bootmeth in turn to obtain bootflows, until it either boots or exhausts the available options.

Instead of 500 lines of #defines and a 4KB boot script, all that is needed is the following command:

bootflow scan -lb

which scans for available bootflows, optionally listing each one it finds (-l) and trying to boot it (-b).

When global bootmeths are available, these are typically checked before the above bootdev scanning.

Controlling ordering

Several options are available to control the ordering of boot scanning:

boot_targets

This environment variable can be used to control the list of bootdevs searched and their ordering, for example:

setenv boot_targets "mmc0 mmc1 usb pxe"

Entries may be removed or re-ordered in this list to affect the boot order. If the variable is empty, the default ordering is used, based on the priority of bootdevs and their sequence numbers.

bootmeths

This environment variable can be used to control the list of bootmeths used and their ordering, for example:

setenv bootmeths "syslinux efi"

Entries may be removed or re-ordered in this list to affect the order the bootmeths are tried on each bootdev. If the variable is empty, the default ordering is used, based on the bootmeth sequence numbers, which can be controlled by aliases.

The bootmeth command (bootmeth order) operates in the same way as setting this variable.

Bootdev uclass

The bootdev uclass provides a simple API call to obtain a bootflow from a device:

int bootdev_get_bootflow(struct udevice *dev, struct bootflow_iter *iter,
                         struct bootflow *bflow);

This takes an iterator which indicates the bootdev, partition and bootmeth to use. It returns a bootflow. This is the core of the bootdev implementation. The bootdev drivers that implement this differ depending on the media they are reading from, but each is responsible for returning a valid bootflow if available.

A helper called bootdev_find_in_blk() makes it fairly easy to implement this function for each media device uclass, in a few lines of code.

Bootdev drivers

A bootdev driver is typically fairly simple. Here is one for mmc:

static int mmc_get_bootflow(struct udevice *dev, struct bootflow_iter *iter,
                struct bootflow *bflow)
{
    struct udevice *mmc_dev = dev_get_parent(dev);
    struct udevice *blk;
    int ret;

    ret = mmc_get_blk(mmc_dev, &blk);
    /*
     * If there is no media, indicate that no more partitions should be
     * checked
     */
    if (ret == -EOPNOTSUPP)
        ret = -ESHUTDOWN;
    if (ret)
        return log_msg_ret("blk", ret);
    assert(blk);
    ret = bootdev_find_in_blk(dev, blk, iter, bflow);
    if (ret)
        return log_msg_ret("find", ret);

    return 0;
}

static int mmc_bootdev_bind(struct udevice *dev)
{
    struct bootdev_uc_plat *ucp = dev_get_uclass_plat(dev);

    ucp->prio = BOOTDEVP_0_INTERNAL_FAST;

    return 0;
}

struct bootdev_ops mmc_bootdev_ops = {
    .get_bootflow    = mmc_get_bootflow,
};

static const struct udevice_id mmc_bootdev_ids[] = {
    { .compatible = "u-boot,bootdev-mmc" },
    { }
};

U_BOOT_DRIVER(mmc_bootdev) = {
    .name        = "mmc_bootdev",
    .id        = UCLASS_BOOTDEV,
    .ops        = &mmc_bootdev_ops,
    .bind        = mmc_bootdev_bind,
    .of_match    = mmc_bootdev_ids,
};

The implementation of the get_bootflow() method is simply to obtain the block device and call a bootdev helper function to do the rest. The implementation of bootdev_find_in_blk() checks the partition table, and attempts to read a file from a filesystem on the partition number given by the @iter->part parameter.

Each bootdev has a priority, which indicates the order in which it is used. Faster bootdevs are used first, since they are more likely to be able to boot the device quickly.
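As a rough illustration of this ordering, the following standalone C sketch sorts a list of bootdevs by priority and then by sequence number. It assumes that a lower priority value means "use earlier", which the name BOOTDEVP_0_INTERNAL_FAST suggests but this text does not state. The struct sketch_bootdev type, its fields and the priority numbers are hypothetical and are not U-Boot's actual data structures.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a bootdev; not the real U-Boot structure */
struct sketch_bootdev {
	const char *name;	/* label used only for illustration */
	int prio;		/* assumed: lower value is tried first */
	int seq;		/* sequence number, used to break ties */
};

/* Compare by priority first, then by sequence number */
static int sketch_bootdev_cmp(const void *a, const void *b)
{
	const struct sketch_bootdev *x = a;
	const struct sketch_bootdev *y = b;

	if (x->prio != y->prio)
		return x->prio < y->prio ? -1 : 1;
	if (x->seq != y->seq)
		return x->seq < y->seq ? -1 : 1;

	return 0;
}

/* Put the bootdevs into the order in which they would be scanned */
void sketch_sort_bootdevs(struct sketch_bootdev *devs, size_t count)
{
	qsort(devs, count, sizeof(*devs), sketch_bootdev_cmp);
}
```

Calling sketch_sort_bootdevs() on an array holding a slow device and two fast devices moves the two fast devices to the front, ordered by their sequence numbers.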

Device hierarchy

A bootdev device is a child of the media device. In this example, you can see that the bootdev is a sibling of the block device and both are children of the media device:

mmc           0  [ + ]   bcm2835-sdhost        |   |-- mmc@7e202000
blk           0  [ + ]   mmc_blk               |   |   |-- mmc@7e202000.blk
bootdev       0  [   ]   mmc_bootdev           |   |   `-- mmc@7e202000.bootdev
mmc           1  [ + ]   sdhci-bcm2835         |   |-- sdhci@7e300000
blk           1  [   ]   mmc_blk               |   |   |-- sdhci@7e300000.blk
bootdev       1  [   ]   mmc_bootdev           |   |   `-- sdhci@7e300000.bootdev

The bootdev device is typically created automatically in the media uclass’ post_bind() method by calling bootdev_setup_for_dev(). The code typically looks something like this:

ret = bootdev_setup_for_dev(dev, "eth_bootdev");
if (ret)
    return log_msg_ret("bootdev", ret);

Here, eth_bootdev is the name of the Ethernet bootdev driver and dev is the Ethernet device. This function is safe to call even if standard boot is not enabled, since it does nothing in that case. It can be added to all uclasses which implement suitable media.

The bootstd device

Standard boot requires a single instance of the bootstd device to make things work. This includes global information about the state of standard boot. See struct bootstd_priv for this structure, accessed with bootstd_get_priv().

Within the devicetree, if you add bootmeth devices, they should be children of the bootstd device. See arch/sandbox/dts/test.dts for an example of this.

Automatic devices

It is possible to define all the required devices in the devicetree manually, but it is not necessary. The bootstd uclass includes a dm_scan_other() function which creates the bootstd device if not found. If no bootmeth devices are found at all, it creates one for each available bootmeth driver.

If your devicetree has any bootmeth device it must have all of them that you want to use, since no bootmeth devices will be created automatically in that case.

Using devicetree

If a bootdev is complicated or needs configuration information, it can be added to the devicetree as a child of the media device. For example, imagine a bootdev which reads a bootflow from SPI flash. The devicetree fragment might look like this:

spi@0 {
    flash@0 {
        reg = <0>;
        compatible = "spansion,m25p16", "jedec,spi-nor";
        spi-max-frequency = <40000000>;

        bootdev {
            compatible = "u-boot,sf-bootdev";
            offset = <0x2000>;
            size = <0x1000>;
        };
    };
};

The sf-bootdev driver can implement a way to read from the SPI flash, using the offset and size provided, and return that bootflow file back to the caller. When distro boot wants to read the kernel it calls distro_getfile() which must provide a way to read from the SPI flash. See distro_boot() at distro_boot for more details.

Of course this is all internal to U-Boot. All the distro sees is another way to boot.

Configuration

Standard boot is enabled with CONFIG_BOOTSTD. Each bootmeth has its own CONFIG option also. For example, CONFIG_BOOTMETH_DISTRO enables support for distro boot from a disk.

Available bootmeth drivers

Bootmeth drivers are provided for:

  • distro boot from a disk (syslinux)

  • distro boot from a network (PXE)

  • EFI boot using bootefi

  • VBE

  • EFI boot using boot manager

Command interface

Three commands are available:

bootdev

Allows listing of available bootdevs, selecting a particular one and getting information about it. See bootdev command

bootflow

Allows scanning one or more bootdevs for bootflows, listing available bootflows, selecting one, obtaining information about it and booting it. See bootflow command

bootmeth

Allows listing of available bootmeths and setting the order in which they are tried. See bootmeth command

Bootflow states

Here is a list of states that a bootflow can be in:

State   Meaning
base    Starting-out state, indicates that no media/partition was found. For an SD card socket it may indicate that the card is not inserted.
media   Media was found (e.g. SD card is inserted) but no partition information was found. It might lack a partition table or have a read error.
part    Partition was found but a filesystem could not be read. This could be because the partition does not hold a filesystem or the filesystem is very corrupted.
fs      Filesystem was found but the file could not be read. It could be missing or in the wrong subdirectory.
file    File was found and its size detected, but it could not be read. This could indicate filesystem corruption.
ready   File was loaded and is ready for use. In this state the bootflow is ready to be booted.
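The states form a progression, so a bootflow's state shows how far scanning got before it stopped. The sketch below is one way to picture that, assuming a hypothetical struct sketch_probe that records which steps succeeded. The enum and function names are invented for illustration and are not the definitions used by U-Boot.

```c
#include <stdbool.h>

/* Hypothetical mirror of the states listed above, in order of progress */
enum sketch_state {
	SK_BASE,
	SK_MEDIA,
	SK_PART,
	SK_FS,
	SK_FILE,
	SK_READY,
};

/* Hypothetical record of which scanning steps succeeded */
struct sketch_probe {
	bool have_media;	/* media present, e.g. SD card inserted */
	bool have_part;		/* partition information found */
	bool have_fs;		/* filesystem could be read */
	bool have_file;		/* file found and its size detected */
	bool file_loaded;	/* file contents read into memory */
};

/* Return the furthest state reached; the first failed step decides it */
enum sketch_state sketch_state_from_probe(const struct sketch_probe *p)
{
	if (!p->have_media)
		return SK_BASE;
	if (!p->have_part)
		return SK_MEDIA;
	if (!p->have_fs)
		return SK_PART;
	if (!p->have_file)
		return SK_FS;
	if (!p->file_loaded)
		return SK_FILE;

	return SK_READY;
}

/* Only a fully loaded bootflow can be booted */
bool sketch_state_is_bootable(enum sketch_state state)
{
	return state == SK_READY;
}
```

For example, a probe that found media, a partition and a filesystem but no file maps to the fs state, which is not bootable.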

Theory of operation

This describes how standard boot progresses through to booting an operating system.

To start, all the necessary devices must be bound, including bootstd, which provides the top-level struct bootstd_priv containing optional configuration information. The bootstd device also holds the various lists used while scanning. This step is normally handled automatically by driver model, as described in Automatic devices.

Bootdevs are also required, to provide access to the media to use. These are not useful by themselves: bootmeths are needed to provide the means of scanning those bootdevs. So, all up, we need a single bootstd device, one or more bootdev devices and one or more bootmeth devices.

Once these are ready, typically a bootflow scan command is issued. This kicks off the iteration process, which involves looking through the bootdevs and their partitions one by one to find bootflows.

Iteration is kicked off using bootflow_scan_first(), which calls bootflow_scan_bootdev().

The iterator is set up with bootflow_iter_init(). This simply creates an empty one with the given flags. Flags are used to control whether each iteration is displayed, whether to return iterations even if they did not result in a valid bootflow, whether to iterate through just a single bootdev, etc.

Then the ordering of bootdevs is determined, by bootdev_setup_iter_order(). By default, the bootdevs are used in the order specified by the boot_targets environment variable (e.g. “mmc2 mmc0 usb”). If that is missing then their sequence order is used, as determined by the /aliases node, or failing that their order in the devicetree. For BOOTSTD_FULL, if there is a bootdev-order property in the bootstd node, then this is used as a final fallback. In any case, the iterator ends up with a dev_order array containing the bootdevs that are going to be used, with num_devs set to the number of bootdevs and cur_dev starting at 0.

Next, the ordering of bootmeths is determined, by bootmeth_setup_iter_order(). By default the ordering is again by sequence number, i.e. the /aliases node, or failing that the order in the devicetree. But the bootmeth order command or bootmeths environment variable can be used to set up an ordering. If that has been done, the ordering is in struct bootstd_priv, so that ordering is simply copied into the iterator. Either way, the method_order array is set up, along with num_methods.

Note that global bootmeths are always put at the end of the ordering. If any are present, cur_method is set to the first one, so that global bootmeths are done first. Once all have been used, these bootmeths are dropped from the iteration. When there are no global bootmeths, cur_method is set to 0.
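The placement of global bootmeths can be pictured with a small standalone sketch. It moves the non-global entries to the front of a list while keeping the relative order within each group, then picks the starting method. The struct sketch_meth type and both function names are hypothetical, and the real bootmeth_setup_iter_order() may arrange things differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry in the bootmeth ordering */
struct sketch_meth {
	const char *name;	/* label used only for illustration */
	bool global;		/* true if this bootmeth picks its own bootdev */
};

/*
 * Reorder in place so that global bootmeths end up at the end, keeping the
 * relative order within each group. Returns the index of the first global
 * bootmeth, which equals count if there are none.
 */
size_t sketch_order_methods(struct sketch_meth *m, size_t count)
{
	size_t first_glob = 0;
	size_t i, j;

	for (i = 0; i < count; i++) {
		struct sketch_meth tmp;

		if (m[i].global)
			continue;

		/* Entries first_glob..i-1 are all global: shift them right */
		tmp = m[i];
		for (j = i; j > first_glob; j--)
			m[j] = m[j - 1];
		m[first_glob++] = tmp;
	}

	return first_glob;
}

/* Start on the first global bootmeth if there is one, else on method 0 */
size_t sketch_first_method(size_t first_glob, size_t count)
{
	return first_glob < count ? first_glob : 0;
}
```

With two global and two ordinary entries interleaved, the ordinary ones end up at indices 0 and 1, the global ones at 2 and 3, and scanning starts at index 2.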

At this point the iterator is ready to use, with the first bootdev and bootmeth selected. Most of the other fields are 0. This means that the current partition is 0, which is taken to mean the whole device, since partition numbers start at 1. It also means that max_part is 0, i.e. the maximum partition number we know about is 0, meaning that, as far as we know, there is no partition table on this bootdev.

With the iterator ready, bootflow_scan_bootdev() checks whether the current settings produce a valid bootflow. This is handled by bootflow_check(), which either returns 0 (if it got something) or an error if not (more on that later). If the BOOTFLOWF_ALL iterator flag is set, even errors are returned as incomplete bootflows, but normally an error results in moving onto the next iteration.

Note that bootflow_check() handles global bootmeths explicitly, by calling bootmeth_get_bootflow() on each one. The doing_global flag indicates when the iterator is in that state.

The bootflow_scan_next() function handles moving onto the next iteration and checking it. In fact it sits in a loop doing that repeatedly until it finds something it wants to return.

The actual ‘moving on’ part is implemented in iter_incr(). This is a very simple function. It increments the first counter. If that hits its maximum, it sets it to zero and increments the second counter. You can think of all the counters together as a number with three digits which increment in order, with the least-significant digit on the right, counting like this:

bootdev  part  method
0        0     0
0        0     1
0        0     2
0        1     0
0        1     1
0        1     2
1        0     0
1        0     1

The maximum value for method is num_methods - 1 so when it exceeds that, it goes back to 0 and the next part is considered. The maximum value for that is max_part, which is initially zero for all bootdevs. If we find a partition table on that bootdev, max_part can be updated during the iteration to a higher value - see bootdev_find_in_blk() for that, described later. If that exceeds its maximum, then the next bootdev is used. In this way, iter_incr() works its way through all possibilities, moving forward one each time it is called.
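The counting scheme can be sketched as a standalone function. This is a simplified model, not the real iter_incr(). The field names follow the ones mentioned in this section, but struct sketch_iter is hypothetical, and the reset of max_part to 0 when moving to a new bootdev is an assumption based on the statement that max_part is initially zero for all bootdevs.

```c
#include <stdbool.h>

/* Hypothetical, simplified iterator; not the real struct bootflow_iter */
struct sketch_iter {
	int cur_dev;		/* index of the current bootdev */
	int num_devs;		/* number of bootdevs in the ordering */
	int part;		/* current partition, 0 = whole device */
	int max_part;		/* highest partition number known so far */
	int cur_method;		/* index of the current bootmeth */
	int num_methods;	/* number of bootmeths in the ordering */
};

/*
 * Move to the next (bootdev, part, method) combination.
 * Returns true if there is one, false once everything has been tried.
 */
bool sketch_iter_incr(struct sketch_iter *it)
{
	/* Rightmost digit: the bootmeth */
	if (++it->cur_method < it->num_methods)
		return true;
	it->cur_method = 0;

	/* Middle digit: the partition */
	if (++it->part <= it->max_part)
		return true;
	it->part = 0;
	it->max_part = 0;	/* assumed: nothing known about the next bootdev */

	/* Leftmost digit: the bootdev */
	if (++it->cur_dev < it->num_devs)
		return true;

	return false;
}
```

Starting from all zeroes with num_methods set to 3 and max_part set to 1 for the first bootdev, repeated calls step through the same sequence as the table above.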

Note that global bootmeths introduce a subtlety into the above description. When doing_global is true, the iteration takes place only among the bootmeths, i.e. the last column above. The global bootmeths are at the end of the list. Assuming that they are entries 3 and 4 in the list, the iteration then looks like this:

bootdev  part  method  notes
.        .     3       doing_global = true, method_count = 5
.        .     4
0        0     0       doing_global = false, method_count = 3
0        0     1
0        0     2
0        1     0
0        1     1
0        1     2
1        0     0
1        0     1

The changeover of the value of doing_global from true to false is handled in iter_incr() as well.
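A minimal sketch of that changeover follows, covering only the global phase. The struct sketch_glob_iter type and the first_glob_method field are hypothetical names chosen for this example. The method count (method_count in the notes above) is called num_methods here, to match the field name used elsewhere in this section.

```c
#include <stdbool.h>

/* Hypothetical subset of the iterator state used during the global phase */
struct sketch_glob_iter {
	int cur_method;		/* index of the current bootmeth */
	int num_methods;	/* number of bootmeths still in the iteration */
	int first_glob_method;	/* assumed: index of the first global bootmeth */
	bool doing_global;	/* true while global bootmeths are being tried */
};

/*
 * Advance within the global phase. Returns true while the iterator is on a
 * global bootmeth, false once it has switched over to normal scanning (or
 * if it was not in the global phase to begin with).
 */
bool sketch_global_incr(struct sketch_glob_iter *it)
{
	if (!it->doing_global)
		return false;

	/* Try the next global bootmeth, if any */
	if (++it->cur_method < it->num_methods)
		return true;

	/* Global bootmeths are used up: drop them and start normal scanning */
	it->doing_global = false;
	it->num_methods = it->first_glob_method;
	it->cur_method = 0;

	return false;
}
```

With five bootmeths of which entries 3 and 4 are global, the first call moves from method 3 to method 4, and the second call clears doing_global, reduces the count to 3 and selects method 0.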

There is no expectation that iteration will actually finish. Quite often a valid bootflow is found early on. With bootflow scan -b, that causes the bootflow to be immediately booted. Assuming it is successful, the iteration never completes.

Also note that the iterator holds the current combination being considered. So when iter_incr() is called, it increments to the next one and returns it as the new current combination.

Note also the err field in struct bootflow_iter. This is normally 0 and thus has no effect on iter_incr(). But if it is non-zero, signalling an error, it indicates to the iterator what it should do when called. It can force moving to the next partition, or bootdev, for example. The special values BF_NO_MORE_PARTS and BF_NO_MORE_DEVICES handle this. When iter_incr() sees BF_NO_MORE_PARTS it knows that it should immediately move to the next bootdev. When it sees BF_NO_MORE_DEVICES it knows that there is nothing more it can do so it should immediately return. The caller of iter_incr() is responsible for updating the err field, based on the return value it sees.
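The effect of the err field can be sketched as a small decision function. The numeric values below are assumptions: -ESHUTDOWN is chosen to match the value that mmc_get_bootflow() returns when there is no media, and -ENODEV is a guess. The SK_ names and enum sketch_step are hypothetical and stand in for whatever the real code uses.

```c
#include <errno.h>

/* Assumed values for illustration; the real definitions may differ */
enum {
	SK_NO_MORE_PARTS = -ESHUTDOWN,
	SK_NO_MORE_DEVICES = -ENODEV,
};

/* What the increment step should do next (hypothetical) */
enum sketch_step {
	STEP_NORMAL,		/* ordinary increment of the counters */
	STEP_NEXT_BOOTDEV,	/* skip the remaining partitions */
	STEP_STOP,		/* nothing more to try */
};

/* Decide how to advance, based on the error left by the previous check */
enum sketch_step sketch_step_for_err(int err)
{
	switch (err) {
	case SK_NO_MORE_DEVICES:
		return STEP_STOP;
	case SK_NO_MORE_PARTS:
		return STEP_NEXT_BOOTDEV;
	default:
		return STEP_NORMAL;
	}
}
```

Any other value, including 0 and ordinary errors, leaves the normal increment in place.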

The above describes the iteration process at a high level. It is basically a very simple increment function with a checker called bootflow_check() that checks the result of each iteration generated, to determine whether it can produce a bootflow.

So what happens inside of bootflow_check()? It simply calls the uclass method bootdev_get_bootflow() to ask the bootdev to return a bootflow. It passes the iterator to the bootdev method, so that function knows what we are talking about. At first, the bootflow is set up in the state BOOTFLOWST_BASE, with just the method and dev initialised. But the bootdev may fill in more, e.g. updating the state, depending on what it finds. For global bootmeths the bootmeth_get_bootflow() function is called instead of bootdev_get_bootflow().

Based on what the bootdev or bootmeth responds with, bootflow_check() either returns a valid bootflow, or a partial one with an error. A partial bootflow is one that has some fields set up, but did not reach the BOOTFLOWST_READY state. As noted before, if the BOOTFLOWF_ALL iterator flag is set, then all bootflows are returned, even partial ones. This can help with debugging.

So at this point you can see that total control over whether a bootflow can be generated from a particular iteration, or not, rests with the bootdev (or global bootmeth). Each one can adopt its own approach.

Going down a level, what does the bootdev do in its get_bootflow() method? Let us consider the MMC bootdev. In that case the call to bootdev_get_bootflow() ends up in mmc_get_bootflow(). It locates the parent device of the bootdev, i.e. the UCLASS_MMC device itself, then finds the block device associated with it. It then calls the helper function bootdev_find_in_blk() to do all the work. This is common with just about any bootdev that is based on a media device.

The bootdev_find_in_blk() helper is implemented in the bootdev uclass. It names the bootflow and copies the partition number in from the iterator. Then it calls the bootmeth device to check if it can support this device. This is important since some bootmeths only work with network devices, for example. If that check fails, it stops.

Assuming the bootmeth is happy, or at least indicates that it is willing to try (by returning 0 from its check() method), the next step is to try the partition. If that works it tries to detect a file system. If that works then it calls the bootmeth device once more, this time to read the bootflow.

Note: At present a filesystem is needed for the bootmeth to be called on block devices, simply because we don’t have any examples where this is not the case. This feature can be added as needed.

If we take the example of the bootmeth_distro driver, this call ends up at distro_read_bootflow(). It has the filesystem ready, so tries various filenames to try to find the extlinux.conf file, reading it if possible. If all goes well the bootflow ends up in the BOOTFLOWST_READY state.

At this point, we fall back from the bootmeth driver, to bootdev_find_in_blk(), then back to mmc_get_bootflow(), then to bootdev_get_bootflow(), then to bootflow_check() and finally to its caller, either bootflow_scan_bootdev() or bootflow_scan_next(). In either case, the bootflow is returned as the result of this iteration, assuming it made it to the BOOTFLOWST_READY state.

That is the basic operation of scanning for bootflows. The process of booting a bootflow is handled by the bootmeth driver for that bootflow. In the case of distro boot, this parses and processes the extlinux.conf file that was read. See distro_boot() for how that works. The processing may involve reading additional files, which is handled by the read_file() method, which is distro_read_file() in this case. All bootmeths should support reading files, since the bootflow is typically only the basic instructions and does not include the operating system itself, ramdisk, device tree, etc.

The vast majority of the bootstd code is concerned with iterating through partitions on bootdevs and using bootmeths to find bootflows.

How about bootdevs which are not block devices? They are handled by the same methods as above, but with a different implementation. For example, the bootmeth for PXE boot (over a network) uses tftp to read files rather than fs_read(). But other than that it is very similar.

Tests

Tests are located in test/boot and cover the core functionality as well as the commands. All tests use sandbox so can be run on a standard Linux computer and in U-Boot’s CI.

For testing, a DOS-formatted disk image is used with a single FAT partition on it. This is created in setup_bootflow_image(), with a canned one from the source tree used if it cannot be created (e.g. in CI).

Bootflow internals

The bootstd device holds a linked list of scanned bootflows as well as the currently selected bootdev and bootflow (for use by commands). This is in struct bootstd_priv.

Each bootdev device has its own struct bootdev_uc_plat which holds a list of scanned bootflows just for that device.

The bootflow itself is documented in bootflow_h. It includes various bits of information about the bootflow and a buffer to hold the file.

Future

Apart from the to-do items below, different types of bootflow files may be implemented in future, e.g. Chromium OS support which is currently only available as a script in chromebook_coral.

To do

Some things that need to be done to completely replace the distro-boot scripts:

  • add bootdev drivers for dhcp, sata, scsi, ide, virtio

  • PXE boot for EFI

  • support for loading U-Boot scripts

Other ideas:

  • bootflow prep to load everything preparing for boot, so that bootflow boot can just do the boot.

  • automatically load kernel, FDT, etc. to suitable addresses so the board does not need to specify things like pxefile_addr_r