Device tree overlays [LWN.net]

By Jonathan Corbet
October 22, 2014

LinuxCon Europe

Pantelis Antoniou started his LinuxCon Europe session on device tree overlays by noting that the device tree concept often draws complaints — frequently of the inflammatory variety. Those complaints did not prevent the room from filling up to capacity, though — it would have been standing room only except that the on-site German fire marshals took their job seriously and would not allow standing in the sessions. Device trees as currently implemented in the kernel, Pantelis said, are also not up to the task of describing current hardware. Work done by him and others should rectify that situation in the near future, though.

He started with an overview of the device tree concept: a device tree is essentially a text file that describes the hardware to the kernel. Since many architectures do not have self-describing hardware, some sort of externally supplied description is needed for the kernel to understand the system it is running on; device trees are the solution of choice in the Linux world. But, like most technologies, device trees have their shortcomings. The device tree language is another thing that software and hardware developers have to learn; to make things worse, it is a cryptic language that presents a lot of complexity to beginners. The fact that the current device tree compiler performs no syntax checks does not help the situation; the first indication of an incorrect device tree file is typically a failure to boot. Being purely data-driven, device tree files cannot contain any imperative logic. And so on.

But the worst problem, according to Pantelis, is that the static nature of device trees makes them incapable of describing contemporary hardware. It is not always possible to know what the hardware will look like prior to booting the system, but device trees are set in stone at boot time. For a self-contained system like a phone handset, the static nature of device trees is not a big problem. But consider hardware like the BeagleBone, which can have any of a number of add-on "cape" boards that augment the hardware. Creating a device tree file for every combination of boards and capes is not a viable solution. Assembling a device tree in the bootloader is possible but difficult, and it falls apart when faced with multiple capes stacked onto a single system. It would be far better to be able to piece together, at boot time or afterward, separate device tree fragments representing the board and the cape(s), ending up with a description of the full system.

This problem comes up in other settings as well. The Raspberry Pi supports "hats" for the addition of hardware. Hardware built around a field-programmable gate array (FPGA) can vary wildly in nature depending on the firmware loaded into the array; such hardware cannot possibly be supported by a static device tree. Hardware, Pantelis said, is software now. But Linux makes dealing with the new hardware unnecessarily complex, driving hardware hackers to simpler (but far less capable) systems like the Arduino.

The first attempt to solve the problem (in the BeagleBone context) was a subsystem called "capebus." But this proposal did not last long once reviewers got a look at it. It was modeling the cape problem around a bus abstraction, but capes do not sit on a bus. So another approach was indicated; in the end, it was decided that dynamically altering the system's device tree to reflect the actual hardware was the right solution to the problem.

A piece of the solution has been in the kernel for some time; it is controlled by the CONFIG_OF_DYNAMIC configuration option. It allows run-time modification of the device tree, but it is only used by the PowerPC architecture. Editing of the tree is destructive, meaning that changes cannot be reverted later; that is problematic for hardware that can be hot-removed from a running system. Changes are also not performed in an atomic manner. There is no connection to the device model code, so users must make any system topology changes independently. In short, it is a piece of the puzzle, but it is far from a complete solution.

The first step toward that complete solution, Pantelis said, is to rework the dynamic device tree code. Some control files have been moved from /proc to /sys. Nodes in the device tree are now proper kobjects, so they have lifecycle management built into them. Some changes to better define the semantics of the reconfiguration notifiers have been made. This work was all merged into the 3.17 kernel.

The second step is "the meat of the problem," according to Pantelis. It is often necessary for one part of a device tree to refer to another part; a camera sensor description, for example, may include a pointer to the I2C bus that carries the sensor's control channel. These references are called "phandles"; they are symbolic within the human-readable device tree, but converted to simple integer values by the device tree compiler. Pantelis had to extend the compiler to keep track of all phandles used; when requested (with the arguably strange "-@" command-line option), the compiler will store a sort of symbol table in the root of the compiled device tree with the list of all phandles in the tree.

This mechanism allows the loading of a device tree fragment into the system's current device tree. The new fragment will contain references to phandles in the main tree; the new in-kernel resolver code will fix up those references to match the real phandles in that tree. The resolver will also relocate all of the phandles in the new fragment to ensure that they are unique within the device tree as a whole and adjust any internal references accordingly.
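The resolver's job can be illustrated with a small model. What follows is a minimal sketch in Python, not the kernel's code: the tree is flattened into a dictionary keyed by node path, and the node names, labels, and the `resolve()` helper with its `local_refs` and `fixups` arguments are all hypothetical names chosen for the illustration. It shows the three steps described above: relocating the fragment's phandles, adjusting references internal to the fragment, and patching references into the main tree using the symbol table.

```python
# Toy model (an assumption for illustration): a tree maps node paths to
# property dicts. "phandle" is a node's integer handle.

def max_phandle(tree):
    """Largest phandle in use anywhere in the tree (0 if there are none)."""
    return max((props.get("phandle", 0) for props in tree.values()), default=0)

def resolve(base, symbols, fragment, local_refs, fixups):
    """Return a copy of `fragment` whose phandles are valid within `base`.

    symbols    : label -> node path in the base tree (the symbol table)
    local_refs : [(path, prop)] properties that point inside the fragment
    fixups     : label -> [(path, prop)] properties that point into `base`
    """
    offset = max_phandle(base)
    resolved = {path: dict(props) for path, props in fragment.items()}

    # 1. Relocate the fragment's own phandles so they cannot collide.
    for props in resolved.values():
        if "phandle" in props:
            props["phandle"] += offset

    # 2. Shift references internal to the fragment by the same offset.
    for path, prop in local_refs:
        resolved[path][prop] += offset

    # 3. Patch external references with the real phandles from the base tree.
    for label, sites in fixups.items():
        target = base[symbols[label]]["phandle"]
        for path, prop in sites:
            resolved[path][prop] = target
    return resolved

# Hypothetical base tree with two labelled nodes.
base = {
    "/": {},
    "/ocp/i2c@4802a000": {"phandle": 1},
    "/ocp/gpio@4804c000": {"phandle": 2},
}
symbols = {"i2c1": "/ocp/i2c@4802a000", "gpio1": "/ocp/gpio@4804c000"}

# Hypothetical fragment: a camera that uses i2c1 and a local regulator.
fragment = {
    "/regulator": {"phandle": 1},
    "/camera": {"i2c-bus": 0, "supply": 1},  # 0 is an unresolved placeholder
}
resolved = resolve(
    base, symbols, fragment,
    local_refs=[("/camera", "supply")],
    fixups={"i2c1": [("/camera", "i2c-bus")]},
)
```

In this model the regulator's phandle moves from 1 to 3, past the two handles already used by the base tree, and the camera's two references are updated to match: one follows the relocated regulator, the other picks up the real phandle of the I2C bus.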

Step three is to add the concept of device tree changesets to the kernel. A call to of_changeset_init() begins a new changeset; new device tree pieces can then be added to it with of_changeset_attach_node(). Once the pieces are in place, it's a matter of locking the device tree and calling of_changeset_apply(). If the change needs to be reverted in the future (perhaps the hardware in question has been hot-unplugged from the system), of_changeset_revert() will put things back as they were before.
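The pattern behind changesets (record a group of edits, apply them together, and keep enough information to undo them) can be sketched outside the kernel. The following Python model is only an illustration under stated assumptions: the `Changeset` class, its method names, and the flat path-to-properties tree are hypothetical, and an ordinary `threading.Lock` stands in for the kernel's device tree lock.

```python
import threading

tree_lock = threading.Lock()  # stand-in for the kernel's device tree lock

class Changeset:
    """Toy changeset: collect node additions, apply them together, undo later."""

    def __init__(self):
        self.pending = []     # recorded (path, props) node attachments
        self.applied = False

    def attach_node(self, path, props):
        # Only record the edit; the tree is untouched until apply().
        self.pending.append((path, dict(props)))

    def apply(self, tree):
        with tree_lock:
            # Check every edit first so a failure leaves the tree unmodified.
            seen = set(tree)
            for path, _ in self.pending:
                if path in seen:
                    raise ValueError(f"node already exists: {path}")
                seen.add(path)
            for path, props in self.pending:
                tree[path] = dict(props)
            self.applied = True

    def revert(self, tree):
        with tree_lock:
            if not self.applied:
                raise RuntimeError("changeset is not applied")
            # Undo in reverse order so children go before their parents.
            for path, _ in reversed(self.pending):
                del tree[path]
            self.applied = False

# Hypothetical cape with one child device.
tree = {"/": {}, "/ocp": {}}
before = dict(tree)

cs = Changeset()
cs.attach_node("/ocp/cape", {"compatible": "example,cape"})
cs.attach_node("/ocp/cape/adc", {"compatible": "example,adc"})
cs.apply(tree)
after_apply = dict(tree)  # snapshot taken while the cape is "plugged in"
cs.revert(tree)           # hot-unplug: the tree returns to its prior state
```

Validating before modifying is what makes the toy `apply()` all-or-nothing; how the kernel achieves the same effect is outside the scope of this sketch.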

With this infrastructure in place, device tree overlays can be supported. An overlay can add nodes to the tree, but it can also make changes to properties in the existing tree. In the simplest case, an overlay might just change a device node's status property from "disabled" to "okay" (the device tree spelling of "enabled"). This feature is useful for hardware hackers, Pantelis said; hardware presence can be turned on or off easily with no need to reboot the system or to dig into C code.
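As a rough picture of what applying an overlay means, consider the following Python sketch. It assumes a simplified model in which a tree is a dictionary from node paths to property dictionaries; the `apply_overlay()` helper and the node names are hypothetical. The overlay enables an existing UART by setting its status to "okay", the value the devicetree specification uses for an operational device, and adds one new child node.

```python
import copy

def apply_overlay(tree, overlay):
    """Return a new tree with `overlay` merged in; `tree` is left untouched.

    Paths missing from `tree` become new nodes; for existing nodes the
    overlay's properties are added or overwrite the current values.
    """
    merged = copy.deepcopy(tree)
    for path, props in overlay.items():
        merged.setdefault(path, {}).update(props)
    return merged

# Hypothetical base tree: a UART that is present but disabled.
base = {
    "/ocp/uart@48022000": {"compatible": "ti,omap3-uart", "status": "disabled"},
}

# Hypothetical overlay: enable the UART and describe a device attached to it.
overlay = {
    "/ocp/uart@48022000": {"status": "okay"},
    "/ocp/uart@48022000/gps": {"compatible": "example,gps"},
}

live = apply_overlay(base, overlay)
```

Returning a copy rather than editing in place keeps the sketch short; a real implementation would also need to record the changes so that they can be undone when the hardware goes away.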

The resolver code was merged into the 3.18 kernel; full overlay support should come soon. In the future, there is an overlay-based FPGA manager in the works, along with a BeagleBone cape manager. There is also interest in using this feature to support multiple versions of a given board from a single device tree. The end result of all this work is that device trees have become more dynamic — and more capable — than they were when the kernel first started using them.

[Your editor would like to thank the Linux Foundation for supporting his travel to LinuxCon Europe.]

Index entries for this article
Kernel	Device tree
Conference	LinuxCon Europe/2014

Device tree overlays

Posted Oct 23, 2014 4:43 UTC (Thu) by dlang (guest, #313) [Link] (11 responses)

I'm missing something here. Device trees are needed because the system can't detect hardware, so the kernel needs to be told what hardware exists.

But this proposal is about having the kernel detect hardware and change the device tree.

Why does the device tree need to be involved if the kernel can detect the hardware?

Device tree overlays

Posted Oct 23, 2014 7:19 UTC (Thu) by bnorris (subscriber, #92090) [Link]

> Why does the device tree need to be involved if the kernel can detect the hardware?

I don't think the kernel can detect the hardware in the examples mentioned in the article. From the sound of it, dynamic patching of the device tree is controlled by the user in some form, either via sysfs or procfs. But admittedly, it's difficult to tell, since this article doesn't actually link to the relevant work.

[Preemptive edit:] It looks like this contains some of the work, although it surely can't be the latest:

http://lwn.net/Articles/600466/

In this work, overlays are provided by the user via configfs.

Device tree overlays

Posted Oct 23, 2014 7:20 UTC (Thu) by koenkooi (subscriber, #71861) [Link] (1 responses)

One of the use-cases is plugging in expansion boards (e.g. capes for the beaglebone, hats for the rpi) where the kernel can detect and read the EEPROM, but not the actual hardware. Using information from the EEPROM you can load a devicetree overlay for the new hardware. Maybe in the future you'll be able to put the overlay directly into the EEPROM.

Device tree overlays

Posted Oct 25, 2014 8:01 UTC (Sat) by HIGHGuY (subscriber, #62277) [Link]

In the absence of the device-tree overlay patchset in our 'old' 2.6.32 kernel, we had to resort to a close-enough hack.

Our device-tree contains all overlays and each node that is 'optional' contains an extra field indicating the overlays it is active in.
The bootloader then reads an EEPROM and fills in a /chosen entry.

The I2C subsystem and one custom driver were modified to compare the extra node field against the /chosen field and only probe the device when they match.

It's not perfect, but gets the job done sufficiently for the few plugin cards we expect in the future.

Device tree overlays

Posted Oct 23, 2014 17:07 UTC (Thu) by pantoniou (subscriber, #85257) [Link] (7 responses)

The kernel cannot detect the full set of hardware devices. It can, however, deduce what's there by reading an out-of-band marker that describes it.

On the beaglebone you have an EEPROM, on other platforms you might encode the possible expansion board combination on GPIOs and so on.

For FPGAs you just have a-priori knowledge of what a bitstream loaded to the FPGA is providing.

Device tree overlays

Posted Oct 27, 2014 1:21 UTC (Mon) by giraffedata (guest, #1954) [Link] (6 responses)

I'm still not getting it. Where is this Beaglebone EEPROM and what information is in it? And if it doesn't have all the information a Linux kernel needs to drive a device (which would make a device tree unnecessary), why doesn't it?

Device tree overlays

Posted Oct 27, 2014 8:00 UTC (Mon) by rvfh (guest, #31018) [Link] (5 responses)

The EEPROM is non-volatile memory and will only contain information about what's present (e.g. which cape), whereas the device tree contains information about how to use it (e.g. the I²C bus address of the cape's chip).

They are complementary.

What is more, the user could also hard-code in a script which device tree overlay must be loaded, because she knows what cape the board is equipped with.

Device tree overlays

Posted Oct 27, 2014 8:11 UTC (Mon) by dlang (guest, #313) [Link]

So where are these snippets of devicetree data coming from? Are they being compiled into the kernel, loaded from userspace somehow, or what?

Without device tree snippets, you had drivers that looked for an ID on some bus and then knew what devices were where when they saw that ID (hard-coded in the driver). It sounds like this is trying to avoid the hard-coding into a driver, but it's not clear where the data is arriving from.

It's also not clear why the info about a particular device (cape or whatever) needs to be merged with the devicetree that was handed to the kernel at boot time. Why aren't the appropriate devices just configured and initialized without going to the trouble of merging the info about this cape into the tree that you were given at initialization time?

If this was being done in the bootloader, with the resulting devicetree being handed to the kernel, it would make perfect sense. But it seems to make far less sense after the kernel has initialized the system from the devicetree it was handed and is now running normally.

Device tree overlays

Posted Oct 27, 2014 17:20 UTC (Mon) by giraffedata (guest, #1954) [Link] (3 responses)

> The EEPROM is non-volatile memory and will only contain information about what's present (e.g. which cape), whereas the device tree contains information about how to use it (e.g. the I²C bus address of the cape's chip).

Where is the EEPROM and who programs it? Who chooses the I²C bus address of the cape's chip?

Does "which cape" mean which model of cape? If so, it sounds a lot like PCI vendor and device ID, which seems like all Linux needs to know to know how to use the cape (assuming the requisite knowledge were programmed in, as for most PCI devices).

Is there some reason the EEPROM can't contain information about how to use a cape?

> What is more, the user could also hard-code in a script which device tree overlay must be loaded, because she knows what cape the board is equipped with.

But I'm sure she'd rather not, which is why I'm trying to understand what is different about the world in which Beaglebone lives that Linux can't detect what cape the board is equipped with like it does in the bigger systems.

Device tree overlays

Posted Oct 28, 2014 9:43 UTC (Tue) by jem (subscriber, #24231) [Link]

The EEPROM is on the cape and is pre-programmed by the maker of the cape. The I2C address of the EEPROM is one of four fixed addresses; the cape should also include a two position switch to select which of the four addresses it responds to. If several capes are stacked (up to four), the user must take care the switches are in different positions on each cape.

The EEPROM contents are defined in the BeagleBone Black System Reference Manual, and contain information like Manufacturer Name, Part Number, Pin Usage, etc.

Device tree overlays

Posted Feb 16, 2015 20:52 UTC (Mon) by Kamilion (subscriber, #42576) [Link] (1 responses)

Have a look at this; it's a pretty good example of what can be done with a beaglebone cape (as well as being generally useful to a lot of folks doing EE):

https://github.com/abhishek-kakkar/BeagleLogic/wiki/The-B...

Very simple design, two ICs and some headers on a PCB from dirtypcbs.com.
There are just some unpopulated shunt resistors for the I2C EEPROM, as it assumes it will be the lowest board in the stack. Not surprising, considering it's set up to be a 100MHz multichannel logic analyzer.
Software side of it has something like 20+ common protocol decoders.

The TI 74LVCH16T245 16-bit buffer chip doesn't really have any means of identification, as it's just a logic part. And yet the I2C EEPROM still says "I got here first" on the pins its contents request, for the capes that may be stacked on top. Maybe a cellular module, maybe a 30MHz SPI display... If you've got a raspberry pi or an odroid C1 or something right now and try to attach one of the cheap 30MHz SPI displays floating around out there, you still need to track down the right kernel module and tell it which GPIOs go where when you modprobe stuff.

modprobe spicc
modprobe fbtft_device name=odroidc_tft32 rotate=270 gpios=reset:116,dc:115 speed=32000000 cs=0

Kinda squicky; we can do better. Loading Devicetree fragments is at least a signpost along the path.

And yes, the loaded i2c rom can potentially be very large (256K+) and contain just about anything after the relatively small header the beaglebone code will attempt to parse, which is about 256 bytes.

Device tree overlays

Posted Feb 17, 2015 11:12 UTC (Tue) by etienne (guest, #25256) [Link]

Yes, but one question is: is that board needed to boot?
If not, then is the pre-initialisation system the right place to do that kind of thing?
How is it different from some equipment plugged into a serial port, like a modem, that you initialize in user mode when you need it?
I understand that having a single place to describe the hardware is nice, but modifying that place (device tree) may trigger a bug which stops the system booting, and so make it more difficult to recover.
You may want to "insert" a device tree overlay after boot in user mode, but then isn't that complicating the system compared to inserting a module with some parameters?

Device tree overlays

Posted Oct 23, 2014 11:07 UTC (Thu) by jezuch (subscriber, #52988) [Link] (1 responses)

> The fact that the current device tree compiler performs no syntax checks does not help the situation; the first indication of an incorrect device tree file is typically a failure to boot.

So, um, what does it do if the file is invalid?

Device tree overlays

Posted Oct 23, 2014 17:09 UTC (Thu) by pantoniou (subscriber, #85257) [Link]

There are two levels of invalid file.

1) The file has syntax errors; this means the file won't compile - that's easy.

2) The file's syntax is correct but is not a valid description of the hardware. That's the hard part. For example, a file with a typo in one of the property names is still a valid DT file, but it's not going to work when you use it.

Device tree overlays

Posted Mar 28, 2015 11:58 UTC (Sat) by matwey (guest, #101693) [Link]

Do I understand correctly that there is still no way to modify the device tree from user space in 4.0?