The Linux Foundation’s open source Zephyr Project received considerable attention at this February’s Embedded Linux Conference (ELC). Although there are still no shipping products running this lightweight real-time operating system (RTOS) for microcontrollers, Fabien Parent and Neil Armstrong of the French embedded firm BayLibre shared their experiences in developing a wearable device that may end up being the first Zephyr-based consumer electronics product.
BayLibre’s device has an ARM Cortex-A SoC connected via an SPI bus to a Cortex-M4 STM32L4xx, which is in turn linked via I2C to other, more lightweight Cortex-M cores. Parent and Armstrong could say no more about the design, but they explained why they chose Zephyr and discussed the project’s pros and cons.
Parent and Armstrong needed a free, permissively licensed RTOS for a small-footprint wearable device, and they required drivers for UART, I2C master, and SPI slave. They also needed features like a scheduler, timers, tasks, threads, and locks. The list was quickly narrowed down to the Apache 2.0-licensed Zephyr, the 3-clause BSD-licensed NuttX, or rolling an OS of their own. After they had committed to Zephyr, Apache Mynewt launched, and they realized it might have worked as well.
Parent and Armstrong first considered the DIY approach. “Developing our own OS had the advantage of being fun,” said Armstrong. “It could be tailored to our needs and our development process, and we would better understand the entire code base. The drawback is that it takes time, and there is no community to help. It would be hard to maintain, and there would be little time to mature and fix the bugs.”
With BayLibre’s customer deadline essentially negating the homegrown option, the developers looked into NuttX, which had the advantage of being around longer than Zephyr. Although Parent and Armstrong were embedded Linux developers and fairly new to RTOSes, Parent had become familiar with NuttX from working for two years at Google’s recently abandoned Project Ara. NuttX is best known for running on Pixhawk drone controllers.
“NuttX had the advantage of being familiar, and it already supported our STM32L4xx SoC,” said Parent. “But the build system is completely unreliable. At Project Ara, whenever we changed the configuration, we could not be sure it would work. Also, there’s no real NuttX community — it’s basically one guy who wrote almost everything, and there is basically no peer review.” Finally, despite NuttX’s BSD license, “inside its repository there is a lot of code with licenses such as GPL, so there’s a chance you might accidentally include some, which is scary,” added Parent.
Zephyr pros and cons
Zephyr had only been announced a few weeks before they began the project, yet it already had several appealing features. “It’s much like Linux in the coding style, build system, and the concept of maintainers,” said Armstrong. “Zephyr also has great documentation, and they are quickly growing a strong community. Zephyr supports low memory usage, and it’s highly configurable and modular. It offers modern cooperative and preemptive threading, and will eventually add security pre-certification.”
At the time, Zephyr’s biggest drawbacks were its immaturity and its lack of support for the STM32L4xx SoC; it supported only an older STM32F1xx model. The missing SoC support proved a much easier challenge than they had imagined. The two SoCs are very similar, so updating the port took only a day and a half, with testing finished within a week. “Most of the time was spent on I2C and SPI, and debugging a stupid register issue,” said Armstrong.
The challenges came with the upstreaming process itself, and the fact that Zephyr was changing so quickly. “We made the bad choice of waiting a month before upstreaming the code,” said Parent. “When we did the first rebase, nothing worked, and we had to rewrite the power management code three times. As soon as you have clean code, try to upstream it. Otherwise, you will spend hours rebasing everything.”
The upstream patch review process, which is now undergoing revision, was also more cumbersome than the Linux kernel’s. “Zephyr uses Gerrit for patch review, and JIRA for the feature requests, and there’s also a mailing list,” said Parent. “Sometimes you don’t know where to look for answers.”
Gerrit makes it easy to keep track of patches, but “it’s really slow, and is very complicated,” said Parent. “One of the biggest issues is that you have to individually select the reviewers instead of broadcasting. There is no concept of patch series, so you have to add topics to your patch series, which makes sending patches more complicated. Its archive search is really bad, and it’s really hard to get a full view of a patch.”
JIRA also posed some challenges. “JIRA is manager friendly and makes it easy to do graphs, but it’s not developer friendly, and there’s no good information on how to use it,” said Parent. “It’s yet another communication medium that is overlapping with mailing lists and Gerrit.”
A HAL of a surprise
Parent and Armstrong uploaded the ST port patches to Gerrit and waited for reviews. There was no response, so they kept pinging the maintainer on IRC. They waited almost a month for a review response, and when it came, it was rather vague.
They also received a discouraging note from a Zephyr developer from a large corporation. “He said please stop your work because we want to push our own HAL to Zephyr based on the STM32 Cube SDK,” related Armstrong. “He said that after he did his proposal we could redo our patch.”
They were surprised about the acceptance of HAL (Hardware Abstraction Layer) technology. “Our patch was fully rewritten in native code with no external links to anything,” said Armstrong. “We were used to the Linux kernel, where you can only have native, maintainable code. And the maintainers never told us from the start about HALs.”
“There was a discussion on the Zephyr mailing list as to whether we should use HALs before moving to native code,” added Parent. “Input was requested from the maintainers, but there was no reply. Right now, most of the Zephyr maintainers are from SoC companies. The result is that vendor HALs are slowly replacing native drivers, or at least for ST. Personally, I would love to not have HALs.”
Parent noted that the Linux kernel project prefers that its top-level maintainers do not work for SoC companies. He asked Linux DRM maintainer Dave Airlie about the situation, and Airlie was quoted as saying: “The reason the top-level maintainer (me) doesn’t work for Intel or AMD or any vendors is that I can say NO when your maintainers can’t or won’t say it.”
Parent also suggested that the Zephyr Project is not as transparent as some other open source projects. The technical leadership is determined by voting members of the Zephyr Technical Steering Committee (TSC). Community members may participate in the TSC, but they must be invited to attend its meetings.
“Most meeting minutes require permission to access, and it can take up to two weeks,” he said. “Decisions are spread across JIRA, Gerrit, and the mailing lists, and blog posts are controlled through a separate committee, which makes it kind of hard to post a blog.”
There are also challenges in working with a new project driven by “top-down development,” said Parent. The priorities appear to be planned features like a Unified Kernel, a new IP stack, and Thread protocol support, he added. “They need to clarify their priorities and let us know if planned features have priority vs. community contributions.”
In conclusion, Armstrong summed up their first Zephyr experience. “We don’t like the HALs, and the review tools made us really sad,” he said. “The code is still really young and the APIs change fast, so you need to test your code for every release to see if it’s still working.”
Yet, Armstrong also emphasized Zephyr’s advantages, not least of which is the fact that it’s one of the few open source RTOSes optimized for wearables. “Zephyr is a good design for low memory or low performance on small CPUs,” said Armstrong. “It’s really similar to Linux, and the APIs are simple and well documented. There’s a real and active community, and the flaws are getting fixed very quickly.”
Armstrong also noted a possible improvement on the reviews front: “There was a rumor this morning that Zephyr is moving from Gerrit to GitHub,” he said. “It’s not perfect, but it’s better than Gerrit for sure.”
Other Zephyr sessions from ELC 2017 now available on YouTube include:
Intel’s Anas Nashif summarizes Zephyr’s progress, as well as plans for next year.
Linaro’s Andy Gross talks about plans to integrate device tree in Zephyr.
Intel’s Marcel Holtmann discusses using Zephyr on the BBC micro:bit board.
Intel’s Sakari Poussa explains how to jumpstart Zephyr development by using JavaScript Runtime for Zephyr, including a “shell” developer mode and Web USB.
ARM’s Vincenzo Frascino, who works in the Linaro LITE group, describes how Zephyr runs on the ARM Beetle test-chip implementation of the IoT subsystem for Cortex-M processors.
Intel’s Johan Hedberg discusses Zephyr’s Bluetooth support, including its IPv6/6LoWPAN stack for implementing IPv6 over BLE and the emerging Bluetooth Mesh.
You can watch the full “War Story” video on Zephyr development below: