Hacking the Linux Kernel in Ada – Part 1

17615

For this three part series, we implemented a ‘pedal to the metal’ GPIO driven, flashing of a LED, in the context of a Linux kernel module for the NVIDIA Jetson Nano development board (kernel-based v4.9.294, arm64) in my favorite programming language … Ada!

You can find the whole project published at https://github.com/ohenley/adacore_jetson. It is known to build and run properly. All instructions to be up and running in 5 minutes are included in the accompanying front-facing README.md. Do not hesitate to fill a Github issue if you find any problem.

Disclaimer: This text is meant to appeal to both Ada and non-Ada coders. Therefore I try to strike a balance between code story simplicity, didactic tractability, and features density. As I said to a colleague, this is the text I would have liked to cross before starting this experiment.

What’s in the Ada name?

Delightfully said by Rod Chapman in his great SPARKNaCL presentation https://blog.adacore.com/sparknacl-two-years-of-optimizing-crypto-code-in-spark-and-counting, the Ada programming language is “Pascal on steroids”. Though, I would argue that the drug is healthy. Thriving on strong typing and packaging, Ada has excellent modelization scalability yet remains on par with C performances. 

It compiles native object code using a GCC front-end or LLVM, respectively called GNAT and GNAT-LLVM.  This leads us to an important reminder: Ada, at least through GNAT, has an application binary interface (ABI) compatible with C on Linux.

Linux driver experiment in Ada, but why?

  • First clearing the technical plumbing in Ada facilitates moving to SPARK/Ada where significant value can be added by improving drivers implementation using contracts, pointer ownership, advanced static analysis and associated proof technologies.
  • Share Ada’s bare-metal capability, expressivity, and productivity as a system/embedded programming language.
  • Demonstrate that Ada sits well at the foundation of heterogeneous real-world technology stacks.

Note that as Ada code is not accepted in the upstream kernel sources and the Linux team made it clear it is not interested in providing a stable kernel API, writing Linux drivers shipped in Ada/SPARK mean you will have to adapt those drivers to all kernel version you are interested in, which is a task outside the scope of this document. For single kernel versions, proofs-of-concept, or organizations having enough firepower to maintain and curate their own drivers this is not an issue though.

Kernel module headfirst

Let’s discuss our overall driver structure from an orthodox C perspective. It will allow us to clear important know-how and gotchas. The following C kernel module (driver) implements a one-second delayed work queue repeatedly registering a callback writing “flip_led_state” to the kernel message buffer. Please, note the usage of the preprocessing macros.

/* printk_wq.c */

#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/timer.h>

void delayed_work_cb(struct work_struct* work);
struct workqueue_struct* wq = 0;
DECLARE_DELAYED_WORK(dw, delayed_work_cb);   /* heavy lifting 1. */

void delayed_work_cb(struct work_struct* work)
{
   printk("flip_led_state\n");
   queue_delayed_work(wq, &dw, msecs_to_jiffies(1000));
}

int init_module(void)
{
   if (!wq)
       wq = create_singlethread_workqueue("my_wq"); /* heavy lifting 2. */
   if (wq)
       queue_delayed_work(wq, &dw, msecs_to_jiffies(1000));
   return 0;
}

void cleanup_module(void)
{
   if (wq){
       cancel_delayed_work(&dw);
       flush_workqueue(wq);
       destroy_workqueue(wq);
   }
}

MODULE_LICENSE("GPL");

When building a kernel module on Linux the produced Executable and Linkable Format (ELF)  object code file bears the *.ko extension. If we inspect the content of the working printk_wq.ko kernel module we can sketch the gist of binding to kernel module programming.

$ nm printk_wq.ko
...
                 U __alloc_workqueue_key
                 U cancel_delayed_work
00000000000000a0 T cleanup_module
0000000000000000 T delayed_work_cb
                 U delayed_work_timer_fn
                 U destroy_workqueue
0000000000000000 D dw
                 U flush_workqueue
0000000000000000 T init_module
                 U _mcount
0000000000000028 r __module_depends
                 U printk
                 U queue_delayed_work_on
0000000000000000 D __this_module
0000000000000000 r __UNIQUE_ID_license45
0000000000000031 r __UNIQUE_ID_vermagic44
0000000000000000 r ____versions

First, we recognize function/procedure names used in our source code eg. cancel_delayed_work. We also find that they are undefined (U). It is important to realize that those are the kernel’s source symbols and their object code will be resolved dynamically at driver/module load time. Correspondingly, all those undefined signatures can be found somewhere in the kernel source code headers of your target platform.

Second, we are missing some methods we explicitly called, eg.  create_singlethread_workqueue, in the symbol table.  This is because they are not, in fact, functions/procedures but convenience macros that expand to concrete implementations named differently and potentially have a different signature altogether or static inline not visible outside. For example, laying out the create_singlethread_workqueue explicit macro expansion from the Linux sources makes it clear. (Follow order, not preprocessor validation order)

/* https://elixir.bootlin.com/linux/v4.9.294/source/include/linux/workqueue.h*/

#define create_singlethread_workqueue(name)				\
	alloc_ordered_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, name)
...
#define alloc_ordered_workqueue(fmt, flags, args...)			\
	alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED |		\
			__WQ_ORDERED_EXPLICIT | (flags), 1, ##args)
...
#define alloc_workqueue(fmt, flags, max_active, args...)		\
	__alloc_workqueue_key((fmt), (flags), (max_active),		\
			      NULL, NULL, ##args)
...
extern struct workqueue_struct *
__alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
	struct lock_class_key *key, const char *lock_name, ...) __printf(1, 6);

Now everything makes sense. __alloc_workqueue_key is marked ‘extern’ and we find its signature in the printk_wq.ko symbol table. Note that we moved from create_singlethread_workqueue taking a single parameter to __alloc_workqueue_key taking more than five arguments. A logical conclusion, as deduced while following the explicit expansion up here, is that the arguments delta are all baked at the preprocessor stage. ‘Baking’ parameters using macros chaining offer polymorphism opportunities for kernel developers. Eg. compiling for arm64 may expand the macros differently than on RISC-V while offering both to retain a unified create_singlethread_workqueue(name) call for device driver developers to use; client of this ‘kernel API function’.

To get an Ada equivalent implementation of this driver I think of three choices when faced with a binding:

  • The signature you want to bind is extern, concrete, you bind directly by importing it.
  • You reconstruct around, only those undefined (U) “final” symbols until you reimplement the integrality of the functionality provided by the “top” macro. Useful when macros create stuff in place and you need to get a hold of it on the Ada side.
  • You write a concrete C function wrapping the macro and then bind by importing this wrapper function.

I will present an example of each in subsequent parts.

Platform driver and device driver

A Linux kernel module code structure is somewhat simple: you implement an init function, a deinit function. You have other requirements like supporting code reentry (eg. entering function may be called many times asynchronously) and you should not stall (eg. you do not run a game loop inside any kernel driver function). Optionally, if you are doing a platform (subsystem) driver, you need to register callbacks to polymorph a kernel interface of your choice. There is more to it, but you can get a long way just within this structure.

If you were to replace the shipped GPIO platform driver on your target machine, without breaking anything, your driver code would need to provide a concrete implementation of methods exposed in the linux/gpio/driver.h API. Below is some Tegra GPIO platform driver implementation code. If you start from the end, subsys_initcall(tegra_gpio_init), you should find that registering the driver sets a probe callback, in turn setting tegra_gpio_direction_output as the gpio_chip direction_output concrete code.

/* linux/gpio/driver.h */
struct gpio_chip {
	int	(*direction_output)(struct gpio_chip *chip, 
                                unsigned offset, int value);
}

/* drivers/gpio/gpio-tegra.c */
struct tegra_gpio_info {
	struct gpio_chip		gc;
};
static int tegra_gpio_direction_output(struct gpio_chip *chip, 
                                       unsigned offset, int value)
{
...
	return 0;
}
static int tegra_gpio_probe(struct platform_device *pdev)
{
	tgi->gc.direction_output = tegra_gpio_direction_output;
}
static struct platform_driver tegra_gpio_driver = {
	.probe		= tegra_gpio_probe,
};
static int __init tegra_gpio_init(void)
{
	return platform_driver_register(&tegra_gpio_driver);
}
subsys_initcall(tegra_gpio_init);

subsys_initcall is used to build statically linked module only and serves to implement platform driver. init_module can be used to init a built-in or loadable module but subsys_initcall is guaranteed to be executed before init_module. For this experiment we implemented a device driver making use of init_module.

To step into an Ada implementation we needed to concede by creating our driver entry point in C first.

  • The needed MODULE_LICENSE() expansion turned out to be hardly portable to Ada as it expands to some complex annotations scheme.
  • Kbuild, the Linux kernel build system, uses this ‘main’ C file to produce dependable meta information before/while building the .ko object.

From there we extern the ada_init_module and ada_cleanup_module function where we will pick up, fully Ada, to implement the delayed work queue structure seen previously and all consequent modeling of our flashing led driver.

/* main c */

#include <linux/module.h>

extern void ada_init_module (void);
extern void ada_cleanup_module (void);

int init_module(void)
{
   ada_init_module();
   return 0;
}

void cleanup_module(void)
{
   ada_cleanup_module();
}

MODULE_LICENSE("GPL");

The need for a restricted runtime

If you compile the following C code using your default Linux desktop compiler toolchain

STR="void main(){}" && echo -e $STR | gcc -o output.o -xc -

And inspect its symbol table

$ nm output.o 
...
0000000000400510 T __libc_csu_init
                 U __libc_start_main@GLIBC_2.2.5
00000000004004f6 T main
                 U printf@GLIBC_2.2.5
...

You find references to libc you did not explicitly ask for. You need the heads-up that those undefined (U) won’t be resolved at kernel module loading. A lot of libc implements stuff at the userspace level which is not compatible with the kernel operations so it is forbidden altogether.

Using system default GCC, make calling Kbuild using a special syntax, Kbuild will automatically strip those dependencies to libc for you to produce a valid kernel module (*.ko). But what happens when you link object code ‘compiled as usual’ from another rich and complex language like Ada to your kernel module? The object code will most certainly contain machinery from the language runtime, complex routines that end up tapping in libc, or other forbidden operations in the kernel context. This is where you need a constrained, reduced runtime for your language of choice.

What is cool with Ada though is that the GNAT infrastructure has runtime separation architectured to be swapped. Using AdaCore codebases, you can build your runtime by embarking on just what you want/need in it to link against. GNAT Ada runs on countless barebone platforms so the runtime granularity and dependency problems have already, most of the time, been handled for you. To initialize this runtime properly you are given sensible control on where and when to run some elaboration code; more on that later when we cover the Ada side of things.

For this experiment, we built a light aarch64-linux native runtime compatible to run in kernel space while retaining convenient aspects of the language, eg. the secondary stack. Using the https://github.com/AdaCore/bb-runtimes scripts, we augmented a new target aarch64-linux and built the runtime. Getting to know ‘how to do’ took longer, building it takes seconds. You can find and use this runtime in the experiment repository under rts-native-light/ when cross compiling using GNAT Pro. If you are building using the platform GNAT FSF the runtime is found under rts-native-zfp/?.

Kbuild integration

Kbuild is somewhat flexible so GNAT object code can be linked into the kernel driver without too much effort. As implied previously, make understands syntax to leverage and activate Kbuild, eg. to produce our driver called flash_led.ko, start from transient flash_led.o that depends on obj/bundle.o to build. Our module makefile uses this special syntax

obj-m := flash_led.o
flash_led-y := obj/bundle.o

You can ‘trick’ Kbuild/make by providing already existing .o files as long as you also provide dependable *.o.cmd intermediary files to Kbuild. We leverage such substitution by coordinating GPRbuild (GNAT build system), Kbuild/make, and touch using Python. There are two phases, generate and build.

Generate

$ python make.py generate config:flash_led_jetson_nano.json

1. Build, in the background, a bare minimum known to be valid main_template.c kernel driver and extract the compilation switches used by make/Kbuild to successfully produce this main_template.ko guinea pig module. There are around ~80 such GCC switches captured and used to generate the basic, valid *.ko for this kernel-based v4.9.294, arm64 platform. This ‘buried deep into Kbuild’ knowledge extraction turned out to be key in stabilizing the production of valid kernel object code. Note that this trick should work well for any platform because it extracts its specifics.

2. Generate the GPRbuild project file injecting those ~80 switches for the compilation of our project main.c along with all Ada source files using different project configuration data found in the JSON file.

3. Generate Makefile using knowledge of configuration data compliant with Kbuild syntax (cross compiler location, project name, etc found in the JSON file).

You can inspect the different templates and their substitution markers eg. <replace_me_token> by looking inside the template folder of the project repository.

Build

python make.py build config:flash_led_jetson_nano.json rts:true

1. Build the GNAT runtime library (RTS) libgnat.a by driving its runtime_build.gpr project file. (optional on subsequent passes)

2. Build our driver project standalone library libflash_led.a using the generated GPRbuild project file.

3. Link our custom RTS libgnat.a with our project libflash_led.a to a tidy bundle.o object.

4. Create missing *.o.cmd intermediary files to keep Kbuild happy. Remember we are swapping already built objects under its nose!

5. Finally, launch the makefile to cross-compile our flash_led.ko driver for the Jetson Nano aarch64 platform!

The Ada driver

For this experiment we did two implementations of Led.adb (body file, Ada equivalent of C .c source file), one at src/linux_interface/led.adb, the other under src/raw_io/led.adb. You specify which driver implementation you want to build by setting “module_flavor” value in flash_led_jetson_nano.json. make.py will inject the proper source paths in the project driver flash_led.gpr file during the generate phase.

The first version implementation of the LED interface binds to standard kernel API Gpio_Request, Gpio_Direction_Output, Gpio_Get_Value, and Gpio_Free functions exposed in include/linux/gpio.h. This is rather straightforward as the binding is mostly one-to-one to the C functions. In this linux_interface version, as soon as you bind, you end up executing the C concrete implementation of the shipped GPIO driver.

Circumventing most Linux machinery, the second raw_io version implementing the LED interface is more interesting as we control the GPIO directly by writing to IO memory registers. Akin to doing bare-metal, directly driving GPIOs is a matter of configuring some IO registers mapped in physical memory. Remember an OS serves the role of a hardware orchestrator and consequently acts as having implicit ownership over your hardware. To tap directly onto physical memory in a kernel context often requires some kind of red tape crossing.

Here Linux requires (strongly suggests?) you write/read to kernel mapped memory instead of directly to physical memory. First, you need to acquire the kernel-mapped physical address using the in/famous ioremap call. Using the mapped address we read and write to our GPIO registers using ioread32 and iowrite32 respectively. This is the only Linux machinery involved in this raw_io version. As you probably figure this is more a peek at what one would code inside a driver responsible to implement the concrete implementations of functions offered by something like include/linux/gpio.h. We will even end up writing assembly code from Ada to achieve pure rawness!

What‘s next?

I had to set the table to write Linux kernel modules in Ada by first talking about C, object code, Kbuild, constrained runtime, and overall build strategy. The streamlined fun begins as we cross the Ada fence. If I picked your curiosity and you are ready to dig Ada, meet me here. Cheers!

I want to thank Quentin Ochem, Nicolas Setton, Fabien Chouteau, Jerome Lambourg, Michael Frank, Derek Schacht, Arnaud Charlet, Pat Bernardi, Leo Germond, and Artium Nihamkin for their different insights and feedback to nail this experiment.


olivier henley
Olivier Henley

The author, Olivier Henley, is a UX Engineer at AdaCore. His role is exploring new markets through technical stories. Prior to joining AdaCore, Olivier was a consultant software engineer for Autodesk. Prior to that, Olivier worked on AAA game titles such as For Honor and Rainbow Six Siege in addition to many R&D gaming endeavors at Ubisoft Montreal. Olivier graduated from the Electrical Engineering program in Polytechnique Montreal. He is a co-author of patent US8884949B1, describing the invention of a novel temporal filter implicating NI technology. An Ada advocate, Olivier actively curates GitHub’s Awesome-Ada list.