The Kernel Newbie Corner: What’s in That Loadable Module, Anyway?

In the very near future, we’re going to attack the problem of debugging both the kernel and our loadable modules in real time. But in order to do that, we need to take a slight detour and dig further into the actual structure of both of those types of objects to see how they’re put together. Yes, this is going to be a bit dry but you’ll thank me for it some day. That’s my story and I’m sticking to it.

(The archive of all previous “Kernel Newbie Corner” articles can be found here.)

This is ongoing content from the Linux Foundation training program. If you want more content, please consider signing up for one of these classes.

Our Sample Loadable Module

As a starting point, consider the following loadable module in the source file m1.c, most of which you should recognize, with a few new wrinkles thrown in.

#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>

static int whatever;            /* uninitialized data */
static int answer = 42;         /* initialized data */

/* Data that is needed only while loading or only while unloading. */
static char __initdata howdymsg[] = "Good day, eh?";
static char __exitdata exitmsg[] = "Taking off, eh?";

void
useless(void)
{
        printk(KERN_INFO "I am totally useless.\n");
}

static int __init hi(void)
{
        printk(KERN_INFO "module m1 being loaded.\n");
        printk(KERN_INFO "%s\n", howdymsg);
        printk(KERN_INFO "The answer is %d.\n", answer);
        answer = 999;
        return 0;
}

static void __exit bye(void)
{
        printk(KERN_INFO "module m1 being unloaded.\n");
        printk(KERN_INFO "%s\n", exitmsg);
        printk(KERN_INFO "The answer is now %d.\n", answer);
}

module_init(hi);
module_exit(bye);

MODULE_AUTHOR("Robert P. J. Day");
MODULE_LICENSE("GPL");

As you’ve done before, create the corresponding Makefile, compile the module, and load it and unload it to make sure it works. So far, so good. Now we’re going to take a closer look at the actual executable file m1.ko to see how it’s constructed.

What is This “ELF” Thing Of Which You Speak?

As a first step to deconstructing the loadable module file, we can ask what type of file it is thusly:

 $ file m1.ko
m1.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
$

What this tells us is that this file was (obviously) built for a 64-bit architecture and is of type “ELF”, which stands for “Executable and Linking Format,” the standard format for executable files these days. That also means that the file consists of several different “sections” which keep track of the different kinds of content inside it.

Most developers are already aware of the most basic types of sections in a regular executable file:

  • text: the executable object code,
  • data: initialized data, and
  • BSS: “Block Started by Symbol” or, as most people know it, uninitialized data, which (unlike the first two sections) takes up no space in the executable file and is allocated only at run time.

None of the above should come as any big surprise. But when it comes to your loadable module, it gets so much more complicated than that.

What’s With All That “__init” Stuff Again?

What makes a loadable module (and, for that matter, the kernel image itself) more complicated is that there is more than one type of both text and data, and the ELF-format file has to keep track of those various types.

Recall from a previous column that your module entry routine can be tagged with the __init attribute, whose purpose is to identify code that can be discarded after the module is loaded so as to not waste any kernel space. That attribute can be applied not only to your entry routine, but to any entry “helper” routines you might write so that your main entry routine doesn’t become a single massive chunk of code. No one says you can’t break it down further, at which point you might want to tag those helper routines as “__init” as well. And that means that those routines aren’t simply “text” routines any more; they’re a particular type of text routine that must be treated differently by the loader.

Similarly, your module exit routine can be tagged with __exit to specify that it also has special properties: it normally has to be kept around for unloading, unless there is absolutely no possibility that this module code will ever be unloaded (if, say, your module code is built into the kernel or your kernel isn’t even configured for unloading). In other words, this is yet another type of text that needs to be tracked independently. But wait, it gets so much better.

If you examine the code above, you’ll notice some data definitions that have been tagged with __initdata and __exitdata, which (not surprisingly) identify data objects that have equivalent properties in that they can be discarded under certain circumstances once they have no further value. And all of that has to be tracked by the ELF file, which leads us to finally ask: what exactly is going on in that file?

Poking Around with “objdump”

The easiest way to start examining an ELF file is to first ask to see its various sections, at which point a lot of things become obvious (the output below is severely truncated so make sure you run the command yourself):

 $ objdump --section-headers m1.ko   [show me the sections!]
...
Sections:
Idx Name Size VMA LMA File off Algn
0 .note.gnu.build-id 00000024
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .text 00000000
CONTENTS, ALLOC, LOAD, READONLY, CODE
2 .exit.text 0000002e
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
3 .init.text 0000003e
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
4 .rodata.str1.1 00000057
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .modinfo 000000b4
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .data 00000000
CONTENTS, ALLOC, LOAD, DATA
7 .exit.data 00000010
CONTENTS, ALLOC, LOAD, DATA
8 .init.data 0000000e
...
11 .bss 00000000

And suddenly, everything falls into place, as you can see that different types of text and different types of data are distinguished by placing them into different “sections” of the ELF file, which allows the module loader to deal with all of the content in a single section in one operation. It can, say, throw away the entire “.init.text” and “.init.data” sections once the module is loaded.

You can get a list of the entire symbol table of the ELF file and the associated sections of those symbols with (output deleted as it’s fairly lengthy, so run it yourself):

 $ objdump -t m1.ko 

or you can pick on individual sections to see their contents with, say:

 $ objdump -t -j .data m1.ko

m1.ko: file format elf64-x86-64

SYMBOL TABLE:
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l O .data 0000000000000004 answer


$ objdump -t -j .init.data m1.ko

m1.ko: file format elf64-x86-64

SYMBOL TABLE:
0000000000000000 l d .init.data 0000000000000000 .init.data
0000000000000000 l O .init.data 000000000000000e howdymsg

and so on.

Exercise for the reader: Using the command above, take a closer look at the other sections of your module ELF file, and see how they match up with the different kinds of text and data in your source file. And if you’re feeling really ambitious, check out the readelf command as well, which supports a lot of the same operations. List the available command options with either of:

 $ objdump --help
$ readelf --help

So How Do Those Sections Work Again?

If you really want to know how your module content ends up in the different ELF sections, you can check the kernel header file <linux/init.h>, where you can see numerous (self-explanatory) preprocessor definitions of the form:

 ...
#define __init __section(.init.text) __cold notrace
#define __initdata __section(.init.data)
#define __initconst __section(.init.rodata)
#define __exitdata __section(.exit.data)
#define __exit_call __used __section(.exitcall.exit)
...

That header file should give you a good idea of just how many different types of text and data there are when it comes to kernel programming.

So Where Were We Going Again?

Having laid the foundations for examining the structure of ELF files, we’ll use all this next week to debug both your loadable module and the kernel itself, but there’s something you’ll need to do between now and then.

Before next week, you’ll need to configure a new kernel that you can boot and, regardless of how else you configure it, make sure you select the following two config options:

  • CONFIG_DEBUG_INFO, for embedded debugging information, and
  • CONFIG_PROC_KCORE, for /proc/kcore support.

Once you’ve done that configuration and build, you should, of course, end up with a new vmlinux kernel image file at the top of your kernel source tree and, even without booting it, you can use objdump on that new kernel image file just the way you did on your loadable module file, to get an idea of the internal ELF structure of the kernel file itself. Because that’s where we’re going to pick things up next week.

Readers who want to dig even further into the format of an ELF file can check out this article by Kernel Newbie mailing list regular Mulyadi Santosa.

Robert P. J. Day is a Linux consultant and long-time corporate trainer who lives in Waterloo, Ontario. He provides assistance to the Linux Foundation’s Training Program. Robert can be followed at http://twitter.com/rpjday.