The Kernel Newbie Corner: Kernel and Module Debugging with gdb


This week, we’re going to demonstrate how to do some very basic debugging of both your running kernel and a loaded module using the gdb debugger running in user space. But before you get too involved here, you must review last week’s column so that, by the time you return here, you understand that you should have done all of the following before going any further:

  • Configured a fairly recent kernel source tree for your host, including selecting the configuration options CONFIG_PROC_KCORE and CONFIG_DEBUG_INFO,
  • Built and installed the corresponding kernel and modules, leaving the ELF-format vmlinux image file at the top of the source tree,
  • Rebooted to the new kernel,
  • Checked that the file /proc/kcore does indeed exist,
  • Installed the gdb debugger.

Once all that’s done, you can carry on. But not until then.
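Most of the checklist above can be verified mechanically. The following is only a minimal sketch and not part of the original column: the function name check_prereqs and the GDB override variable are hypothetical, and the function merely tests that the files and the gdb binary exist. It cannot tell whether vmlinux actually matches the running kernel.

```shell
# check_prereqs VMLINUX [KCORE]
# Report which prerequisites appear to be missing.
# Returns 0 when everything looks fine, 1 otherwise.
# GDB is a hypothetical override for the debugger binary name.
check_prereqs() {
    vmlinux=$1
    kcore=${2:-/proc/kcore}
    rc=0

    if [ ! -f "$vmlinux" ]; then
        echo "missing: vmlinux image at $vmlinux"
        rc=1
    fi
    if [ ! -e "$kcore" ]; then
        echo "missing: $kcore (is CONFIG_PROC_KCORE enabled?)"
        rc=1
    fi
    if ! command -v "${GDB:-gdb}" >/dev/null 2>&1; then
        echo "missing: gdb"
        rc=1
    fi
    return "$rc"
}
```

Run from the top of the kernel source tree, "check_prereqs vmlinux" prints nothing when all three checks pass.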

NOTE: This column is based heavily on the corresponding gdb debugging section from Linux Device Drivers, 3rd Edition, Chapter 4. Credit where credit is due. This is ongoing content from the Linux Foundation training program. If you want more content, please consider signing up for one of these classes. The archive of all previous “Kernel Newbie Corner” articles can be found here.

So What Exactly Are We About To Do?

What we’re about to do is demonstrate an admittedly hacky way of listing the values of various kernel variables in real time, both from your running kernel and from any of your loaded modules.

Note well, though, that this is not what you’d do in an ideal situation. There are numerous better ways to do kernel debugging, most of which we’ll get to in upcoming columns. But for a quick-and-dirty way of examining some of those kernel space values, as long as you have all of the prerequisites in the list above, what we’re about to demonstrate will work just fine.

NOTE: While we’re going to use gdb for this debugging, those familiar with gdb’s normal use in user space should keep in mind that you can use it for kernel debugging only in a limited sense. What we’ll be using it for is simply displaying information. In this context, you can’t do anything more advanced such as assigning values, setting breakpoints, single-stepping through kernel code or the like. Quite simply, you get to look, and that’s all.

So What Are My “Jiffies?”

As a trivial example of dumping something from kernel space, let’s pick on the current value of the jiffies variable, which counts the number of clock ticks since system boot. We’ll print its 64-bit form, jiffies_64, which on a 64-bit system is the same value as jiffies. (And when we display this value, don’t be alarmed if it looks totally out of whack, since it isn’t actually initialized to zero at boot time.)

And now, to work, where we will need root access and the vmlinux file that corresponds to the running system:

 # gdb vmlinux /proc/kcore   (start our debugging session)
... snip ...
(gdb) p jiffies_64 (print the value of jiffies_64)
$1 = 4326692196 (and there it is)
(gdb)

How about another one: loops_per_jiffy, which is calculated early in the boot process and is used as the basis for the infamous “BogoMIPS” value?

 (gdb) p loops_per_jiffy
$2 = 1994923
(gdb)

And there you have it. Most developers are used to invoking gdb on a user space executable and its corresponding core file. In a sense, that’s exactly what we’re doing above — invoking it on an executable (vmlinux) and its corresponding core file, which is supplied via /proc/kcore. Trivial, no?
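If all you want is a quick look at a value or two, the same query can be run non-interactively. The sketch below is an illustration only, assuming gdb’s standard -batch and -ex command-line options; the function name kprint and the GDB and KCORE override variables are hypothetical conveniences, not anything gdb itself knows about.

```shell
# kprint VMLINUX EXPR...
# Print one or more kernel expressions without an interactive session.
# GDB and KCORE are hypothetical overrides for the debugger and core file.
kprint() {
    vmlinux=$1
    shift
    # Replace each expression with a pair of arguments: -ex "p EXPR".
    for expr in "$@"; do
        set -- "$@" -ex "p $expr"
        shift
    done
    "${GDB:-gdb}" -batch "$@" "$vmlinux" "${KCORE:-/proc/kcore}"
}
```

As root, from the top of the source tree, this would be invoked as "kprint vmlinux jiffies_64 loops_per_jiffy". Each call starts a new gdb process that opens /proc/kcore anew, so repeated calls show the values as they are at the time of each call.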

And That’s It?

Of course not. So here’s what you need to remember when debugging your running kernel with gdb:

  • Your vmlinux file must match your running kernel, or symbols and addresses won’t match and you’ll get garbage during your gdb session. So after you build your new kernel, don’t forget to reboot to it.
  • The values displayed will represent what they were at the time of invoking gdb, so if you print the contents of, say, jiffies or jiffies_64 again during that same debugging session, it will be exactly the same, over and over. To refresh the core file contents, you need to run:
     (gdb) core-file /proc/kcore 

    every time you want updated values. (That won’t make any difference to the value of loops_per_jiffy, which is calculated early and never changes until another reboot.)

  • Unlike a loadable module, which has access to only those kernel symbols that were explicitly EXPORTed, your gdb session has access to all kernel symbols, even those declared as “static”. For example, consider the following definition from the kernel source file fs/char_dev.c:
     static struct kobj_map *cdev_map; 

    Clearly, cdev_map has static linkage and is completely inaccessible to any loaded module. But your gdb session can see it just fine:

     (gdb) p cdev_map
    $1 = (struct kobj_map *) 0xffff88011b878000
    (gdb)

    This is an amazingly handy feature.

  • Finally, you can examine the available symbols in kernel space by simply perusing the /proc/kallsyms kernel symbols file, as in:

     $ grep loops_per_jiffy /proc/kallsyms
    ffffffff815968e0 r __ksymtab_loops_per_jiffy
    ffffffff815aa295 r __kstrtab_loops_per_jiffy
    ffffffff815f0420 D loops_per_jiffy
    $ grep cdev_map /proc/kallsyms (even the static symbols)
    ffffffff818c25e8 b cdev_map
    $
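Because grep matches substrings, the lookup above also drags in related entries such as __ksymtab_loops_per_jiffy. A minimal sketch of an exact-match lookup follows; the function name ksym is hypothetical, and it assumes only the "address type name" column layout shown in the listing. Be aware that on many current systems unprivileged users may see all-zero addresses in /proc/kallsyms, so run it as root if the addresses matter.

```shell
# ksym NAME [FILE]
# Print "ADDRESS TYPE NAME" for every symbol whose name is exactly NAME.
# FILE defaults to /proc/kallsyms; static symbols may match more than once.
ksym() {
    awk -v name="$1" '$3 == name { print $1, $2, $3 }' "${2:-/proc/kallsyms}"
}
```

For example, "ksym loops_per_jiffy" would print only the line for the variable itself, not its __ksymtab and __kstrtab companions.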

But wait. There’s so much more. Because here’s where it gets interesting.

So What About Loadable Modules?

Once you realize that you can examine the values of symbols in the kernel symbol table from user space, it should be easy to see that you can just as conveniently examine some of the values in your loadable (and loaded) modules. After all, once you load a module, it’s running in kernel space so it should be just as accessible to a gdb session as everything else.

Consider the following loadable module (in the source file gdb1.c) that we’re going to load and examine (we won’t bother defining any “static” symbols from here on since everything that follows will apply equally to such symbols):

 #include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>

int rday_1;
int rday_2 = 20;
int rday_3 = 30;

EXPORT_SYMBOL(rday_3);

static int __init hi(void)
{
	printk(KERN_INFO "Module gdb1 being loaded.\n");
	return 0;
}

static void __exit bye(void)
{
	printk(KERN_INFO "Module gdb1 being unloaded.\n");
}

module_init(hi);
module_exit(bye);

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Module debugging with gdb.");

What we’ve done above is define a number of “rday”-prefixed module variables so you can verify that you can examine all of them equally well. Let’s use some of the tricks we learned last week to first examine the module’s symbol table:

 $ nm gdb1.ko
0000000000000000 r __kstrtab_rday_3
0000000000000000 r __ksymtab_rday_3
0000000000000000 r __mod_description27
0000000000000028 r __mod_license26
0000000000000040 r __mod_srcversion23
0000000000000080 r __mod_vermagic5
0000000000000068 r __module_depends
0000000000000000 D __this_module
0000000000000000 t bye
0000000000000000 T cleanup_module
0000000000000000 t hi
0000000000000000 T init_module
U mcount
U printk
0000000000000000 B rday_1
0000000000000000 D rday_2
0000000000000004 D rday_3

Note that, predictably, the uninitialized variable rday_1 ended up in the BSS section, while the initialized ones ended up in the data section. Just an observation.

You might also run any of the following to verify what ended up in some of the ELF file sections:

 $ objdump -t gdb1.ko            (all sections)
$ objdump -t -j .data gdb1.ko    (data section)
$ objdump -t -j .bss gdb1.ko     (BSS section)

Whichever of these you run, what you should conclude from the above is that:

  • the variable rday_1 is in the module’s BSS section and should be printable

  • the variables rday_2 and rday_3 are both in the module’s data section and should be printable as well (exporting made absolutely no difference for what we’re about to do).

Keep all of the above in mind with respect to sections since that information is going to be necessary shortly.

Loading and Debugging the Module

Assuming you built that loadable module properly, load it and let’s see what happens. Immediately after it’s loaded, the first thing you can do is verify that the appropriate symbols are now in the kernel symbol file /proc/kallsyms (gdb is not running yet):

$ grep rday /proc/kallsyms
ffffffffa00f4080 r __ksymtab_rday_3 [gdb1]
ffffffffa00f4090 r __kstrtab_rday_3 [gdb1]
ffffffffa00f456c D rday_3 [gdb1]
ffffffffa00f4568 d rday_2 [gdb1]
ffffffffa00f47c0 b rday_1 [gdb1]

That’s a good sign: we can see the variables, and they seem to be in the correct ELF sections. (Note that the difference between “d” and “D” above refers to whether those symbols are visible to loaded modules; there is no difference in their visibility to your gdb session.)

At this point, we can fire up gdb exactly the way we did last time:

 # gdb vmlinux /proc/kcore 

but if we try to print any of those variables, we get:

 (gdb) p rday_1
No symbol "rday_1" in current context.
(gdb) p rday_2
No symbol "rday_2" in current context.
(gdb) p rday_3
No symbol "rday_3" in current context.
(gdb)

That’s because we’re missing the last crucial step.

Adding a Symbol File to gdb

The problem is that the current gdb session has no idea about those module symbols because we haven’t educated gdb about where they are. For that, we need to pop over to the directory /sys/module/gdb1/sections, and check where the module’s various ELF sections were loaded into kernel space, and then pass that information to gdb:

 $ cd /sys/module/gdb1/sections
$ ls -A1
.bss (where the BSS section was loaded)
.data (where the data section was loaded)
.exit.text
.gnu.linkonce.this_module
.init.text
__ksymtab
__ksymtab_strings
.note.gnu.build-id
.rodata.str1.1
.strtab
.symtab
.text (where the text section was loaded)
$ cat .text .data .bss (the section addresses I care about)
0xffffffffa00f4000 (address of module's text section ...)
0xffffffffa00f4568 (... and data ...)
0xffffffffa00f47c0 (... and BSS)

and we can now tell gdb all about those sections thusly:

 (gdb) add-symbol-file .../gdb1.ko 0xffffffffa00f4000 \
-s .data 0xffffffffa00f4568 \
-s .bss 0xffffffffa00f47c0
add symbol table from file ".../gdb1.ko" at
.text_addr = 0xffffffffa00f4000
.data_addr = 0xffffffffa00f4568
.bss_addr = 0xffffffffa00f47c0
(y or n) y
Reading symbols from .../gdb1.ko...done.
(gdb) p rday_1
$2 = 0
(gdb) p rday_2
$3 = 20
(gdb) p rday_3
$4 = 30
(gdb)

Note carefully how you add a module’s symbols to your current gdb session with add-symbol-file: the address of the module’s text section is the mandatory argument that follows the file name, after which you need only add, with -s, the addresses of whatever other ELF sections you care about. If there were nothing in the BSS section you wanted to print, you could have omitted that option. (Note also that the literal “.../gdb1.ko” is just shorthand for the full path name of the module file, which you’ll have to supply.)
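Copying three addresses by hand is error-prone, so the add-symbol-file command can also be assembled from the sysfs files directly. This is a minimal sketch rather than an established tool: the function name gdb_add_symbols is hypothetical, it assumes the module’s sysfs directory is named after the .ko file, and it handles only the .text, .data and .bss sections discussed here. On newer kernels the files under /sys/module/.../sections may be readable only by root.

```shell
# gdb_add_symbols MODULE_KO [SECTIONS_DIR]
# Print an add-symbol-file command for a loaded module, reading each
# section's load address from the module's sysfs "sections" directory.
gdb_add_symbols() {
    ko=$1
    dir=${2:-/sys/module/$(basename "$ko" .ko)/sections}

    # The .text address is the mandatory positional argument.
    if [ ! -r "$dir/.text" ]; then
        echo "cannot read $dir/.text" >&2
        return 1
    fi
    cmd="add-symbol-file $ko $(cat "$dir/.text")"

    # Other sections are optional and passed as: -s NAME ADDRESS
    for sec in .data .bss; do
        if [ -r "$dir/$sec" ]; then
            cmd="$cmd -s $sec $(cat "$dir/$sec")"
        fi
    done
    echo "$cmd"
}
```

Call it with the full path name of gdb1.ko and paste the single line it prints at the (gdb) prompt. A section the module does not have, or that you cannot read, is simply left out of the command.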

And when you’re done poking around, just “quit” out of your gdb session, then unload the module.

And That’s It?

Almost. Here are a couple more things to keep in mind when debugging with gdb:

  • You don’t need to restrict yourself to your module’s text, data and BSS sections. If there’s something in, say, the “.exit.data” section that interests you, you can add that section to the symbol table as well.
  • As with debugging symbols in the kernel image itself, the values you’re going to see for module variables are the values at the time of loading the core file /proc/kcore. If you’re interested in dumping a variable that might have changed since the module was loaded, you’ll have to once again reload the core file before every print command:
    (gdb) core-file /proc/kcore 

    Designing a simple module to demonstrate that last claim is left as an exercise for the reader. HINT: Use a writable parameter.

Next week: Debugging using sequence files in the /proc directory.