Linux Quality Database: One man’s quest for kernel quality


Author: JT Smith

By chromatic

Michael Crawford dreams about a world
of free and high-quality software. His current mission, the Linux Kernel Quality Database,
intends to make that true of the Linux kernel. It’s not that he finds fault
with the code. Instead, he sees its success and flexibility as an opportunity
to meet some very difficult quality assurance needs.

Quality, as he sees it, implies two things. At the most basic level,
software should be free of fatal errors, and it should perform its advertised
functions correctly. At a higher level of quality, software should be free of
annoying flaws, and closed bugs should stay resolved. A defect uncovered in
one place should be fixed everywhere else it occurs. It’s easy to write code
that looks correct, but nothing takes the place of actually running it.

Organized by programmers, for programmers

As more people discover Linux, the potential for end-user feedback
increases. More users mean more configurations, more hardware in use, and
more interactions, all of which may uncover subtle bugs. Big projects like
Mozilla, GNOME, and KDE have comprehensive bug databases, but nothing similar
exists for the kernel. Part of that comes from its unique culture.

Much of the kernel development discussion takes place on lkml, the Linux
kernel mailing list. It can easily see a thousand messages a week. Keeping up
is a Herculean effort. As Crawford discovered, it can overwhelm an end-user
only interested in the status of a bug or a feature. Where would a harried
system administrator or application developer start to look for a solution?
There are voluminous lkml archives, and there’s always Google,
but locating and aggregating the latest news is a chore.

It’s not that developers don’t crave good feedback, but testing is hard,
unglamorous work. Crawford points out that Linux creator Linus Torvalds has mentioned the need for more testing several times, but testing doesn’t compare to the thrill of writing new code. Like debugging or writing documentation, it can be tedious and time-consuming. Beyond that, it takes a unique set of talents to produce good tests and automated test suites.

Adding users to the mix

That’s where Crawford comes in. If the Quality Database takes off, it will
allow users and developers to correlate defects with kernel versions. The more
people who participate, the easier it will be to zero in on the exact problem
and create a solution. Even a report as simple as, “My network card doesn’t
initialize under 2.4.4 with the Tulip driver but worked fine with 2.4.3,” could
narrow the issue down for other users, especially if it eventually included an
explanation or a fix.
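
To make the idea of correlating defects with kernel versions concrete, here is a minimal sketch in Python. It is not Crawford's design: the `Report` fields, the `regression_window` helper, and the sample data are all hypothetical, and it assumes plain dotted numeric versions such as "2.4.3" (suffixes like "-ac1" would need extra parsing).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One user-submitted observation (hypothetical schema)."""
    kernel: str   # kernel version, e.g. "2.4.3"
    driver: str   # driver or component named in the report
    works: bool   # True if the hardware worked under this kernel


def version_key(version):
    """Turn "2.4.10" into (2, 4, 10) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def regression_window(reports, driver):
    """Return (last_good, first_bad) kernel versions for a driver.

    Either element is None when no report of that kind exists.
    """
    relevant = [r for r in reports if r.driver == driver]
    good = [r.kernel for r in relevant if r.works]
    bad = [r.kernel for r in relevant if not r.works]
    first_bad = min(bad, key=version_key) if bad else None
    # Only count successes that are older than the first failure.
    if first_bad is not None:
        good = [v for v in good if version_key(v) < version_key(first_bad)]
    last_good = max(good, key=version_key) if good else None
    return last_good, first_bad


# Made-up sample reports, modeled on the Tulip example in the text.
reports = [
    Report("2.4.2", "tulip", True),
    Report("2.4.3", "tulip", True),
    Report("2.4.4", "tulip", False),
    Report("2.4.4", "eepro100", True),
]
print(regression_window(reports, "tulip"))  # ('2.4.3', '2.4.4')
```

Even a handful of such reports narrows the search to the changes between two adjacent releases, which is the kind of payoff the article describes.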

The reaction so far has been mixed, at least from the kernel developers.
Several responded positively to the initial announcement. Others wanted feedback
from Torvalds and chief kernel hacker Alan Cox before signing on. Crawford has yet to receive a formal blessing.

Part of the difficulty comes from the kernel development process itself.
Torvalds has resisted putting the kernel into a version control system, preferring
to work sequentially through e-mailed patches. Several lieutenants hold sway
over important subsystems, like the virtual memory system or the filesystem.
Despite this apparent chaos, things come together. Still, interested developers
unfamiliar with the subtle protocols of lkml have to adapt to the ultimate
bazaar if they want to see results.

Attempting to avoid several recurrent flamewars, Crawford emphasizes his
desire to work within the current system. “[It’s] not my plan to try to force a
bunch of big-company software process into the Linux kernel development. I want
it to work as well as possible with what they already have.” Instead of
targeting existing developers, he wants the database to act as a sort of bridge
between users and kernel hackers. To the users, it will be a repository of
defects, versions, and solutions to common problems. To coders, it will be
a source of error messages, configurations, and defects from the field.

The road forward

While recruiting several motivated people to run informal tests by hand
will add valuable data, other ideas can refine the process. One correspondent
brought up the issue of test coverage. How can developers know that every
component, indeed every line, of the kernel has a test somewhere?

An article on the database site links to several automated test suites for
userland software, including Mesa and Python. These exercise certain parts of
the kernel and system libraries. The closest thing in kernel space proper comes
from SGI. It’s nowhere near comprehensive, but it’s a place to start. One can
almost imagine a thousand boxes downloading the latest -ac kernels and running
nightly smoke tests. (The Perl Smokers group has a similar system already in place for Perl’s development
branch.) If testing a new release and reporting failures is as easy as “make
test,” this will produce an incredible amount of valuable data in a very short
time.
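
As a rough illustration of what an automated smoke-test client might do, the Python sketch below runs a single test command and packages the outcome with the running kernel's release string. Everything here is an assumption for illustration: the `run_smoke_test` helper and the report fields are hypothetical, and the command shown is a stand-in for whatever test suite a real setup would invoke.

```python
import platform
import subprocess
import sys


def run_smoke_test(command):
    """Run one test command and summarize the outcome as a report dict."""
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "kernel": platform.release(),    # release string of the running kernel
        "machine": platform.machine(),   # hardware architecture name
        "command": " ".join(command),
        "passed": result.returncode == 0,
        # Keep only the tail of the output so reports stay small.
        "output_tail": (result.stdout + result.stderr)[-500:],
    }


if __name__ == "__main__":
    # Stand-in command; a real client would run an actual test suite here.
    report = run_smoke_test([sys.executable, "-c", "print('ok')"])
    print(report["passed"])
```

A nightly job could submit one such report per machine, which is how a large pool of volunteers would turn into a steady stream of pass/fail data.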

Of course, there are millions of lines of code, several different
architectures, thousands of supported and slightly-different pieces of
hardware, and hundreds of configuration options. Even worse, consider the
experimental options, system libraries, BIOS bugs, and distribution variations
that must be taken into account. Working that close to bare metal leaves
little room for a safety net. But the world needs idealists, and writing a
kernel is also hard work.

Getting involved

When the Quality Database itself goes live, it will provide a Web interface
for filing and reading reports of failures and successes. Ideally, this will be
the first stop for puzzled users and administrators. The more data a report contains,
the more likely it is to pinpoint the source of a bug or point to a solution.

Recent Linux adopters may fear the imposing-until-you-do-it world of
kernel recompilation, but that’s the easiest way to start testing. If a new kernel boots
on your system, it’s already passed the most important test. (The people who
package the kernel for their distributions often produce useful patches trying
to make things work for as many people as possible.)

For people with more know-how and motivation, Crawford’s article on validating
the kernel links to several available tests. There’s no Holy Grail of
comprehensive kernel suites yet, but the need and the opportunity are there.
Programmers interested in kernel hacking could easily get their feet wet.

Clearly, this is an idea whose time has come. The most compelling
part of Crawford’s vision is that it’s within reach of average users. It
doesn’t require a degree in computer science or hours of free time, just a few
reports here, and a few tests there. The more people who invest in
improving software quality, the greater the payoff.
