The Linux kernel is the world’s largest collaborative development project. Almost 3,000 individual contributors work together to create and maintain an operating system kernel that works on everything from wristwatches and mobile phones to mainframes, along with all the peripherals imaginable for each platform. Linux creator Linus Torvalds sits at the top of a loose hierarchy of kernel maintainers and acts as final arbiter for what does or does not get included.
So how does one go about contributing a substantially new technology to the kernel?
Sage Weil was working on a distributed file system for Linux as part of his PhD research at the University of California, Santa Cruz. This was before the advent of the buzzword “big data”, and therefore before things like Hadoop or Amazon’s S3. His research into distributed fault tolerance led him to the conclusion that the best way to manage a clustered file system was at the kernel layer, rather than higher up in userspace. He called his file system “Ceph” — a shortened version of “cephalopod” — as a nod to the “highly parallel behavior of an octopus.”
Weil was no stranger to open source or the Linux community. In 1996 he was one of the founders of the web hosting company DreamHost. As his research progressed, he knew he’d need to get his kernel components integrated upstream if they were to have any real chance of practical application: no one was likely to compile a custom kernel just for a clustered file system.
So Weil did what any good hacker would do: he joined a couple of kernel-related mailing lists and started watching how things worked.