Garrett’s LinuxCon Talk Emphasizes Lessons Learned from Android/Kernel Saga

A LinuxCon session led by Red Hat’s Matthew Garrett discussed the lessons learned from Google’s ongoing attempts to get power-management code into the mainline Linux kernel, and showed that emotions are still running high in the debate.

Any doubt that feelings are still running high over the inclusion of Android code in the mainline Linux kernel was quickly dispelled at the 2010 LinuxCon in Boston today. During the Q&A session of his talk, Red Hat developer Matthew Garrett asked an audience member to leave the room as an argument began brewing between that audience member and another.

Garrett’s talk, “Android/Linux Kernel: Lessons Learned,” outlined the 18-month saga surrounding the attempted inclusion of Android power-management code in the Linux kernel, a saga that to date still has the code outside the kernel proper.

Garrett, whose field of expertise at Red Hat is power management, admitted that when he first saw the patch submitted by Android, even he didn’t know what the patch was trying to fix or what specific functions it was calling. New, undefined terms such as “wakelock” and “earlysuspend” were intermixed in the original January 2009 submission to the mainline kernel, making the patch very hard to understand.

The problems with Google’s submission weren’t only technical, Garrett explained. There were questions about the very motivation of the patch: with the terms undefined, kernel developers were unsure what problem was being addressed and whether that problem even applied to the Linux kernel as a whole.

Faced with these obstacles, the Android developers resubmitted the patch twice more in February 2009, with changes that answered some, but not all, of the initial questions about it.

The bulk of Garrett’s talk centered on the technical aspects of the patch itself, namely the aforementioned wakelock and earlysuspend functions.

Wakelock, in a nutshell, is a solution used by Android to avoid a possible race condition (which can derail the scheduling of internal software/hardware events in a device) due to conflicting events around power management. The problem stems, Garrett said, from the fact that the Android platform’s application development is so open, which has the advantage of attracting many more app developers, but (at times) the disadvantage of getting apps on devices that might be less than optimal.

From the power management perspective, the Android approach is to assume that apps are not to be trusted and controls are put in place to stop the processes of any app that runs out of control with CPU resources and thus overly drains the battery. But having the system ultimately have control of power management can lead to those race conditions: what happens, for instance, when a phone call comes in just as the system decides to suspend itself?

According to Garrett, at that point a wakelock can intervene and let the device take the phone call, and then, once the user hangs up, release the system so it can go back to sleep. Earlysuspend, he added, is a similar way to keep a system properly “awake” even if the screen is shut down for battery conservation.
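To make the idea concrete, here is a toy sketch in Python of the behavior described above. It is not the Android patch or any kernel API: the class `ToyPowerManager`, its methods, and the lock name `"incoming_call"` are all hypothetical, invented for illustration. The only assumption it models is that the system may suspend only when no wakelock is held.

```python
class ToyPowerManager:
    """Toy model of wakelock-style suspend blocking.

    All names here are hypothetical; this is not the Android or Linux API.
    """

    def __init__(self):
        self.held = set()       # names of wakelocks currently held
        self.suspended = False  # whether the toy device is asleep

    def acquire(self, name):
        # A driver or app takes a wakelock while it has work to finish.
        self.held.add(name)

    def release(self, name):
        # Releasing a lock that is not held is a no-op.
        self.held.discard(name)

    def try_suspend(self):
        """Suspend only if no wakelock is held; return True if it suspended."""
        if self.held:
            return False
        self.suspended = True
        return True


pm = ToyPowerManager()
pm.acquire("incoming_call")   # a call arrives just as the system wants to sleep
blocked = pm.try_suspend()    # suspend is refused while the lock is held
pm.release("incoming_call")   # the user hangs up
slept = pm.try_suspend()      # now the system is free to suspend
```

In this sketch the first suspend attempt is refused because a lock is held, and the second succeeds once the lock is released, which is the sequence Garrett described for the incoming phone call.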

The problem with this approach, from the Linux kernel developers’ point of view, was that including it in the kernel would require modifying any driver that generates wakeup function calls, and the wakelock patch would not benefit any platforms that weren’t wakelock-aware. Also, many kernel developers see this kind of power-management issue as something for userspace to take care of, and so were resistant to having the kernel solve what they thought was an application issue.

After more than a year of sometimes heated discussion on the kernel mailing lists, the Android and Linux kernel teams agreed to meet at the most recent Linux Foundation Collaboration Summit in April 2010, which in turn led to another patch being submitted in May.

Garrett outlined the lessons that should be learned by both sides of the argument.

For contributors, it’s important that patches are submitted with a clear explanation of what the patch is supposed to do or fix. Terms within the patch should be well-defined and documented. Patches that mix multiple functions should also be avoided: single patch, single function is preferred.

Garrett also noted that getting complex patches such as this one accepted is a lot easier when there’s a recognized name associated with the patch. If there are questions, major patches coming from relative unknowns can be misperceived as someone’s error rather than being taken seriously.

Taking these lessons to heart, Garrett said, is a constructive way to build upon even perceived misfires such as this.

As for Android’s patch, a minimal solution has been introduced to the mainline, but the Android team’s proposed changes are still pending.