Transcript of IRC interview with Apache guru Ken Coar


Author: Robin 'Roblimo' Miller

On Thursday, January 29, DevChannel.org, in conjunction with irc.slashnet.org and its ‘sister’ OSDN sites NewsForge and Slashdot, hosted an IRC interview with IBM software engineer Ken Coar, a leading author, speaker and Apache visionary. Ken is a core developer of the Apache project, and a vice-president and director of The Apache Software Foundation. He’s also the author of Apache Server for Dummies and Apache Cookbook, and a contributing author to Apache Server Unleashed. The following article is a (lightly edited) transcript of that interview.

[Questions] dv asks: are there any plans to have apache run on BeOS?

[Ken Coar] Apache 2.0 already has an MPM for BeOS. As for 1.3.. if it doesn’t run there now, it’s not likely to be retrofitted.

[Questions] dv asks: can you tell us some linux-specific optimisations to make apache run faster?

[Ken Coar] Well, there’s a chapter on performance in the Apache Cookbook that Rich Bowen and I just finished writing. I can’t think of any Linux-specific ones off the top of my head. Things are *very* situational and depend on traffic patterns.

[Questions] Diablo-D3 asks: Ken, when did you start working on Apache, and what was it like?

[Ken Coar] I was first exposed to Apache (which was just the Web server back then) in 1996. I was using NCSA’s httpd at the time. After looking at the Web site, I scrounged the eddress of one of the developers and sent him private email asking whether it could do ‘x’.

He told me ‘no’. :-)

He also mentioned the development mailing list, but kinda discouraged me from getting on it because ‘there were too many lurkers already.’ I joined anyway, lurked for a couple of weeks, and then jumped in. I made enough of a nuisance of myself with suggestions and patches that Roy Fielding suggested I be given commit access four months later.

It was a much more closely-knit community back then, I think, in part because of the way the development process was handled.

[Questions] liamfoy asks: Thanks for taking the time to answer our questions. I am just wondering what you see next for the Apache project?

[Ken Coar] Which part of Apache? It’s more than just the Web server now.

If you’re asking specifically about the Web server, I’m not sure where it’ll go next. Not only am I not as active in development as I used to be (or would like to be again), but it’s very evolutionary — it depends on the fiercest of the itches bothering people.

Performance and scalability are always good bets.

[Questions] Morbus asks: with mail servers like sendmail and postfix increasingly having features to block spam and dictionary attacks, do you ever envision Apache having a proactive security stance to SQL or similar forms of GET/POST protection (à la mod_security)?

[Ken Coar] Certainly. However, what I envision doesn’t necessarily have any effect on what will actually happen. :-)

Nothing is free; security costs something, just like everything else. Usually it’s convenience, but it’s also frequently performance. How fool-proof do you want to make it? The MTAs can’t keep users of Outbreak Express from opening explosive attachments.

A lot of the sensitive issues about the Web have to do with dynamic pages, which is a different model than email. You have n different people writing the dynamic pages (note that I don’t say ‘designing’ them :-) and I think that n+3 of them aren’t software developers.

So.. yes, I would like to see (and hope I will) more security awareness and instrumentation and control in the Web server. But it’s a moving target and an endless battle.

[Questions] wendall911 asks: What do you think about caching utilities like Turck-mmcache, et al., and APIs such as memcached for use in optimizing mod_php and other software used in conjunction with Apache?

[Ken Coar] I can’t speak to the specifics you mention, but in general I think caching is a good thing.
It’s not as simple as that, though, because writing a *good* cache system is a very complex task, and it’s very easy for major bogosities to creep in — like caching the wrong stuff.

More and more of the content on the Web is dynamic, and most of it either isn’t intended to be cached, or can’t be cached safely, or the content developers don’t know how to include caching instructions as part of their pages.

When it’s done right, though, it can take a tremendous load off the World Wide Wait.

[Questions] Juanjo asks: I would like to have an idea on the number of people involved in the development of the Apache web server. How many of them just code, how many design, etc. Thanks!

[Ken Coar] It varies. When it’s relatively quiet, there are maybe a dozen people active on the development list. When it’s noisy, that number can quintuple.

[Ken Coar] Hang on, let me get a number..

There are currently about 700 eddresses subscribed to the httpd development mailing list (which is where all development happens). Many of those are not developers at all, but just interested parties. Some are developers for organisations that provide after-market support or add-ons for the server. Quite a few are past developers who are currently inactive for whatever reason. When it comes to working out a design, I’d guess there are maybe no more than a dozen people who really get into it. At most.

There are about 250 eddresses subscribed to the documentation project list, of which maybe 2^4 are active. At a guess. Many of those are also on the development list; people watching the development in order to update the docco appropriately.

[Questions] geoff asks: Are there any plans to branch the webserver into distinct sub-groups? For example ApacheLite, ApacheXP, ApacheAdvancedServer. It is a safe bet to say that most users of apache don’t use all the functionality.

[Ken Coar] Not as far as I know, but I’m ‘way behind on the development mailing list. I just got back from a long trip and found 34’000 messages in my inbox. :-(

However, one of the advantages of the Apache Web server’s modular design is that you can mix and match precisely what you want, so I don’t think there’s really any need for subprojects of the code. Working groups which focus on the best mix of modules for specific purposes, maybe.

[Questions] dv asks: will apache ever support .NET or is that totally against the apache philosophy?

[Ken Coar] (laughs) The Apache philosophy is not anti-Microsoft. It’s pro-openness.

(Again, I assume the question is referring specifically to the Web server.)

So if the Web server is to support .NET (about which I know NeXT to nothing), it would probably be done either by someone writing the appropriate modules for it, or a sideways engine like Apache Tomcat. The Perl slogan is, ‘there’s more than one way to do it.’ I suppose one possible Apache slogan would be, ‘do it if you want.’ :-)

[Questions] kraupu asks: You have just said that you don’t have much time to work on the Apache project. What is your main job?

[Ken Coar] I’ve been trying to figure that out for the last couple of years. :-D

Right now, I spend a lot of time dealing with email. And a lot of time dealing with ApacheCon. And a lot of time dealing with conference presentations.

However, my employer pays me primarily to work on solving difficult customer problems and providing consulting internally. They pay for the rest sort of as an adjunct. They get a little restive if it appears the customer aspect is getting short-changed. I don’t blame them.

I don’t necessarily *like* it, but I don’t blame them.

[Ken Coar] People ask me a lot, ‘why do you have so much email?’ I get over 2’000 messages a day. A lot of that is spam (googling for my eddress will come up with tens of thousands of hits, thanks to the mailing list archives, so I’m on every spam list ever compiled).

[Ken Coar] A lot of it is mailing list moderation; I moderate dozens of lists. So dealing with the spam load there is an issue.

Yes, I use spam tools, like Mozilla’s junk controls and SpamAssassin. But those come up with false positives every now and then, so even though they make it easier, there’s still some eyeballing necessary.

A lot of my mail is just for filing for my later reference or perusal.

Oh, and in case anyone is wondering who is busy employing me.. it’s IBM.

[Questions] quinlan asks: I believe the structure of an open source project has a large impact on whether or not the project succeeds. What are the biggest strengths and weaknesses of Apache’s structure and how has it changed over time?

[Ken Coar] H’m.

I think one of the biggest strengths is the opportunities provided for just about anyone to participate, and the mutual respect that exists between most of the players.

[Ken Coar] I think one of its weaknesses is a tendency to ‘go for the code’, potentially sacrificing quality.

For instance, before 1998 we used a model called ‘review-then-commit’ (RTC).

Changes were proposed on the mailing list, and they required positive feedback from at least three people who had tested the changes and found them good. While that promotes quality, it can also slow progress. The list of open issues, and the votes-so-far on each one, was maintained by an individual who mailed it out semi-regularly. (Usually weekly.)

Two changes have been made here. The first is the introduction of commit-then-review (CTR), which means, ‘unless you expect it to be controversial, go ahead and commit whatever you like, and anyone who disagrees can complain afterward.’ That speeded up development, but I think some quality was sacrificed.

The other change in handling was that the list of open issues stopped being maintained and sent by a person, and instead became a file in the source control system, automatically emailed. The negative aspect of that is it then became no one person’s responsibility to add submissions from new people to the status file, and hence a lot got dropped — essentially ignored.

But back to the question. The procedures we use to handle the development, and the way we interact with each other, are among our greatest strengths. The tendency to be cliquey and exclude outsiders is one of our greatest weaknesses.

[Ken Coar] All IMHO, of course.

[Questions] DaMouse404 asks: do you see any features in the IIS that you would like in Apache?

[Ken Coar] (chuckles) I can’t really say, since I don’t use IIS and didn’t realise that it actually had features.

[Questions] w32kleza asks: Are there any plans to continue sharing code between the NCSA Server and Apache?

[Ken Coar] I don’t think so. To the best of my knowledge, the NCSA httpd has been retired and moribund for years, and the Apache developers have already mined it for interesting ideas.

That’s mined as in ‘dug up’, not as in ‘blown up’.

The Apache httpd project doesn’t import any code from NCSA AFAIK, although we owe that project a tremendous debt and vociferously admit it. If there’s any life left over there, I don’t know if they import any Apache stuff or not.

[Questions] quasi asks: what are the main points in the new license?

[Ken Coar] Whoo. Well, let’s see. One of the big new points is that it has been reworded so that anyone can apply it to his code without having to change ‘apache’ to ‘myfoo’. Another big point is that it now includes text about patents and similar intellectual property issues, intended to protect both the developers and the users from submarined claims. And probably the third Big Thing in the 2.0 licence is that it includes a contributor licence. That is, if you submit anything to the ASF based on our own code, it’s implicitly covered as though you had signed one of our CLAs (Contributor Licence Agreements).

http://www.apache.org/licenses/LICENSE-2.0.html. There’s a .txt version also for easier inclusion.

We clearly acknowledge licence-by-reference now, as well, which means the full text doesn’t need to be included in every file; a pointer is sufficient. It always was, but we weren’t clear about it.

[moderator] We’re going to cut off questions now; just answer ones we already have, then call it a day.

[Questions] itchyArse asks: How do modules such as mod_dav get brought into the Apache umbrella? Is it by popularity? Or maybe due to a group of people thinking it is useful for a large # of people?

[Ken Coar] Typically by there being someone who wants to champion it, plus the code being good quality. We’re very careful to ensure that any code that comes in has at least one person (preferably more) on the development team who is capable of (and willing to take on) supporting and maintaining it. We don’t accept code dumps.

Some code won’t be accepted no matter what, either because of licensing issues (e.g., GPLed code) or other IP factors, or because the people agitating for it have seriously pissed off all the developers. Which almost assumes that the ‘we have a maintainer’ requirement isn’t being met. :-)

[Questions] dwa asks: How does it feel to beat a costly webserver like IIS, despite all the efforts of Microsoft (think of advertising and so on) to push IIS through?

[Ken Coar] In a word: great!

However, we don’t have the goal of beating anyone. Our goals are quality and scratching our itches. It just happens that those two seem to beat commercial goals like an arachnophobe beats a spider. He just can’t help himself. :-)

[moderator] Thank you all for participating. That’s it for today — we’re out of time. We now return to our regularly scheduled (whatever).

[Ken Coar] Don’t forget to buy Apache Cookbook! Take two, they’re small! ;-)

Slashnet would like to thank Ken Coar, Roblimo (moderator) and the Slashdot crew for this forum.
