Saturday, December 22, 2007

Eyeballs and shallow bugs

Eric S. Raymond has asserted that "given enough eyeballs, all bugs are shallow", a principle he calls Linus's law after Linus Torvalds (my fingers want to type "Linux Torvalds").

How many eyeballs are enough? How many eyeballs are available? What does it take to get a "shallow" bug fixed, checked in and tested? What's a bug, anyway? A few meditations:

How many eyeballs are enough?
Suppose an excellent kernel hacker has a 90% chance of nailing a given bug. What are the chances that two excellent kernel hackers can nail the bug? Well, it's not 180%. It will range from 90% (if the second doesn't know anything new about the particular problem) to 100% (if the second knows everything the first one doesn't).

If the two are completely independent sources of information there's a 99% chance one or the other will nail it. But what are the odds two people got to be excellent kernel hackers by completely independent routes? Now, what are the odds that a bunch of excellent kernel hackers, quite a few more reasonable kernel hackers and a horde of non-specialists could nail the bug? Pretty good, I'd say, but not 100%.
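
For anyone who wants to play with the arithmetic, here's a minimal sketch. It assumes, unrealistically for the reasons just given, that everyone has the same chance of nailing the bug and that those chances are completely independent. The function names are made up for illustration.

```python
def chance_someone_nails_it(p: float, n: int) -> float:
    """Chance that at least one of n independent people, each with
    individual chance p, nails the bug (assumes full independence)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Everyone misses with probability (1 - p) ** n; take the complement.
    return 1.0 - (1.0 - p) ** n


def two_person_range(p: float) -> tuple[float, float]:
    """Bounds on the combined chance for two people who each have chance p.

    Lower bound: the second knows nothing the first doesn't.
    Upper bound: the second knows exactly what the first doesn't,
    capped at certainty.
    """
    return p, min(1.0, 2.0 * p)


if __name__ == "__main__":
    low, high = two_person_range(0.9)
    print(f"two hackers, range: {low:.0%} to {high:.0%}")
    print(f"two hackers, independent: {chance_someone_nails_it(0.9, 2):.2%}")
    print(f"five hackers, independent: {chance_someone_nails_it(0.9, 5):.5%}")
```

With a 90% individual chance, the independent case for two people comes out to about 99%, and it creeps toward 100% without ever reaching it as more people are added. Real eyeballs overlap, so the true figure sits somewhere between that and the lower bound.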

How many are there?
I've been doing software for a while now. I've built a few Linux kernels (roughly the equivalent of changing a tire on a car), and I've looked at small portions of the source (roughly the equivalent of opening the hood, pointing and grunting). If some subtle race condition should creep into the next version of the kernel, the odds that I could contribute something useful to the conversation are approximately zero (the automotive equivalent might be, say, being able to help fix a problem in a Formula One engine design).

The most qualified people in such a situation are the small and dedicated core of kernel maintainers and whoever put in the changes that turned up the problem. These may well be the same people. I know a fair bit about software in general, and even a bit about race conditions in general, but I know essentially nothing about the details of the kernel, the design decisions behind particular parts, the hidden pitfalls and so forth.

This being open source, much of that information is available, directly or indirectly. The limiting factor is the ability to absorb all of the above. This takes not only skill but time and dedication. The natural consequence is that, for at least some bugs, there just aren't enough eyeballs available to make them "shallow". Instead, someone will have to expend considerable brain sweat figuring out what happened.

[Another not-infrequent case: Lots of people see a bug, but no one can quite nail down what's causing it, much less suggest a fix. Filing good bug reports takes practice, just like writing good code does. Eliminating all the variables in a typical desktop environment takes time, even for someone with lots of practice. As a result, the people who could fix the bug don't have enough information to go on and probably have bigger fish to fry.]

What does it take to get a shallow bug fixed and tested?
Suppose that the broken code in question (kernel or otherwise) has passed by enough eyeballs that someone has said "Hey, that's easy, you just need to ..." That person puts in a fix. Are we done? No. At a minimum, someone needs to test the fix, preferably someone other than the fixer. Someone should also look over the code change and make sure it fits in well with the existing code. And so forth. Open source doesn't remove the need for good software hygiene. If anything, it increases it.

What's a bug, anyway?
Suppose not just one person steps up with a fix to some bug. Suppose two or three people do. Unfortunately, they don't exactly agree on the fix. Maybe one wants to patch around the problem, one has a small re-write that may also fix some other problems, and another thinks the application wouldn't have such bugs if it were structured differently. Someone else might even argue that nothing needs to be fixed at all.

Expediency will tend to favor the patch, and expediency is often right. The small re-write has a chance if the proponent can convince enough people that it's a good thing. The re-structured system will probably need to be a whole new project, potentially splitting up the pool of qualified eyeballs.


So does this mean that open source is a crock? Not at all. Most of the problems I've pointed out here aren't open source things. They're software things. Open source offers a number of potential advantages in dealing with them. One I think may be overlooked is that writing a system so that anyone anywhere can check it out and build it, and so that several people can work on it simultaneously and largely independently, enforces a certain discipline that's useful anyway. If your code's a mess or no one else can build it and run it, you're not going to get as many collaborators.

On the other hand, open source isn't a magical solution to all of life's problems, and there are arguably cases where you just need someone to say "Today we work on X," or "We will not do Y." Strictly speaking, that kind of control is a separate question from whether the source is freely available, but Linus's law assumes that eyeballs are not being commanded to look elsewhere.

So is Linus's law a crock? Not at all. It captures a useful principle. But like most snappy aphorisms, it only captures an ideal in a world that's considerably messier and more intricate.

[A few years later, a couple of striking examples turned up, in the form of Heartbleed and Shellshock -- D.H. Dec 2018]
