Thursday, February 26, 2009

Open source vs open source

This piece arose out of a discussion on a mailing list I'm on. The discussion was about, among other things, why open source projects fail. By the time I finished a particular email, I realized I was basically blogging (not coincidentally, my output to that particular list dropped considerably when I started blogging here). Here's what I wrote, toned down a bit and edited lightly to take the later discussion into account:

It occurs to me there are two kinds of open source. First, there is stuff someone put together and threw out to the web. Much of this is of limited value. There might be a good idea in there, but chances are it's not really new, or it's not fully worked out. It might or might not be well coded. It's probably indifferently documented. The developer works on it whenever, so there's no release schedule. For my money, such projects don't fail -- they never really start.

Then there are projects like Apache, or Mozilla, or Eclipse or Red Hat, or the various Google offerings, that are developed and supported full time by some sort of durable entity. [Later in the discussion I unwisely called this a corporate entity, in a legalistic sense. This proved to have too many overtones, so I switched to "institution," which is still not ideal.] Generally there's commercial money in it one way or another, but the important quality is that there are numbered, scheduled releases and there are multiple people working on it as part of their day job.

But what about Linux, Python, Perl, the GNU tools and such? They may have started out in the first category [Re-reading, no. They weren't just thrown out on the web. They were generally well along before the world saw them -- but see the Postscript below for an interesting twist], but at some fairly early point the person behind them made a conscious decision to move to the second. Linus could have decided "Hey, that kernel thing was cool. I think I'll do something else," but instead he spearheaded the move from 0.9.x to 1.0.x (note -- release numbers) and has stuck with it up to 2.6.x. The Linux kernel arguably has one of the least formal and most distributed development processes of the major projects I'm aware of, but even then there is a single gateway for significant changes and if Linus should get hit by a bus, there are known people who could take over.

Python's Benevolent Dictator for Life, Guido van Rossum, works at Google, where he spends half his time on Python (evidently 50% for Guido is 20% for everyone else). BDFL Larry Wall's early development of Perl took place during his employment at JPL. While Perl is another of the less formal examples, there is still an elaborate structure around the development and release of Perl.

Rms took the more formal route, founding a foundation, the Free Software Foundation, and drafting the famous GNU licenses. Being rms, he secured his own money for it, some of it thanks to a grant from another foundation, the MacArthur Foundation. This is not a path lightly traveled, but the destination, again, is an institution dedicated to supporting the software.

In short, whenever something significant has happened in open source, it's because someone explicitly pushed for it. The actual coding, testing, documentation etc. might be distributed and more or less volunteer, but at the bottom (or top, if you prefer), there is a small, single point of control. This point of control tends to become institutional, that is, an entity distinct from any individual, fairly quickly.

There is a "religious" aspect to open source that many people, including myself, instinctively distrust. It relates to the notion that open source stuff "just happens" and that if you just throw stuff out on the web, or maybe even just hope it will happen, you'll magically get a robust, coherent and useful product. Experience shows, not surprisingly, that this just doesn't happen. Generally, a project has to start with a robust, coherent and useful product before a culture can emerge around it.

Postscript: My pondering on this topic keeps returning to Linux. On the one hand, it definitely supports the idea that good free software doesn't just emerge out of the web, but requires the ferocious dedication of a single person or small group. On the other hand, it has embodied a less romantic version of the idea of informal, distributed development from the beginning. Linus's Git source control system, which deliberately has no central repository, is a more recent manifestation. In this light, it's interesting, not to mention somewhat amusing, to read Linus's original Usenet post to the world:

Hello everybody out there using minix -

I'm doing a (free) operating system (just a hobby, won't be big and professional like gnu) for 386(486) AT clones. This has been brewing since april [i.e., a few months], and is starting to get ready. I'd like any feedback on things people like/dislike in minix, as my OS resembles it somewhat (same physical layout of the file-system (due to practical reasons) among other things).

I've currently ported bash(1.08) and gcc(1.40), and things seem to work. This implies that I'll get something practical within a few months, and I'd like to know what features most people would want. Any suggestions are welcome, but I won't promise I'll implement them :-)

Linus (torvalds@kruuna.helsinki.fi)

PS. Yes – it's free of any minix code, and it has a multi-threaded fs. It is NOT portable (uses 386 task switching etc), and it probably never will support anything other than AT-harddisks, as that's all I have :-(.

1 comment:

David Hull said...

Note to self: later I assert that releases tend to have a definite schedule or definite content, but seldom if ever both.