Wednesday, September 5, 2007

What killed parallel computing?

When I was an undergrad, parallel computing was the Next Big Thing. By "parallel computing" I mean a large number of CPUs that either share memory or have relatively little local memory but pass (generally small) messages over a very fast local message bus. This is as opposed to distributed computing, where CPUs have lots of local memory and communicate in larger chunks over relatively slow networks.
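The distinction is easier to see in code. Here's a minimal sketch using Python's standard library (the function names and sizes are made up for illustration): threads share one address space and poke at the same list, while separate processes keep their own memory and trade small messages over a channel.

    import threading
    import multiprocessing as mp

    # Shared memory: threads in one process all see the same list.
    counts = [0] * 8

    def bump(lo, hi):
        for i in range(lo, hi):
            counts[i] += 1            # every thread touches the same `counts`

    # Message passing: separate processes with their own memory,
    # exchanging small messages over a fast local channel (here, a pipe).
    def worker(conn):
        chunk = conn.recv()           # small message in ...
        conn.send(sum(chunk))         # ... small message out

    if __name__ == "__main__":
        threads = [threading.Thread(target=bump, args=(i * 4, (i + 1) * 4))
                   for i in range(2)]
        for t in threads: t.start()
        for t in threads: t.join()
        print(counts)                 # [1, 1, 1, 1, 1, 1, 1, 1]

        parent, child = mp.Pipe()
        p = mp.Process(target=worker, args=(child,))
        p.start()
        parent.send([1, 2, 3, 4])
        print(parent.recv())          # 10
        p.join()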

So what happened? Multiple choice:
  • What do you mean "what killed it?" Supercomputers today are all massively parallel. Next time you do a Google search, thank a cluster.
  • Distributed computing killed it. If you want to really crunch a lot of data, get a bunch of people on the net to do it with their spare cycles, à la SETI@home and GIMPS.
  • Moore's law killed it. Most people don't need more than one or two processors because they're so darn fast. Sure, you can use parallel techniques if you really need them, but most people don't.
Personally, I'd go with "all of the above" (but then, I wrote the quiz).

Another worthwhile question is "What's the difference between parallel and distributed anyway?" The definitions I gave above are more than a bit weaselly. What's "relatively small"? What's the difference between a few dozen computers running GIMPS on the web and a few-dozen-node Beowulf? At any given time, the Beowulf ought to be faster, due to higher bandwidth and lower latency, but today's virtual Beowulf ought to be as fast as a real one from N years ago.
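A back-of-envelope sketch of why: the gap is mostly latency. The numbers below are round-figure assumptions, not measurements, but they show why an algorithm that chats in small messages falls apart once each round trip costs tens of milliseconds instead of a fraction of one.

    # Time to exchange many small messages: cluster interconnect vs. internet.
    # All figures are round-number assumptions, not measurements.
    MESSAGES  = 10000            # small messages exchanged by the algorithm
    MSG_BYTES = 1000             # ~1 KB per message

    beowulf  = {"rtt_s": 0.0001, "bytes_per_s": 100e6}  # ~0.1 ms RTT, fast LAN
    internet = {"rtt_s": 0.05,   "bytes_per_s": 1e6}    # ~50 ms RTT, home links

    def exchange_time(link):
        per_msg = link["rtt_s"] + MSG_BYTES / link["bytes_per_s"]
        return MESSAGES * per_msg

    print("Beowulf:  %6.1f s" % exchange_time(beowulf))   # ~1.1 s
    print("Internet: %6.1f s" % exchange_time(internet))  # ~510 s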

A distinction I didn't mention above is that classic parallel algorithms have all the nodes running basically the same code, while nodes in distributed systems specialize (client-server being the most popular case). From that point of view, the architecture of the code is more important than the hardware running it. And that's probably about right.
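That "same code everywhere" style is what MPI programs look like. Here's a minimal sketch, assuming an MPI installation and the mpi4py package (run with something like mpirun -n 4 python spmd_sum.py, a hypothetical file name): every node executes the same program and only branches on its rank, whereas a client-server system would ship different programs to different machines.

    # SPMD sketch: every node runs this same file; behavior differs only by rank.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Each node computes a partial result on its own slice of the problem.
    partial = sum(range(rank * 1000, (rank + 1) * 1000))

    # Small messages on the interconnect combine the partial results.
    total = comm.reduce(partial, op=MPI.SUM, root=0)

    if rank == 0:
        print("sum over %d nodes: %d" % (size, total))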
