Wednesday, February 10, 2010

Pandora's division of labor

A while ago Roku added Pandora to its selection of channels and a shorter while ago I got around to trying it out. I like it, though I don't listen to it all day long (I generally don't listen to anything all day long).

Pandora's main feature is its ability to find music "like" a particular song or artist you select. This is nice not only because it will turn up the familiar music you had in mind, but also because it will most likely turn up unfamiliar music that you'll like. As I understand it, that's a major part of its business model. Record labels use Pandora to expose music that people otherwise wouldn't have heard, and Pandora takes a cut.

To that end, it will only allow you to skip so many songs in a given time (though there is at least one way to sneak around this). They pick out likely songs for you and they would like you to listen. You can, however, tell Pandora that you like or dislike a particular selection. Pandora will adapt its choices accordingly.

So how does it work? Pandora is based on the Music Genome Project, which is a nicely balanced blend of
  • Human beings listening to music and characterizing each piece on a few hundred scales of 1 to 10 (more precisely, 1 to 5 in increments of 0.5).
  • Computers blithely crunching through these numbers to find pieces close to what you like but not close to things you don't like.
This approach is very much in the spirit of "dumb is smarter". Rather than try to write a computer program that will analyze music and use some finely tuned algorithm to decide what sounds like what, have the software use one of the simplest approaches that could possibly work and leave it to humans to figure out what things sound like.
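To make the "dumb" half concrete, here is a minimal sketch of what such number crunching could look like. It is not Pandora's actual algorithm, which I haven't seen; the song names, attribute names, ratings and the scoring rule (distance to the nearest liked song minus distance to the nearest disliked one) are all made up for illustration.

```python
import math

# Hypothetical human-assigned ratings on a 1-to-5 scale.
# Songs, attributes and numbers are invented for this sketch.
CATALOG = {
    "song_a": {"acoustic_guitar": 4.5, "aggressive_drumming": 1.0, "sitar": 1.0},
    "song_b": {"acoustic_guitar": 4.0, "aggressive_drumming": 1.5, "sitar": 1.0},
    "song_c": {"acoustic_guitar": 1.0, "aggressive_drumming": 5.0, "sitar": 1.0},
    "song_d": {"acoustic_guitar": 3.5, "aggressive_drumming": 1.0, "sitar": 4.5},
}


def distance(a, b):
    """Plain Euclidean distance between two attribute dictionaries."""
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))


def recommend(liked, disliked, catalog=CATALOG):
    """Rank unrated songs: close to something liked, far from anything disliked."""
    def score(name):
        ratings = catalog[name]
        near_liked = min(distance(ratings, catalog[s]) for s in liked)
        # With no dislikes recorded, only closeness to liked songs matters.
        near_disliked = min(
            (distance(ratings, catalog[s]) for s in disliked), default=0.0
        )
        return near_liked - near_disliked  # lower is better

    candidates = [n for n in catalog if n not in liked and n not in disliked]
    return sorted(candidates, key=score)


print(recommend(liked=["song_a"], disliked=["song_c"]))
```

Note that nothing in the code knows anything about music. All of the musical knowledge lives in the numbers the human listeners supplied.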

Even the human angle has been set up to favor perception over judgement. The human judge is not asked to decide whether a given song is electroclash or minimalist techno, but rather to rate to what degree it features attributes like "acoustic guitar pickin'", "aggressive drumming", a "driving shuffle beat", "dub influences", "use of dissonant harmonies", "use of sitar" and so forth. There are refinements, of course, such as using different lists of attributes within broad categories such as rock and pop, jazz or classical, but the attributes themselves are designed to be as objective as possible.

This combination of human input and a very un-human data crunching algorithm is a powerful pattern. Search engines are one example, Music Genome is another, and if there are two there are surely more. In fact, here's another: the "People who bought this also bought ... " feature on retail sites.
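The "also bought" case can be sketched just as simply. This is only an assumption about how such a feature might work in its most naive form, not a description of any real retailer's system; the order data and the `also_bought` function are hypothetical.

```python
from collections import Counter

# Hypothetical purchase history: each set is one customer's order.
ORDERS = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"tent", "sleeping bag", "stove"},
    {"sleeping bag", "pillow"},
]


def also_bought(item, orders=ORDERS, top=2):
    """Count what else appears in orders containing `item`; return the most common."""
    counts = Counter()
    for order in orders:
        if item in order:
            counts.update(order - {item})
    # Sort by descending count, breaking ties alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:top]]


print(also_bought("tent"))
```

Here the human input is the buying itself, and the program does nothing cleverer than counting.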
