Evolving my music genome
So, iTunes Genius feature, it’s just you and me. Face-to-face. Gloves off. You think you know what I like? OK, you get one track to prove yourself.
No, no, that’s not fair. I’ll give you something really juicy to crunch on. How about you take a playlist that I described a while back as My Music Genome, the very seed (in my human algorithm-based estimation) of the majority of what I listen to now? Musical eugenics.
Oh, you don’t make playlists from other playlists? Only single tracks? Sucks. Fine, let me do this one-by-one. 12 tracks in the list; 25 recommendations per track. Let’s start being genius … Go!
Wait, what’s that? You can only identify 10 of 12 songs in my genome? You’re telling me that you have never heard of Orbital’s Impact or Vapourspace’s magnum opus? You have the Orbital track in your music store, for god’s sake!
OK, fine, go for it with the remaining 10. I’ll wait.
- Going Under – Devo
- The Robots – Kraftwerk
- This Wreckage – Gary Numan
- Squance – Plaid
- Halo – Depeche Mode
- Jericho – The Prodigy
- C/Pach – Autechre
- Stigmata – Ministry
- Aquarius – Boards of Canada
- Phantasm – Biosphere
Not recognized by Genius:
- Gravitational Arch of 10 – Vapourspace
- Impact (The Earth Is Burning) – Orbital
Cool, 10 new playlists. Let me open them right up. 250 tracks. Subtract the “source” tracks, that gives me 240 songs that you think spring from my base musical tastes. Interesting.
There are plenty of ways I could slice this data — Last.fm tags, AllMusic moods, BPM, waveforms — and I just might. But right now what jumps out at me are the duplicates. That is, the recommendations that come from two or more “source” songs from my genome. This might mean something.
The duplicates are important because they narrow the tree back down. They’re the inbred family members, points where multiple threads of interest converge. (In the image above, Hyped-Up Plus Tax by Dabrye, for instance, is a recommendation generated from both Plaid’s Squance and Kraftwerk’s The Robots.)
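For anyone who wants to hunt for these overlaps without a spreadsheet, here is a minimal Python sketch. It is not the method used for this post, just one way to do the same thing. It assumes each Genius playlist has already been loaded as a list of (title, artist) pairs keyed by its source track. The function name `find_overlaps` and the placeholder tracks are made up for illustration; only the Dabrye overlap comes from the data described here.

```python
from collections import defaultdict

def find_overlaps(playlists):
    """Return tracks recommended from two or more source tracks.

    `playlists` maps a source track to the list of tracks Genius
    generated from it. The result maps each overlapping track to the
    sorted list of source tracks that produced it.
    """
    sources_by_track = defaultdict(set)
    for source, recommendations in playlists.items():
        for track in recommendations:
            if track != source:  # the seed track is not a recommendation
                sources_by_track[track].add(source)
    return {
        track: sorted(sources)
        for track, sources in sources_by_track.items()
        if len(sources) >= 2
    }

# Toy data: the placeholder entries are invented, not real Genius output.
playlists = {
    ("Squance", "Plaid"): [
        ("Squance", "Plaid"),
        ("Hyped-Up Plus Tax", "Dabrye"),
        ("Placeholder A", "Unknown"),
    ],
    ("The Robots", "Kraftwerk"): [
        ("The Robots", "Kraftwerk"),
        ("Hyped-Up Plus Tax", "Dabrye"),
        ("Placeholder B", "Unknown"),
    ],
}

overlaps = find_overlaps(playlists)
for track, sources in overlaps.items():
    print(track, "<-", sources)
```

Using sets per track means a song that somehow appears twice in one playlist still counts as a single source, so only genuine cross-playlist overlaps survive the filter.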
The overlaps are few, but meaningful.
- Aftermath – Tricky
- Children Talking – AFX
- Chime – Orbital
- The Curse of Ka’zar – Lemon Jelly
- Dominator [Joey Beltram Mix] – Human Resource
- A Forest [Tree Mix] – The Cure
- Future Proof – Massive Attack
- Gone Forever – Ulrich Schnauss
- Hyped-Up Plus Tax – Dabrye
- Laughable Butane Bob – AFX
- Little Fluffy Clouds – The Orb
- Me? I Disconnect From You – Gary Numan + Tubeway Army
- Mindphaser – Front Line Assembly
- Monkey Gone to Heaven – Pixies
- Paris – MSTRKRFT
- Satellite Anthem Icarus (apocryphal) – Boards of Canada
- Stars – Ulrich Schnauss
- We Are Glass – Gary Numan
A few identifiable strains emerge from this new “evolved” playlist. (These characteristics don’t necessarily reflect the dominant style of the artists themselves, just the tracks, which is more precise anyway.)
- shoe-gazy, downtempo: The Cure, Ulrich Schnauss, Boards of Canada, Dabrye
- hard-edged: AFX (aka Aphex Twin), Human Resource, Front Line Assembly, Pixies, MSTRKRFT
- genre-benders: Tricky, Lemon Jelly, The Orb
(Not sure where Gary Numan fits in that typology, but he deserves to be in every list as far as I am concerned.)
Wow, John, that’s amazing, you’re thinking. You’ve managed to waste countless hours compiling data to tell yourself that you like soft music, hard music, and music that mixes the two. Such insight!
Actually it is interesting because the artists in this new playlist are some of my most-played. Lemon Jelly, Ulrich Schnauss, and Boards of Canada have been on heavy rotation for years. Clearly they are the fruit of stylistic seeds planted long ago. And now we have something approaching empirical proof. Truth is, most of what I listen to is either ambient or hard-edged or some outlying miscegenation. And there’s plenty of music that doesn’t fall into those categories.
The most interesting data point is Satellite Anthem Icarus by Boards of Canada, the song that iTunes Genius seems to think best defines my listening. It turns up in multiple playlists generated from the source playlist.
Satellite Anthem Icarus – Boards of Canada
But here’s the crazy thing. That particular track is a fake. It is not the actual Boards of Canada track by the same name. It was included in a partially-bogus torrent download just prior to the official album being released. But I did actually fall in love with it. It is one of my favorites among their tracks. Except that it isn’t theirs. (Full story of this odd situation here.)
So, according to iTunes, the song that most represents the evolution of my musical taste is one that it should by all rights not even know about.
Now this gets to the heart of the mystery surrounding the Genius functionality itself. What exactly is it doing? It recommended this fake song to me, which is neither named precisely what the real track is (in my library I have “(apocryphal)” in the title) nor the same length. And if by some crazy chance Apple is doing waveform analysis, it sounds nothing like the real version. So how could Genius recommend something that iTMS obviously doesn’t have in its library? Relatedly, why would Genius not recognize the Orbital track in my library when I renamed it precisely as it is named in iTMS?
UPDATE: Commenter Pedro helpfully notes that this “fake” is actually Up the Coast by Freescha. Which makes this whole experiment really interesting. I agree with Apple that this song is extremely emblematic of my distilled music tastes, yet as noted above none of the metadata I had would have told Apple that. Is it possible that Apple is actually doing music analysis in the manner of Amazon’s text analysis? I really can’t believe that, if for no other reason than that the initial Genius scan (when you run iTunes 8.0 for the first time) would take forever, which it did not. Still I want to believe. This is the way recommendations should happen.
I really don’t know how the recommendations are being generated, but I do think it is based on something more than store purchase data. Consider the jump from Ministry’s Stigmata to TMBG’s Ana Ng.
Stigmata – Ministry
Ana Ng – They Might Be Giants
There’s pretty much nothing similar between industrial music and irony-laden pop. But these two songs are definitely related when you consider their respective “hooks”: both use heavily-produced, effected, and clipped guitar noises as their main musical trope. Coincidence? Maybe, but why else would they be connected? Not music store data, methinks. Obviously Apple’s exact algorithm is a secret, but I’d love to know more.
Some procedural notes. It helped that I already had a short playlist of stuff I considered influential. (Though I find it a lamentable shortcoming that Genius can’t generate a playlist from a playlist. It would have to infer commonality first then generate a new list. How tasty would that be?)
I then just set Genius to create a new playlist per track. Various recombinations of the playlists yielded a clean list which I flipped into a spreadsheet using the very handy Export Selected Song List AppleScript.
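If you would rather skip the spreadsheet, an exported song list can be read straight into Python. The sketch below assumes the export is tab-separated text with a header row containing `Name` and `Artist` columns. That layout is an assumption for illustration, not a documented guarantee of the AppleScript's output, so adjust the column names to match whatever your export actually contains. `read_song_list` is a hypothetical helper name.

```python
import csv
import io

def read_song_list(text):
    """Parse tab-separated song rows into (name, artist) tuples.

    Assumes a header row with "Name" and "Artist" columns; this layout
    is an assumption, not a documented export format.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [(row["Name"].strip(), row["Artist"].strip()) for row in reader]

# Made-up sample standing in for one exported Genius playlist.
sample_export = "Name\tArtist\nSquance\tPlaid\nHyped-Up Plus Tax\tDabrye\n"

songs = read_song_list(sample_export)
print(songs)
```

One list like this per Genius playlist, keyed by its source track, is all the structure the rest of the analysis needs.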
From the spreadsheet data I experimented with and aborted a bunch of different visualization ideas. At one point I had a monstrously large 10-headed Venn diagram in Illustrator that hurt to look at.
Eventually I created the network diagram in the screenshot at the top of this post using the wonderful Many Eyes social visualization site. (Yes, Many Eyes is IBM. Disclose that!)
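A network diagram like this boils down to a list of edges, one row per (source track, recommended track) pair. Here is a rough Python sketch of producing that as CSV. The two-column layout and the column names `source` and `recommendation` are assumptions about what a typical network tool accepts, not the exact format uploaded to Many Eyes, and the sample data is a stand-in.

```python
import csv
import io

def to_edge_list(playlists):
    """Flatten {source: [recommendations]} into (source, recommendation) rows."""
    edges = []
    for source, recommendations in playlists.items():
        for track in recommendations:
            if track != source:  # no self-loops for the seed track
                edges.append((source, track))
    return edges

def edges_to_csv(edges):
    """Render the edges as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "recommendation"])  # hypothetical column names
    writer.writerows(edges)
    return buffer.getvalue()

# Stand-in data: only the Dabrye links are taken from the post.
playlists = {
    "Squance (Plaid)": ["Squance (Plaid)", "Hyped-Up Plus Tax (Dabrye)"],
    "The Robots (Kraftwerk)": ["The Robots (Kraftwerk)", "Hyped-Up Plus Tax (Dabrye)"],
}

edges = to_edge_list(playlists)
csv_text = edges_to_csv(edges)
print(csv_text)
```

A track that shows up as the target of more than one edge is exactly the kind of duplicate that the diagram makes visible as a shared node.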
A fuller, more interactive version of this visualization is available (Safari recommended, if you are on a Mac). Also the source data is there for the playing. I am sure there are other ways to massage it.