Metadata and spring cleaning

It’s taken years, but I finally have backup where I want it. (Oh yes, dear readers! It’s another post about data backup. Recline your chair and prepare for a mind-blowing post.)

My reasons for backing up seem to be changing. Certainly there’s still the raw, precious data. Family photos and video, certain media projects — these things have to be saved for posterity. But increasingly my reasons for wanting a backup are more about state than data. I want to return to the state my machine was in more dearly than I want the data it once contained.

Think of it in more concrete terms. What if everything in your house, every single physical item, had a double in a storage facility? What if every time you bought anything you bought two and put one in a self-storage bin? Then your house burns down (and all your loved ones are safely vacationing on Maui). You can reconstruct your life from the storage facility, but it will be a massive pain in the ass. The state is all fooked. The effort involved in getting it back to a livable order is overwhelming, basically the same thing as moving, an act which ranks just slightly below death of a spouse in terms of personal stress.

Ideally you’d want a legion of robotic moving specialists to reconstruct your house according to the old plans and place everything back as it once was. A bonus would be the option to redirect the robots as the spirit moved you, but at the very least you’d have an automatic replica. This is the source of my fascination with bootable clones.

The fact is, most of my data is replicable. My iPod and laptop between them contain enough of my music library to reconstruct it if the main machine should fail. My calendar is online. Personal mail’s all IMAPped up to the Great Google in the Sky. Work e-mail, replicated from servers. Photos are all on Flickr; video at any number of services. The set of truly precious, non-online-dwelling data is getting smaller and smaller by the day. Basically source files only.

My prediction is that in the near future, state is all we will care about. You won’t even think about data being local or remote. But you will care about speed-to-recovery. And that’s all about the little things: how your machine behaves, how your kitchen was organized before the fire.

There are corollary effects of this attitude. Recently, to alleviate some of the space pressure of five people in a home, we decided to clean up some of the impromptu areas of storage in the house that had persisted since we moved in. You know what I’m talking about. Boxes that never got completely unpacked. Stacks of crap that made do in a guest bedroom only because you didn’t know where else to put them.

I undertook the foolish exercise of building an attic in my garage. I can hardly hammer a nail straight, much less build a structurally sound platform. Most of what we moved up there was non-essential: books, college notebooks (wanted to throw them away but couldn’t; I’m going to need that Intro to Lit Crit some day, damnit!), winemaking equipment, random crap.


So I got it all up there. Stored. Except it wasn’t really stored, wasn’t really backed up, unless I knew what was there and where. And this again is the influence of Google. Unless you can search for it, unless you know precisely how to get it back, you might as well throw it out, delete it. So I took photos of everything up there, where it lay in the attic. And for the books, well, it got a bit geekier as I finally finished cataloging and noting the location of every last volume with the superb Delicious Library.

Do I care about most of that crap up there? No. Do I care that I know the state of that crap? Absolutely. And that’s the thing. If the garage burned down I would be OK. The stuff is replaceable. The index to that data is not.
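For the geekier among you, here is roughly what that index amounts to, as a minimal Python sketch. Everything in it is made up for illustration: the `Item` structure, the `build_index` and `find` helpers, and the sample entries are my own assumptions, and none of it reflects how Delicious Library actually stores anything.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """One thing in the attic (hypothetical structure, for illustration)."""
    name: str
    location: str  # where it sits, e.g. "attic, box 2"
    tags: frozenset = field(default_factory=frozenset)


def build_index(items):
    """Map each lowercased tag to the list of items carrying it."""
    index = defaultdict(list)
    for item in items:
        for tag in item.tags:
            index[tag.lower()].append(item)
    return index


def find(index, tag):
    """Return the locations of everything filed under a tag."""
    return [item.location for item in index.get(tag.lower(), [])]


# Made-up sample entries, not a real inventory.
ATTIC = [
    Item("Intro to Lit Crit notebook", "attic, box 2",
         frozenset({"college", "notebooks"})),
    Item("Carboy", "attic, back corner", frozenset({"winemaking"})),
    Item("Corker", "attic, back corner", frozenset({"winemaking"})),
]

INDEX = build_index(ATTIC)
print(find(INDEX, "winemaking"))
```

The items themselves are the replaceable part. The mapping from tag to location is the bit worth backing up.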

Maybe I’m overthinking this because my father and brother are in the self-storage business. But I think not. Spring cleaning for me is really spring tagging.