Echoes and real voices


I recently finished the latest novel from Richard Powers, The Echo Maker. It was one of the finest books I’ve read in years. Powers is by leaps and bounds my favorite writer. His books are poems trapped in the novel form, the craft with words every bit as compelling as the stories they tell. I’m a big fan.

But there’s something different about this novel. It is still superbly crafted for sure, but the narrative engine revs louder. There was something about it. Something I couldn’t quite identify. As I was reading the book, a friend and fellow Powers fan alerted me to an article in the NYT, written by Powers, about how he composed the novel using only speech recognition software on a tablet computer. That was it, I thought. This must be the stylistic difference. A novel voice-crafted versus hand-crafted.

Except. The more I think about it, the more I can’t believe it. I work for IBM, a company deeply committed to speech recognition, text-to-speech, and machine translation. It is hairy, complex computing — bordering on AI. Personally I’ve been working in Arabic-English translation since 2000 and I know just how thorny the problems are in getting good recognition. I simply can’t believe an author as talented as Powers could create a book as linguistically complex as The Echo Maker using speech reco alone.

I’m not saying he’s lying. I’ve had some interaction with Powers, all positive. He kindly responds to e-mail, for one. And yet, there’s precedent for this tale of novel-by-dictation being fiction too. In 2002 at the Chicago Humanities Festival Powers delivered a talk called “Literary Devices” about an ELIZA-like machine that sucked him into an e-mail conversation that was as real as any human author’s output. I bought it. Most bought it. We bought wrong. The story itself was fiction — which only made it better. Humans falling for a story about a machine that tells stories indistinguishable from human stories. Amazing.

So, I guess I’m asking this. Mr. Powers, did you really dictate this whole novel? Or should we nestle comfortably in what is admittedly a damn good story even if you didn’t? All half-dozen readers of this blog are dying to know. And if we can’t tell your response from a computer impersonator we’ll obviously consider the dialogue valid. Do tell!

UPDATE: Powers replies. Wow. More on this in a bit …

See also the follow-up post Thamus (partially) vindicated.

3 Responses to “Echoes and real voices”

  1. donturn says :

    I’m reading The Echo Maker now too and as I’m reading it I’m thinking about the voice rec stuff, which is getting in the way of enjoying all the text. I keep wondering if Powers had to dictate with formatting information such as “he glanced around the room period and said to KS quote you mean italic me end italic end quote.” That would be tedious indeed.
    I could imagine doing an outline or rough draft with voice rec, but not editing and formatting.

  2. Richard Powers says :

    It’s not called “lying,” John! It’s called “creative non-fiction!”
    Many thanks for the good words about my new novel. They mean a great deal to me. *The Echo Maker* is the first novel that I’ve written entirely by dictation, so it’s great to hear that the change in process may have contributed to producing a change in the reading experience as well.
    The absolute truth of the situation is: I created the book entirely on a Motion LE1600, using the tablet’s built-in noise-canceling microphone array. There’s no keyboard on this tablet slate, and I haven’t had one connected to it since purchasing the machine. That said, I have to concede your point that speech recognition is still very much a bleeding-edge, exploratory experience. While many, many sentences come out perfectly intact, some of them still are zingers. But the combination of stylus and speech is far more fluid and powerful for me than a keyboard ever was. Two taps are usually all it takes to fix a misheard word. Occasionally, I highlight a phrase and speak again. When a word isn’t in the speech lexicon, I handwrite it in. Using handwriting, I can easily change individual letters faster than it would take me to navigate and correct with the arrow keys, backspace, and letter keys. On rare occasions when it’s needed, I can also drop into spelling mode and speak the spelling of a word out loud. As a last resort, for short acronyms, foreign words, etc., I can peck things in with the on-screen keyboard. (So technically, I have “typed” very small portions of the text, although certainly only a fraction of one percent of the total document.)
    Donturn—it never occurred to me that publishing my article might make certain readers subvocalize punctuation! Here’s hoping that that consciousness goes away and the story comes back. From the production end, it’s remarkable how quickly you normalize the mechanics involved in speaking out the few punctuations a text requires. It simply becomes second nature – part of the process of thought. Just remember how much more tedious and aberrant it is to have to type in your phrases and long thoughts, one letter at a time. As far as John’s reaction of incredulity: I think the act of typing out a full novel manuscript is much more deserving of disbelief!
    Yours in speech (and speakos),

  3. noreen says :

    The Motion LE1600 is great, but what software does Powers use? Dragon?