Mar 18, 2010

Gabriel vs Brooks

In a recent essay, Richard Gabriel takes issue with Frederick Brooks’ view on conceptual integrity.

Brooks says:

“The central problem… is to get conceptual integrity in the design itself… If we look back then at the 19th century and the things that happened – the cartwright and the textile machinery, Stephenson (the train), Brunel's bridges and railway, Edison, Ford, the Wright brothers, etc – these were very largely the designs of single designers or, in the case of the Wright brothers, pairs.”

Gabriel claims that Brooks gives too much credit to the heroic notion of a single mind creating beautiful designs with conceptual integrity, while underestimating the role played by a cast of “sometimes hidden co-authors and the thing designed itself.”

In a humorous and affectionate ad hominem twist to his arguments, Gabriel says of Brooks, “anything sounds 10 times more true when spoken by someone with a southern accent. As Brooks spoke – mostly off the cuff – and paced slowly away from the podium and back, I fell under his spell.” This was ironic, because Gabriel’s extensive use of evocative classical themes (from poetic criticism and the life of a famous Italian Renaissance architect to Greek mythology) almost forced me to nod along in agreement throughout the essay. At the end of it, however, I was left with the same feeling I have when I’ve just seen a magic trick. I know there’s a hole in there somewhere, but I can’t for the life of me find it. It was too smooth. It was “too true to be good”, which, ironically, is another phrase Gabriel uses to describe Brooks’ arguments.

Gabriel attacks Brooks on two fronts: collective authorship, and intrinsic design constraints.

Authorship, Gabriel explains, is never a solitary activity.

“I’ve never published anything in a serious venue without some friend or colleague—and usually several—having read and commented on it. I take the comments seriously, and there are several friends whose recommendations I adopt essentially without question. The typical nonfiction book is full of acknowledgements that indicate a co-author-type relationship, and that’s true for my writings. And even when an artist seems to be creating something completely new—artists like daVinci, van Gogh, or Beckett— it usually turns out there are hidden, unconscious influences even they can’t recognize.”

The idea of authorship or invention being a solitary credit has been widely challenged. One of the chapters in Scott Berkun’s “The Myths of Innovation” is “The Myth of the Lone Inventor”. Says Berkun:

“Despite the myths, innovations rarely involve someone working alone, and never in history has an invention been made without reusing ideas from the past. For all our chronocentric glee, our newest ideas have historic roots…”

I could not agree more, and the argument carries over to code too. The code I write is highly influenced by code I’ve read, and the code around it. With code review, the collectiveness of code authorship becomes even more apparent. It does take a village.

The more subtle point has to do with design constraints imposed by the nature of the very thing being designed. Considering two poems about the story of Eurydice from Greek mythology, Gabriel finds a common underlying structure.

“In both Eurydice poems— written at different times by very different poets—we are witnessing a known structure with different decorations, with different points of view, with different “lessons,” with different aesthetics, but with an underlying—shall we call it—conceptual integrity…
…the Eurydice story forms a frame that directs where and how the poet can extend and use it. Builtin understandings and explanations, and already defined moods and images come to mind when the story is retold, and these form surfaces against which the meanings, images, moods, thoughts, and emotions of the new material bounce and reflect. The underlying story imposes a strong sense of what would be in keeping with it. And approaching the Eurydice story from any direction, the story itself does its own refracting.”

Indeed, we see this in almost every significant system we use. Using data-parallelism on a large scale forces one to come up with a model similar to MapReduce. Storing huge amounts of data in a distributed system compels a designer to sacrifice ACID guarantees and end up with something like Bigtable or Dynamo. There are intrinsic constraints on how any large system can be built (for example, see the CAP theorem).
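To make the MapReduce point concrete, here is a minimal single-process sketch in Python. The function names (`map_chunk`, `merge_counts`, `word_count`) are made up for illustration and the whole thing runs on one machine, so it is only an outline of the shape, not how any real framework is implemented. The point is that once each chunk has to be processed independently, the design is pushed towards a per-chunk "map" step and an associative "reduce" step that merges partial results.

```python
from collections import Counter
from functools import reduce


def map_chunk(lines):
    """Map phase: count words in one chunk, independently of every other chunk."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce phase: merge two partial results. Merging is associative,
    so partial results can be combined in any grouping."""
    return left + right


def word_count(lines, chunk_size=2):
    # Split the input into chunks; in a real system each chunk would go
    # to a different worker. Here they are processed one after another.
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    partials = map(map_chunk, chunks)
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    text = ["the story of Eurydice", "the story retold", "a story"]
    print(word_count(text))
```

Changing `chunk_size` changes how the work is divided but not the answer, which is exactly the property that lets the work be spread over many machines.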

It looks like I agree with all of Gabriel’s arguments refuting Brooks’ single-mind theory of conceptual integrity. Yet, I still cannot swallow his final conclusion in its entirety.

I’ve seen too much code whose conceptual integrity has been destroyed by multiple authors and changes over time. The author of a change often only cares about the narrow aspect of the system he is modifying without grasping its larger architecture. This “true architecture” is often not even documented, but lives only in the heads of those who wrote significant parts of the entire system. Gradually the original integrity corrodes. Entropy takes its toll. Business logic creeps into the presentation layer. A single concept ends up forking code all over the place.

This is when the poem analogy starts breaking down. A poem, even one where multiple authors collaborate and revise and edit, is declared “done” when it’s done. It is not subjected to an endless stream of revisions. Code is never “done”. There might exist discrete points in time when it’s frozen, but as long as it is in use, it will be fixed and patched and moulded towards slightly different uses. A poem’s creation may be a dynamic process, but the end result, the poem itself, is static. The process of creating code shares a lot with the process of writing poetry and literature, but the end product is very different.

Here you do need a single person, or a small committee, for oversight. Most successful open source projects are run this way. Linus delegates the vast majority of real development on the Linux kernel, but still acts as gatekeeper for what makes it in and what doesn’t. Guido acts in a similar capacity for Python. Matz for Ruby. Hamano for Git. The list goes on and on. Why do we need this one person? To maintain the integrity of the project. To make sure it is true to its goals. To ensure a certain bar for quality.

Gabriel’s argument suffers from two major flaws. He gets carried away by the analogy with poems. And he mashes together two distinct concepts – authorship and integrity.

Here is what the dictionary says:

authorship (ô'thər-shĭp’) n.
1. The act, fact, or occupation of writing.
2. Source or origin, as of a book or idea.

integrity (ĭn-tĕg'rĭ-tē) n.
1. Steadfast adherence to a strict moral or ethical code.
2. The state of being unimpaired; soundness.
3. The quality or condition of being whole or undivided; completeness.

Authorship is the process by which something is produced. Integrity is a quality of the thing produced. Gabriel speaks of the two in the same breath, and while the two do have deep interactions, in the end they stand separately. He makes arguments that apply to authorship, but they do not carry over to ensuring the integrity of the piece thus authored.

Yes, the nominal “author” of a piece is merely the one not hidden by the mists of time: an oversimplified reduction that eases discussion and hides the complex and tangled history of the ideas within. But for the piece to have integrity, that same author has to break from the stream that has brought him this far, and make a stand all his own.