Vivek Haldar

Automating software engineering

Nicholas Carr was kind enough to respond to my post about automation as it applies to software engineers:

Software writers themselves don’t seem immune from the new de-skilling wave. His comments point to the essential tension that has always characterized technological de-skilling: the very real benefits of labor-saving technology come at the cost of a loss of human talent. The hard challenge is knowing where to draw the line—or just realizing that there is a line to be drawn.

The comment thread on his post has some great discussion. You should go read it. In it, Carr raises two more questions, which I will try to address here.

Climbing to higher levels of abstraction

re: “each advance opens up more new opportunities than it removes” This is a common defense of automation in pretty much all fields. (See the quote from Alfred North Whitehead in my Atlantic piece.) And it’s a good defense, as it’s very often true. But there are also a couple of counterarguments:

At some point, as the capabilities of automation software advance, the software aid begins to take over essential tasks – sensing, analysis, diagnosis, judgment making – and the human shifts to more routine functions such as input and monitoring. In other words, with the automation of skilled work there’s a point at which there are no “higher-level tasks” for the human to climb to.

In the world of software there is always a higher level to climb to. But an individual software engineer has to have both the foresight and the will to do it.

For example, not so long ago, we were building regular CRUD web applications by hand every time. Soon common patterns emerged and were codified in web frameworks like Ruby on Rails and Django. The level of abstraction for coding web apps was lifted. The interval of interest is the period after web apps became common but before a large part of the boilerplate was extracted into frameworks. If you were a web app developer during that period, the repetitive nature of coding web apps should have made you feel that a large part of your job was ripe for automation.
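To make the idea of extracting boilerplate concrete, here is a toy sketch in Python. It is not how Rails or Django work internally; the `Repository`, `Post` and `Comment` names are hypothetical, invented only for illustration. The point is that the create/read/update/delete logic is written once and reused for every kind of record, instead of being retyped by hand for each one.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Generic, Optional, TypeVar

T = TypeVar("T")


class Repository(Generic[T]):
    """Hypothetical in-memory store: the four CRUD operations, written once."""

    def __init__(self) -> None:
        self._rows: Dict[int, T] = {}
        self._ids = count(1)  # ids start at 1 and increase by 1

    def create(self, item: T) -> int:
        new_id = next(self._ids)
        self._rows[new_id] = item
        return new_id

    def read(self, item_id: int) -> Optional[T]:
        return self._rows.get(item_id)

    def update(self, item_id: int, item: T) -> bool:
        if item_id not in self._rows:
            return False
        self._rows[item_id] = item
        return True

    def delete(self, item_id: int) -> bool:
        return self._rows.pop(item_id, None) is not None


@dataclass
class Post:
    title: str


@dataclass
class Comment:
    body: str


# Two record types, zero duplicated CRUD code.
posts: Repository[Post] = Repository()
comments: Repository[Comment] = Repository()

post_id = posts.create(Post(title="Hello"))
comment_id = comments.create(Comment(body="First"))
```

Before the extraction, each of `Post` and `Comment` would have carried its own near-identical copy of those four methods. That duplication is the "going through the motions" signal.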

As in every arena where automation takes over, in software engineering too the pattern is familiar: as an activity becomes widespread, the repetitive and “non-thinking” parts of it are recognized and extracted. In the case of industry, they are extracted into a machine or robot. In the case of software, they are extracted into a layer of abstraction or framework. There is an uncanny valley just before that happens when workers get the feeling that they’re just going through the motions.

The alphas are those who extract common repetitiveness into automation. The betas are those who raise their skills to add value on top of the newly automated tasks. Everyone else gets left behind.

Cutting across layers of abstraction

Lower-level tasks may be seen as mere drudge work by the experienced expert, but they can actually be essential to the development of rich expertise by a person learning the trade. Automation can, in other words, benefit the master, but harm the apprentice (as R. Carey suggests in the preceding comment). And in some cases even the master begins to experience skill loss by not practicing the “lower level” tasks. (This has been seen among veteran pilots depending on autopilot systems, for example.)

I think this is a real danger. Teaching new students high-level languages is great to get them started, but for true systems-building one needs to understand the whole stack.

The thing about computer science is that all abstractions are leaky. We can’t use our abstractions without understanding their underlying implementation and limits. This happens at every layer. An integer shows its implementation as a 32-bit word on the microprocessor when adding to MAXINT wraps around. A garbage collector in a VM frees us from manual memory management, but shows up when the app experiences unpredictable pauses. Even the CPU, the bedrock abstraction, leaks its implementation details when things like speculative execution and branch prediction affect performance. The examples are endless.
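The integer example can be seen in a few lines. This is a minimal sketch in Python, whose own integers are arbitrary precision and never wrap; the fixed-width behavior is simulated with `ctypes.c_int32`, which keeps only the low 32 bits without overflow checking. The helper name `add_int32` and the constant `INT32_MAX` are assumptions made for this illustration.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit word


def add_int32(a: int, b: int) -> int:
    """Add two integers the way a signed 32-bit register would."""
    # c_int32 does no overflow checking: it truncates to 32 bits
    # and reinterprets the result as two's complement.
    return ctypes.c_int32(a + b).value


unbounded = INT32_MAX + 1          # the "ideal integer" abstraction
wrapped = add_int32(INT32_MAX, 1)  # the 32-bit implementation showing through

print(unbounded)  # 2147483648
print(wrapped)    # -2147483648
```

The same addition gives two different answers depending on whether you are looking at the abstraction or at the word underneath it.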

The other thing is: as Bryan Cantrill points out, failing and pathological systems are the ones which truly teach us. They lay bare our tower of abstractions. One has to understand all the layers and their implementations to debug them.

So becoming comfortable and knowledgeable in just one layer is brittle, and that brittleness is exposed the moment one has to solve or debug a significant problem.

For the apprentice, this means learning and working with as many different layers of abstraction as possible. For the more experienced, it means not letting the day-to-day job in one layer make you rusty with the others. These are both challenges.

Peeking into the future of automating software engineering

The peculiar thing about computing and software is that it can recursively coil around to build upon itself. Large amounts of code are used as data to gain insight into the process of writing and changing software. Sophisticated algorithms can learn from this code/data to automatically generate changes and fix bugs in existing software. And the fledgling field of computational creativity can ingest the mountains of now-digitized human creative output to guide the “creativity” of an algorithm. A paper from the 80s was prescient when it claimed that software processes are software too.

So, more computing power and more sophisticated algorithms and more data allow you to build even more powerful computers and even more sophisticated algorithms. This is unlike physical industry. A tall building doesn’t help in building even taller buildings.

It is in this line of thinking–connecting the individual programmer to a datacenter’s worth of data, analysis and computational power–that I think the future of automating the process of writing software lies. My hope is that the combination of human and machine will be more potent than either alone, much like advanced chess.