Vivek Haldar

Re: Why I don't use Copilot

Ed Summers wrote a post on why he doesn’t use Copilot, and I want to riff on that to see if it helps me build my own mental scaffolding for how LLMs, and AI in general, will change programming, and how programmers should adapt to that. This isn’t meant to be a critique of the piece, but my attempt to come up with counterpoints to see what actually holds.

I wrote up some of my initial thinking on this topic in a blog post titled LLMs are compilers, which should tip you off as to which side of the debate I’m on. I don’t think this genie is going back into the bottle (or this box is closing again, if you see it as Pandora’s), so now it’s mostly a question of how best to use it as a tool.

OK, let’s go:

GitHub Copilot is a technology that is designed to help you write code, kind of like your partner in pair programming. But, you know, it’s not an actual person. It’s “A.I.”–whatever that means.

In principle this sounds like it might actually be a good thing, right? I know several people, whom I respect, who use it as part of their daily work. I’ve heard smart people say AI coding assistants like Copilot will democratize programming, by making it possible for more people to write code, and automate the drudgery out of their lives. I’m not convinced.

Full disclosure: I’ve been a paying customer of Copilot for a few months and have been using it for my own miscellaneous scripts, mostly Python and Emacs Lisp. (Note with big blinking red lights: this is for personal stuff only. Don’t use Copilot or products like it inside your company unless it has an explicit policy allowing it!) Speaking only for myself, I get orders of magnitude more value from it than the $10/month it costs. How? Copilot (often used in conjunction with GPT and, now, Bard) has reduced the activation energy for the small scripty projects I want to undertake from just over a weekend (which meant they would get abandoned, or never started) to fitting comfortably within one. And that has made a world of difference, at least for me. I suspect the same story will play out across programmers in general: it will lower the friction and frustration of programming just enough to increase their overall output.

Here’s why I don’t use Copilot (or ChatGPT) to write code:

Copilot’s suggestions are based on a corpus of open source code in Github, but the suggestions do not mention where the code came from, and what the license is. GitHub is stealing and selling intellectual property.

This is the strongest argument against Copilot–that it’s license-laundering code. This is also the reason that most big and medium-sized tech companies (certainly all the ones I know of) outright ban the use of Copilot and similar products in-corp. The IP risks are simply not worth it. Until the legal aspects of Copilot are tested in court and some clear precedent is established, that is not likely to change.

But there might be a chicken-and-egg problem here: the large corporations that produce the most code are steering clear of Copilot, so a major lawsuit that would test this in the courts is unlikely to materialize. The biggest of big tech (the FAANG companies) have internal codebases large enough to train LLMs purely on their own code and offer completions to their own engineers in-corp, without ever worrying about license-infringement suits. Maybe the best chance of a test case reaching the courts is a small startup using Copilot to develop a product that becomes a huge hit, at which point it would be a big enough target to be worth suing.

Copilot lets you write code faster. I don’t think more code is a good thing. The more code there is, the more code there is to maintain. Minimalism in features is usually a good thing too. Less really is more.

This argument has the same shape as Charles Petzold’s classic Does Visual Studio rot the mind?, written back in 2005, which lamented the rapid onset of IntelliSense as a basic tool of programmers everywhere. I am sympathetic to this argument, but I’m not sure it will persuade programmers who are in a rush to get a product to market. If a tool lets them ship 1% faster, they will use it.

There is a whole separate argument about what type of coding needs IntelliSense (and now Copilot) versus what can be done bare-handed, but the “minimalism is beautiful” argument depends on your dependencies being minimal and beautiful. When you’re using a library with thousands of classes, each with hundreds of methods, that argument falls apart.

As more and more programmers use Copilot, it creates conservatism in languages and frameworks that prevents people from creating and learning new ways of doing things. Collectively, we get even more stuck in our ways and biases. Some of the biases encoded into LLMs are things that we are actively trying to change.

Another strong argument. Copilot sets up a positive feedback loop that greatly reinforces current languages, libraries and frameworks. Doing ML-heavy work in Python? Copilot is so stunningly accurate in that domain (no doubt due to the mountains of code that use pandas and scikit-learn and PyTorch and TensorFlow) that if a newer, better ML library comes along, one of its greatest barriers to adoption will be programmers reluctant to use it because Copilot doesn’t offer good completions for it yet. The same goes for new languages. A chicken-and-egg problem again.

I predict that for new libraries or languages to get traction, their developers will have to build additional tooling that can somehow “inject” good completions for that particular new entrant.
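What might that injection look like? Here’s a minimal sketch of one plausible approach, assuming nothing about Copilot’s actual extension points: the new library ships a machine-readable cheat-sheet of its API, and editor tooling prepends it to the model’s prompt as context. The function names and prompt format below are entirely illustrative.

```python
import inspect


def api_summary(module) -> str:
    """Build a compact cheat-sheet of a module's public callables:
    one line per function or class, with signature and first doc line."""
    lines = []
    for name, obj in vars(module).items():
        if name.startswith("_") or not callable(obj):
            continue
        try:
            sig = inspect.signature(obj)
        except (TypeError, ValueError):
            continue  # some builtins have no introspectable signature
        doc = (inspect.getdoc(obj) or "").splitlines()
        first_line = doc[0] if doc else ""
        lines.append(f"{module.__name__}.{name}{sig}  # {first_line}")
    return "\n".join(lines)


def build_prompt(user_code: str, module) -> str:
    """Prepend the cheat-sheet so the model can complete against an API
    it never saw in its training data."""
    return (
        f"# API reference for {module.__name__}:\n"
        f"{api_summary(module)}\n\n"
        f"# Complete the following code:\n"
        f"{user_code}"
    )
```

Try `import json; print(build_prompt("data = json.lo", json))` to see the shape of the context this produces. A real version would have to budget for the model’s context window and hook into the completion plugin itself, which is exactly the tooling gap I’m predicting someone will have to fill.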

Developers become dependent on Copilot for intellectual work. Actually, maybe addicted is a better word here. The same could be (and was) said about the effect of search engines on software development work (e.g. Googling error messages). But the difference is search results need to be interpreted, and the resulting web pages have important context that you often need to understand. This is work that Copilot optimizes away and truncates our knowledge in the process.

The comparison with addiction is one that Petzold makes in the above-linked piece. The argument here is that Copilot lets programmers get away with a shallower understanding and lesser knowledge. But the exact same argument can be made about every advance that has continuously raised the level of abstraction of programming over the decades. Assembly language obviated the need to program in raw machine code. Compilers obviated the need to understand assembly language. We’re at the point where one can be perfectly adept at accomplishing tasks in Python without actually understanding how a CPU works. The march of progress continues.

Will there need to be an “elite” class of programmers who build all this infrastructure? Someone has to understand the full stack well enough to build OS kernels, compilers, language runtimes, garbage collectors and ML libraries. We’ll be in trouble if that class dwindles too much. C is not hip; it’s old and tired. But every major OS kernel is still written in C. For the millions of programmers who pick up Python from YouTube tutorials (and that is a wonderful thing!), we still need a few dozen to get drawn into the Linux kernel source and become fascinated by C and pointer chasing, to preserve our ability to manufacture the concrete that builds our digital metropolis. Hopefully things like Copilot lower the barrier enough on the Python end of the pipeline that we still get the trickle we need on the C end.

Copilot costs money. It doesn’t cost tons of money (for a professional person in the USA) but it could be significant for some. Who does it privilege? Also, it could change (see point 4). Remember who owns this thing.

True. In much of Asia, $10/month is significant enough to put this out of the reach of most young programmers. With open-source LLMs continuing to get more capable, hopefully we’ll see this price drop significantly, or even soon reach the point where a fairly capable model can run entirely locally.

How much energy does it take to run Copilot as millions of developers outsource their intellectual work to its LLM infrastructure? Is this massive centralization and enclosure really progress in computing? Or is it a step backwards as we try to reduce our energy use as a species?

I’m an unabashed accelerationist, so I’d argue this is beside the point. But even if you buy the premise that we need to reduce our energy consumption, one could argue that centralization brings efficiencies of scale: for all we know, consolidating computation that would otherwise happen on millions of clients into a datacenter might actually be more energy-efficient.

What does Copilot see of the code in your editor? Does it use your code as context for the prompt? What does it store, and remember, and give to others? Somebody has probably looked into this, but if they have it is always up for revision. Just out of principle I don’t want my editor sending my code somewhere else without me intentionally doing it.

A very fair point, and I completely agree. More transparency would go a long way towards making developers comfortable with technology like this. A local log of everything sent to the server would be great.
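To make that concrete, here’s a sketch of the kind of audit trail I mean. The endpoint and payload shape are made up for illustration (this is not Copilot’s actual protocol); the pattern is the point: append every outgoing prompt to a local file before the request ever leaves your machine.

```python
import json
import time
from pathlib import Path

import requests  # third-party: pip install requests

LOG_PATH = Path.home() / ".completion-audit.jsonl"
# Placeholder endpoint, not Copilot's real API.
COMPLETION_URL = "https://example.invalid/v1/completions"


def complete(prompt: str, context: str) -> str:
    # Record exactly what is about to be sent, and when, in an
    # append-only local log you can inspect at any time.
    record = {"ts": time.time(), "prompt": prompt, "context": context}
    with LOG_PATH.open("a") as f:
        f.write(json.dumps(record) + "\n")
    # Only after logging does the request actually go out.
    resp = requests.post(COMPLETION_URL, json=record, timeout=30)
    resp.raise_for_status()
    return resp.json()["completion"]  # hypothetical response shape
```

An editor plugin that did this by default would answer most of Summers’ questions here.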

Working with others who use GitHub Copilot makes my job harder, since they sometimes don’t really understand the details of why the code is written a particular way. Over time Copilot code can mix idioms, styles and approaches, in ways that the developer doesn’t really understand or even recognize. This makes maintenance harder.

This depends on how things like Copilot embed into the overall development flow. Do you have linters or static checkers enforcing basic formatting and style? Do you have a local style or best-practices guide? Do you have a robust code review culture? Yes, if you never look at Copilot’s completions with a critical eye and blindly accept everything it suggests, you’re likely to end up with a crappy codebase. But you will also end up with a crappy codebase if you take the first draft of your own code and check it in without any review.
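For example, a pre-commit hook can put Copilot’s output through the same gate as hand-written code. A minimal sketch, assuming ruff as the linter (any linter or formatter would do) and saved as .git/hooks/pre-commit:

```python
#!/usr/bin/env python3
"""Lint staged Python files before every commit, so generated and
hand-written code pass through the same style gate."""

import subprocess
import sys


def staged_python_files() -> list[str]:
    # Files added, copied, or modified in the index, filtered to Python sources.
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [f for f in out.splitlines() if f.endswith(".py")]


def main() -> int:
    files = staged_python_files()
    if not files:
        return 0
    # A non-zero exit blocks the commit until the lint passes.
    return subprocess.run(["ruff", "check", *files]).returncode


if __name__ == "__main__":
    sys.exit(main())
```

None of this is Copilot-specific, which is the point: the usual quality machinery doesn’t care where the code came from.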

As far as I can tell, the only redeeming qualities of Copilot are:

Copilot encourages you to articulate and describe a problem as written prose before starting to write code. You don’t need Copilot for this. Maybe keep a work journal or write a design document? Maybe use your issue tracker? Use text to communicate with other people.

Copilot is more interactive than a rubber duck. But, it turns out Actual People are even more interactive and surprising. Reach out to other professionals and make some friends. Go to workshops and conferences.

Copilot makes me think critically about machine learning technology, my profession and its place in the world.

Could not agree more! Sharpening your writing skills, discussing problems with colleagues, and looking critically at the new shiny thing: these are all things every programmer should be doing.

Using not just Copilot but an LLM as a “smart” rubber duck that you converse with can be an intellectually stimulating and effective way to iterate towards a solution, and even get a deeper understanding of the tradeoffs involved. A couple of great examples of this mode of programming were written up by Prof. Crista Lopes in a series of blog posts.

Maybe my thinking on this will change. But I doubt it. I’m on the older side for a software developer, and (hopefully) will retire some day. Maybe people like me are on the way out, and writing code with Copilot and ChatGPT is the future. I really hope not.

But some good news: you can still uninstall it–from your computer, and from your life.

I share some of Summers’ discombobulation. The day after GPT-4 was demoed, I wrote:

A couple of years ago I thought I had a decent grasp of the trajectory of programming as a field, and as a career path. Languages, frameworks and libraries would get better at modeling and abstracting more complex programs, safety vs performance would be less and less of a conflicting tradeoff, compilers would get better, IDEs and tools would get better, and underneath it all, even if Moore’s Law was hitting a wall, we’d just continue throwing more cores at larger problems. But at the end of the day, the immense cognitive leverage that comes from encoding something into executable software would ensure that programmers were always in short supply. But now I feel unsure and disoriented, not certain about where I stand.

Potters make art, but they must love the physical feeling of clay. Code is our clay, and we don’t know what it’ll feel like in a year or two.