The case against local LLMs
As a geek, I love the idea of running LLMs locally and owning my entire AI stack without depending on cloud APIs. I run LLMs locally with Ollama all the time. I’d love to see a world where the intelligence that these models provide is democratized, cheap and plentiful.
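To ground that a little: Ollama exposes a local HTTP API on port 11434, so a locally running model is a few lines of code away. A minimal sketch, assuming Ollama is installed and a model has already been pulled; the model name here is just an example.

```python
# Query a locally running Ollama server (default port 11434).
# Assumes `ollama pull llama3.2` (or any model of your choice) has been run.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",          # example model name
        "prompt": "Why is the sky blue?",
        "stream": False,              # return a single JSON object, not a stream
    },
)
print(resp.json()["response"])
```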
But if I think about it objectively, there are several reasons why local LLMs might not go mainstream.
Widely used utilities want to be centralized
In the early days of electricity, factories generated their own power on-site. This phase was relatively short-lived, giving way to the centralized power grid we have now. The same thing happened with water, trash disposal, manufactured goods and pretty much every staple of modern life.
The Web and its underlying protocols were also motivated by the idea of enabling decentralized control and ownership. But the Web of today is centralized around a few large providers serving billions of users.
There will always be geeks who host and serve their own email and websites, and it is a beautiful thing that this remains possible. The protocols are open for anyone to participate in, and web and email servers are open-source and freely available. But the vast majority of people will want shrink-wrapped solutions.
So just looking at the historical trend, where similar technologies have tended to end up centralized even when they could be decentralized, things are not looking good for local LLMs.
Which brings me to my second point…
Centralized management is about labor, not control
The desire to decentralize is often motivated by control: control over that particular component (electricity, a server, a model, etc.). But the more important factor is labor.
Owning and controlling a thing takes constant labor. Sure, you could run and maintain your own email and web server. Are you ready for the care and feeding of the surrounding infrastructure? The electricity, the network, the storage? What about configuration, keeping up with updates and security patches? There’s a reason that running a datacenter, on both the physical and the software side, is a full-time job.
The same argument applies to local LLMs. It’s rarely about just the LLM. You also need all the surrounding infrastructure, both physical (hardware, network, etc.) and software (what is the app layer around the LLM?).
The question for you is whether you want to spend your labor on all that, or on something else. What is the opportunity cost?
About efficiency and cost
I haven’t seen any rigorous studies making the comparison, but I’d be very surprised if, from an efficiency and cost standpoint, local LLMs beat out a large datacenter that aggregates demand and enjoys economies of scale.
An anecdotal data point: the M3 Ultra Mac Studio with 512 GB of RAM, currently the cream of the crop for running large-ish LLMs locally, costs $8,549, or about $712 per month amortized over 12 months (before taxes). I bet that if you sent the collected works of Shakespeare back and forth to the latest and greatest LLM API all day, every day, for a year, your API bill would be a fraction of that.
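Here’s that bet as a back-of-envelope script. Everything besides the hardware price is an illustrative assumption (the token count for Shakespeare, the send rate, the per-token price), since real API pricing varies widely by model and provider.

```python
# Back-of-envelope comparison: local hardware vs. a cloud LLM API.
# All numbers below except HARDWARE_USD are illustrative assumptions.

HARDWARE_USD = 8549              # M3 Ultra Mac Studio, 512 GB RAM (before taxes)
SHAKESPEARE_TOKENS = 1_200_000   # collected works: ~900k words, roughly 1.2M tokens
SENDS_PER_DAY = 12               # "all day every day": one full send per waking hour, say
USD_PER_M_TOKENS = 0.50          # assumed blended price per million tokens

tokens_per_year = SHAKESPEARE_TOKENS * SENDS_PER_DAY * 365
api_bill = tokens_per_year * USD_PER_M_TOKENS / 1_000_000

print(f"Local hardware, amortized: ${HARDWARE_USD / 12:,.0f}/month")
print(f"API bill: ${api_bill:,.0f}/year (~${api_bill / 12:,.0f}/month)")
```

Under these assumptions the yearly API bill comes to roughly a third of the hardware’s sticker price; swap in your own numbers to find the break-even point.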
Value moves up the stack, away from utilities
If you’re building a mobile or web app, the valuable and differentiated part is the user experience, the domain knowledge and the problem being solved. The database or web framework you use is an undifferentiated commodity. It can be swapped out relatively easily.
The same goes for apps built around LLMs. The model is like the jet engine in an airliner: important and necessary, but only a small part of the entire experience and value.
Hotrodders build beautiful cars in their garages, and have a great time doing it. I do want a world where local LLMs are available to the geeks and builders and experimenters to play and build with. I just don’t think local will be the mainstream way LLMs are consumed.