Is C/C++ worth it?
(Daniel Lemire asked the original question. Rather than leave a long comment, I decided to break this out into a post.)
When you ask if a language or technology is worth it, the first thing you have to do is define what “worth” means to you. For the purposes of this post, I take that to be money, i.e. economic worth. In general, CS education does very little to teach us how economic factors play into technical decisions. The economics of programming has been a strong theme of this blog.
Let us start by asking which economic variables and commodities should be considered when trying to answer this question. At the highest level, the economic variables that matter the most are:
- Material: this is the cost of hardware, real estate, electricity, bandwidth, and everything else that you explicitly need to buy in order to run your service or product. This is the easiest variable to account for, because you write checks and get stuff, and all the quantities and numbers involved are clear.
- People: this is the cost of hiring and maintaining the people necessary to build and maintain your service or product. This variable is more fuzzy than material, because there are many costs associated with hiring a person besides just their compensation, and these costs become known only in the aggregate. For example, a general rule of thumb is that the total cost to a company hiring a person is approximately double their salary.
- Time: the cost of time has two components: the monetary value of time spent by the people you have hired, and the monetary value of losing out in the market due to arriving too late or building the wrong product, i.e. opportunity cost. Because of the second component, this is by far the fuzziest variable and the hardest to account for.
The total cost of building and maintaining a product or service is the sum of these 3 costs. I’m going to assume that the intent of the original question was to decrease cost, while ignoring revenue and profit from the product.
I am by no means suggesting that these 3 are the only economic variables that should be considered, just that they form a crude model sufficient for the argument at hand. One could go on to list dozens of other variables, but most of them would fall into one of these 3 categories.
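To make the crude model concrete, here is a minimal sketch in Python. The class name, the helper function, and every number in the usage lines are hypothetical and chosen only for illustration. The one figure taken from the text is the rule of thumb that a hire costs roughly double their salary.

```python
from dataclasses import dataclass

# Rule of thumb from the text: total cost of a hire is about double the salary.
SALARY_OVERHEAD = 2.0


def people_cost(salaries):
    """Estimate the people cost from a list of salaries (hypothetical helper)."""
    return SALARY_OVERHEAD * sum(salaries)


@dataclass
class ProductCost:
    material: float  # hardware, real estate, electricity, bandwidth, ...
    people: float    # hiring and maintaining the team
    time: float      # value of time spent plus opportunity cost (the fuzzy one)

    def total(self) -> float:
        # The crude model: total cost is just the sum of the three variables.
        return self.material + self.people + self.time


# Made-up numbers, for illustration only.
example = ProductCost(
    material=50_000,
    people=people_cost([100_000, 120_000]),
    time=200_000,
)
print(example.total())
```

The point of writing it down is not precision. It is that any comparison between languages has to move at least one of these three numbers.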
Now that we have set the stage, we can reframe the original question as: “Suppose we built a product using C or C++, and we know its cost to be X. If, in an alternate universe, we built the exact same product using language L, would the cost be less than X?”
Note that this alternate universe does not have to be a parallel one, it could just be the universe starting at time=now, i.e. building the product over again with the new language.
An alternate way of framing the question would be: given the space of cost defined by these variables, in which regions of that space is a given language/technology L more economical than the others?
Let me make this concrete by considering two common scenarios:
You are a startup: in this case, the cost of time dominates the other 2 variables. You are in a race to build something fast, release it, discover whether it works, and iterate. If you can use a technology or language that chews through material but lets you build and modify your product very quickly, you should choose that. This is a familiar story, and the reason why supple and expressive languages like Ruby and Python are so popular among startups.
You are a warehouse scale computing company: you run multiple large data centers, with gazillions of machines. You have a few products and services that are used by billions of users. At this scale, the cost of material starts becoming a significant component of your overall cost, and may even dominate the other two variables.
The difference between these two scenarios boils down to this: are you spending hardware on people, or people on hardware?
The major problem is that the crossover between the two is extremely fuzzy. It’s like navigating a ship across the Pacific with your instruments only telling you when you have crossed either Tropic. Chances are, one fine day the person who actually writes the checks will notice that, for example, they are spending nearly as much on material as on people, and will instruct the next line of engineering managers to crank the dial towards optimization.
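One way to picture the crossover is a toy model in which each language choice has a fixed cost (people plus time) and a material cost that grows with load. The sketch below is in Python; the function names and all of the numbers are hypothetical, and real costs are nowhere near this linear or this well known, which is exactly why the crossover is so hard to see in practice.

```python
def total_cost(fixed, machines_per_unit_load, load, machine_cost):
    """Fixed cost (people plus time) plus material cost that grows with load."""
    return fixed + machines_per_unit_load * load * machine_cost


def crossover_load(fixed_a, mpl_a, fixed_b, mpl_b, machine_cost):
    """Load at which option B's extra fixed cost is repaid by hardware savings.

    Returns None when B uses no fewer machines per unit of load than A, so its
    hardware bill never drops below A's. A negative result means B is cheaper
    at every load.
    """
    saving_per_unit_load = (mpl_a - mpl_b) * machine_cost
    if saving_per_unit_load <= 0:
        return None
    return (fixed_b - fixed_a) / saving_per_unit_load


# Hypothetical inputs: an expressive language that is cheap to build with but
# needs 4 machines per unit of load, versus C++ that costs twice as much to
# build with but needs 1 machine per unit of load.
MACHINE_COST = 5_000
threshold = crossover_load(1_000_000, 4, 2_000_000, 1, MACHINE_COST)

for load in (10, 1_000):
    expressive = total_cost(1_000_000, 4, load, MACHINE_COST)
    systems = total_cost(2_000_000, 1, load, MACHINE_COST)
    cheaper = "expressive language" if expressive < systems else "C++"
    print(f"load={load}: cheaper option is {cheaper}")
```

Below the threshold you are spending hardware to save people and time; above it, the hardware bill is large enough that spending people to save hardware pays off.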
The astute reader will notice that I still haven’t answered the question. The answer is… it depends.
The particular example that Lemire used is a batch job, where we would like to maximize throughput, i.e. process as many integers as possible. Another important class of applications does not care so much about throughput, but about the latency of each individual request. VM randomness, such as GC pauses, is a disadvantage for those.
VMs have their own memory overhead, which might not be significant when running on a few machines, but when multiplied by thousands of machines quickly becomes significant.
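The arithmetic behind that multiplication is trivial, but it helps to see it. In the Python sketch below, the 512 MiB of per-machine VM overhead and the fleet sizes are made-up figures, not measurements of any particular runtime.

```python
def fleet_overhead_gib(overhead_mib_per_machine, machines):
    """Total memory spent on VM overhead across a fleet, in GiB."""
    return overhead_mib_per_machine * machines / 1024


# Hypothetical figure: 512 MiB of runtime overhead on every machine.
OVERHEAD_MIB = 512

small_fleet = fleet_overhead_gib(OVERHEAD_MIB, 10)
large_fleet = fleet_overhead_gib(OVERHEAD_MIB, 10_000)
print(small_fleet, large_fleet)
```

Under these assumptions, a handful of machines wastes a few GiB that nobody will notice, while a fleet of ten thousand wastes thousands of GiB, memory that somebody had to buy and power.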
Is C/C++ worth it? Yes. For backend infrastructure that runs across entire datacenters, the truth is the economics simply do not work out with anything else. Of course, not everyone operates on that scale, and for them the answer will be different. How exactly C and C++ landed up in that position is an entirely different story. And the situation looks like it won’t change anytime soon, because while there has been an explosion of higher level languages in the last couple of decades, nothing has come close to dislodging C/C++ from the perch of systems/infrastructure supremacy.
One more thing: implicit in Lemire’s post is that the “low-levelness” of C/C++ is a substantial burden, and it would be a worthwhile tradeoff to use a higher-level, more expressive language. In my anecdotal experience, the low-level nature of C/C++ has not been an issue with the right programmers. I’ve seen programmers around me who are productive in spades with C++, notwithstanding all the blogosphere’s chatter about how much it sucks to work with C++. I think the reason is that the bottleneck in system construction is often elsewhere. Again, that’s a whole other story.
So the answer is, yes, C/C++ is worth it when it is.