Understanding Goodhart's Law [Math Mondays]
One of the most important organizational principles that you should know about
Hey, it’s your favorite cult leader here 🐱👤
Mondays are dedicated to theoretical concepts. We’ll cover ideas in Computer Science💻💻, math, software engineering, and much more. Use these days to strengthen your grip on the fundamentals and 10x your skills 🚀🚀.
To get access to all the articles and support my crippling chocolate milk addiction, consider subscribing if you haven’t already!
p.s. you can learn more about the paid plan here.
Goodhart’s Law is one of the most important principles for builders and leaders. When designing systems, we often use incentives to nudge behavior in certain directions. However, these incentives frequently skew behavior in ways that are unexpected and counter-productive. Goodhart’s Law covers exactly that. In today’s piece on Tech Made Simple, we’ll cover what Goodhart’s Law states, why it’s so common, and how to account for it when building systems.
What is Goodhart's Law?
Goodhart's Law is often summarized as: "When a measure becomes a target, it ceases to be a good measure."
Goodhart’s original formulation: “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.”
In practice, this law looks like this (a true story told to me by a sales leader at a Big Tech firm):
We identify a desirable end goal (more revenue).
We can’t directly control this end goal, so we pick a proxy metric that is in our control: the number of users added to the pipeline by the end of the financial quarter. This is based on a simple assumption: more users in the pipeline → more revenue.
We tell the sales reps about this and reward them for it.
Selling is really hard. BUT giving prospective customers massive discounts to use the product is easy.
The sales rep gives massive discounts to inflate the numbers. The product loses money on these customers (not that this sales rep cares; they’ve made a huge bonus and moved teams).
Here, fixation on one proxy metric, users added to the pipeline, ended up hurting the overall revenue of the product. Another common example involves AI teams that get so obsessed with minimizing error that they forget to account for other, less-sexy metrics (generalization, ROI, etc.). Or software teams overengineering to hit performance metrics instead of building with the customer in mind.
Why Does Goodhart's Law Happen?
Complexity: Real-world systems are complex. Any single metric captures only a tiny part of the whole picture. As we’ve already covered: no matter how extensive, any given set of data points is still an incomplete representation of reality. There will always be information that slips through the cracks and nuances that are missed. This is why it’s important not to become a slave to the SOPs, and to always approach and evaluate systems with your own judgment. Albert Camus (one of my favorite writers) articulates some of this really well in his writings about the ‘Absurd’.
Human Ingenuity: People are clever. When incentivized to hit a target, they find creative ways to do so, sometimes in ways that counteract the original goals. LinkedIn decided to reward engaging with other people’s comments. People have been using ChatGPT to mass-reply to comments, completely ruining the experience for everyone.
Evolutionary Pressures: Over-focusing on a specific metric warps decision-making, encouraging choices that boost the metric even if they are harmful in the long run. On a somewhat related note, this is also what makes training ‘AGI’ so difficult: any sufficiently powerful system will alter its environment in ways that were not present in the training data. This causes these same agents to break down. One of the most impressive things about biological intelligence (even in very simple beings) is the ability to adapt to changing circumstances w/o skipping a beat.
Goodhart’s Law is deceptively simple to understand. But as with many other simple ideas, it is often overlooked by business leaders. Here is another example, where a CEO fired his customer service team and replaced them with AI. He justified this decision by claiming that the company’s customer support now had better performance. The excellent article, “How NOT to apply Artificial Intelligence in your business”, has a great analysis of why this was wrong:
“First of all, he clearly states what were his drivers: the metrics he objectively measures and he thinks show he has made progress with the decision made.
Time to first response
Resolution time
Customer support costs
All of those metrics are important. But do you see the pattern there? Those are Customer Support metrics, not Customer Experience ones! All of them are inward facing (looking at the organization), instead of outward facing (looking at the customer). Yes, collecting Customer Experience data is harder than collecting Customer Support data. But it is indispensable, or you might end up shooting yourself in the foot.”
In the above case, it’s not surprising that a focus on call resolution time in customer service led to premature call endings and generic AI support rather than solving the underlying problems.
There is also the ever-present problem of Tech and Blitzscaling: Tech Startups prioritizing growth over sustainability to hike up valuations as quickly as possible.
Anywhere you look, you will find case studies of organizations getting too excited about cutting more wood and ultimately sawing off the branch they were sitting on.
So how can we deal with Goodhart’s Law? Let’s end on a discussion on some ways we can design our systems/processes to account for it.
Accounting for Goodhart's Law in System Design
Multiple metrics: Never rely on a single number. Track a variety of metrics and qualitative insights (too many people underrate the power of qualitative metrics) to create a balanced perspective.
Tie metrics back to the ultimate goal: One of the reasons I write extensively on alignment b/w teams and excessive communication is so that everyone always has the north star in mind, even if they are working on something else short-term. This allows your engineers to design with the end in mind and improves the employee appraisal process.
Expect adaptation: Assume people will try to 'game' the system. Be prepared to adjust metrics or add new ones as needed. Don’t get too caught up in thinking about how people are supposed to interact with the system to overlook how they are interacting with it.
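One lightweight way to put “multiple metrics” and “expect adaptation” into practice is to monitor the proxy alongside the end goal and flag periods where they diverge. A minimal sketch, where the metric names and data are hypothetical:

```python
# Sketch: flag quarters where the proxy metric improved while the
# end goal got worse -- a classic Goodhart's Law warning sign.
# Metric names, thresholds, and data are made up for illustration.

def goodhart_check(history: list[dict]) -> list[str]:
    """Return quarters where the proxy rose but the goal fell."""
    flags = []
    for prev, cur in zip(history, history[1:]):
        proxy_up = cur["pipeline_users"] > prev["pipeline_users"]
        goal_down = cur["revenue"] < prev["revenue"]
        if proxy_up and goal_down:
            flags.append(cur["quarter"])
    return flags

history = [
    {"quarter": "Q1", "pipeline_users": 100, "revenue": 10_000},
    {"quarter": "Q2", "pipeline_users": 180, "revenue": 11_000},
    {"quarter": "Q3", "pipeline_users": 420, "revenue": 6_000},
]

print(goodhart_check(history))  # → ['Q3']
```

A flagged quarter doesn’t prove gaming, but it’s exactly the moment to go look at how people are actually interacting with the system.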
The fact that Goodhart’s Law has transcended Economics and gone somewhat mainstream is a testament to how ubiquitous it is in our lives. It’s a good reminder to take a harder look in the mirror and evaluate our processes with a magnifying glass.
Have you encountered Goodhart's Law in your work? Share your stories below!
That is it for this piece. I appreciate your time. As always, if you’re interested in working with me or checking out my other work, my links will be at the end of this email/post. If you found value in this write-up, I would appreciate you sharing it with more people. It is word-of-mouth referrals like yours that help me grow.
Save the time, energy, and money you would burn by going through all those videos, courses, products, and ‘coaches’ and easily find all your needs met in one place at ‘Tech Made Simple’! Stay ahead of the curve in AI, software engineering, and the tech industry with expert insights, tips, and resources. 20% off for new subscribers by clicking this link. Subscribe now and simplify your tech journey!
Using this discount will drop the prices-
800 INR (10 USD) → 640 INR (8 USD) per Month
8000 INR (100 USD) → 6400 INR (80 USD) per year (533 INR/month)
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Small Snippets about Tech, AI and Machine Learning over here
AI Newsletter- https://artificialintelligencemadesimple.substack.com/
My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/
Check out my other articles on Medium. : https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819
I remember two cases:
One from Amazon, where their automatic resume filter almost always suggested males, because the training data had a lot of males; the model picked up the spurious correlation.
One where YouTube recommended more pedophilic videos because of a botched metric. It led to an echo chamber.
Glad you wrote about this. I’ve been thinking about it since xkcd wrote a comic about it last week.