In tech, we love to think our decisions are purely rational. However, bugs highlight flaws in our great constructs. Teams, even those committed to quality, can react emotionally to these flaws, leading to prolonged or unresolved issues. This inhibits team learning and affects customers. Recognizing and understanding what makes us anxious, powerless, or overwhelmed during bug-fixing helps us address and overcome these reactions.
I have seen three main dysfunctions hold back tech teams:
- We weigh the cost of bugs against new features
- We dismiss ill-defined bugs too quickly
- We’re slow to decide on changes
We weigh the cost of bugs against new features
Every software product is flawed. Even so, a team’s ability to work together to craft a cohesive, well-made product can bring us close to perfection: a product that truly meets users’ needs and does so elegantly. But because we don’t want to spend our lives fixing bugs, we treat each one as an engineering trade-off between value (a combination of a severity and a priority score) and cost (cost of detection, cost of investigation, cost of fixing, etc.). This perspective is too narrow.
A broader view reveals that this trade-off doesn’t make sense:
- “Perfection” is a moving target, as customer behaviors and platforms (infrastructure, browsers, hardware) will evolve over time faster than we can evolve internally
- The true business impact of a bug is often unpredictable. Some can be severe, damaging reputation or prompting executive intervention. Others accumulate over time, subtly pushing users to competitors.
The real benefit of working on bugs is showing the team what their misconceptions are: “I thought the web browser was behaving this way when in fact it behaves in another way”, “I thought this Kubernetes pod would properly shut down our process when in fact it sends a SIGKILL”.
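The Kubernetes misconception above is worth making concrete. When a pod is terminated, Kubernetes by default first sends SIGTERM to each container's main process and only follows up with SIGKILL once the termination grace period (30 seconds by default) has elapsed. A process that never handles SIGTERM, or never receives it because a wrapper script does not forward it, ends up being killed abruptly. Below is a minimal Python sketch of a worker that finishes its current item and exits cleanly on SIGTERM. The names `stop_requested`, `handle_sigterm`, `install_handlers`, and `run_worker` are hypothetical and only illustrate the idea; they do not come from any project discussed here.

```python
import signal
import threading

# Hypothetical flag shared between the signal handler and the work loop.
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record that a shutdown was requested; do no heavy work in the handler."""
    stop_requested.set()


def install_handlers():
    """Register the SIGTERM handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(process_one_item, max_items=None):
    """Process items until a stop is requested or max_items is reached.

    Returns the number of items processed. The stop flag is only checked
    between items, so the item in progress is always allowed to finish.
    """
    processed = 0
    while not stop_requested.is_set():
        if max_items is not None and processed >= max_items:
            break
        process_one_item()
        processed += 1
    return processed
```

In a real service, `install_handlers()` would be called once at startup, and `run_worker` would wrap the actual processing loop. Each item must also finish well within the grace period, otherwise SIGKILL still arrives.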
Prioritizing quality over new features turns these unknown costs into concrete opportunities for growth, fostering better customer care and honing our craft. By becoming better craftspeople, we can start focusing on the real engineering trade-offs: the ones that directly affect our customers’ experience.
We dismiss ill-defined bugs too quickly
Bugs are deviations from user expectations. Because these expectations can be vague, whether something counts as a bug is often debatable. Yet instead of arguing over whether a user's issue is a system dysfunction, the real question is: among the people on the value chain, who should learn what?
Recently, I joined discussions over a tech support ticket. A customer faced difficulties inviting employees via SSO. The ticket was initially classified as a feature request, and the problem lingered until a senior software engineer intervened two months later. By asking which identity provider the client used, the engineer was able to point them to the relevant knowledge base, which resolved the issue.
Who should learn what?
- Tech support should learn all the intricacies of connecting the product to identity providers so they can troubleshoot when something goes wrong. With this understanding, they would be better able to classify requests that are out of the ordinary and hand them over to the right team.
- Software engineers should communicate better to the rest of the company what matters most when setting up an identity provider with the product they build.
- Customer onboarding teams should understand what went wrong in the onboarding of this specific client: how come the customer was unable to use the product even though they were onboarded through an enterprise B2B sales process?
Each bug report is like a mineral ore: the valuable learnings (the mineral) hide within. Bugs start out ill-defined; people then refine them, understand them, and address them. The key to overcoming this challenge is to 1) develop routing systems so that bugs reach the right people, 2) give these people direct ownership of the bug and, whenever possible, direct communication with the customer, and 3) improve the capacity of everyone along the chain to solve the problem at their level (cf. the anti-pattern “we fix bugs, but we don’t solve them” in my previous piece).
We’re slow to decide on changes
In one instance, while addressing a bug with a senior front-end engineer, we required a minor user interface modification: displaying a “last updated” date on a section of the screen. The engineer chose to consult with both the product manager and the designer before making this change. Although the change seemed trivial, such discussions usually took more time than anticipated.
Due to everyone's hectic schedules and numerous tasks, we've established bureaucracies that require everyone's approval for each change. This indicates a lack of real collaborative space in the team for collective problem-solving. In this particular instance, the engineer waited until the next daily meeting to discuss the issue with the product manager, and they jointly approved the change.
It's unfortunate that engineers often defer most decisions to product managers. I frequently hear product managers express frustration that tech leads show little initiative, leaving them with no choice but to simply 'feed' them the next set of features. This practice seems far from the agile ideal of collective intelligence.
Lean tools and principles can assist in creating collaborative spaces that enhance team autonomy:
- One-piece flow (working on one change at a time) is great for creating laser focus across the team.
- True kanban (not the agile “work in progress limitation” horror) creates team spaces to discuss and solve problems in real time.
- An obeya develops everybody’s understanding of what the customers want and the design trade-offs we are making (see this article on how product manager Surya uses the obeya).
In the ever-growing world of tech products and services, bugs are not merely errors but valuable lessons to be learned. By facing our fears and embracing growth opportunities, we not only refine our products but also enhance the skills and cohesion of our teams.