Photo by Emily Mortimer

Doing root cause analysis the right way

Leadership Jan 3, 2022

The concept of 5 whys is frequently trotted out as a formula for root cause analysis. Its proponents claim, “Just ask ‘why’ 5 times, and you will fully understand the root cause of a problem you are having!” The same concept is frequently applied to the process of researching and planning new solutions that your business can offer the market.

In case you aren’t familiar with the approach, the idea is to dig into surface-level problems to make sure you actually understand the root causes. The core of the method is asking “why?” 5 times, digging one layer deeper with each answer. Suppose our revenue is down 10% for the quarter:

  1. Why? -> Because we lost 10% of our customers
  2. Why? -> Because our customer service level suffered
  3. Why? -> Because we don’t have enough people to answer the phone quickly
  4. Why? -> Because our customer service agents are leaving for other opportunities
  5. Why? -> Because our pay and benefits aren’t attractive enough

While it is important to dig into root causes, anybody who has spent time with small children knows two things about asking questions like this: 1) there can be a lot more than 5 layers of “why?” underneath any given question, and 2) the answers sometimes stop being useful long before you run out of questions to ask.

A 5 whys methodology may hit the sweet spot for root cause analysis at some companies, but many organizations likely need something more thorough. At the same time, other companies might need something less thorough and more dynamic.

The 5 whys famously originated with the manufacturing process at Toyota. Since I don’t know anything about auto manufacturing, I can only assume that it was one of the components of their success. For the rest of us though, it is worth making sure we don’t just blindly apply an arbitrary number of layers when we peel back a problem onion.

An effective root cause analysis is performed with an awareness of how much value will follow the investigation and resolution.

When the NTSB investigates major airline incidents, do you think that the investigators show up to a crash site and then stop asking questions once they reach 5? No, of course not. These incidents usually have multiple causes of varying impact, and each of those causes likely had several points along the way where the failure could have been avoided. Because safety receives such a focus in aviation, there will almost always be multiple takeaways that will be applied throughout the industry.

For a lean startup with all hands on deck, spending several weeks to understand how to optimally implement a minor feature might not be worth the effort. If the team can quickly get to a high level of confidence that a planned solution is the right one, or at least that it isn’t completely wrong, then it is advantageous to just implement that solution and move on. This is especially true if it will only require a small amount of effort later to change or reimplement your approach.

It can be helpful to think of the thoroughness of root cause analysis as a spectrum. The investigation of a passenger jet crash with multiple fatalities is at one extreme. Multiple root causes will likely be found over multiple months by a large investigative team, with many important takeaways that will help save lives in the future. At the other extreme are tiny startups where the risk of moving too slowly holds much more downside than the risk of missing a few minor underlying details.

To understand your sweet spot on this spectrum, you need a sense of a few things:

  • How complex are the root causes or improvements likely to be? Are there tens or hundreds of interconnected systems, and multiple teams involved? Or, is it a simple standalone system managed by a single person?
  • What is the impact if the failure happens again, or if an improvement is implemented incorrectly? Will a small percentage of your customers have a minor inconvenience, or will the safety of other people be put at risk?
  • What is the effort associated with investigating, finding, and then resolving the root causes or implementing the decided-upon improvements?

Knowing these answers can help guide how much effort to spend when questions come up about your business, or about specific products or services you offer.

When failures happen, as they always do, good leaders will have the right information to make the call on how much root cause analysis to perform. Good leaders are involved enough to have a good sense of the answers to the questions above, and can quickly consult with team members to fill in or validate their understanding.

When working on how to improve your offering to your customers, those closest to the problem you are trying to solve will know how to answer the questions listed above. Those individuals can then help guide an effective approach that avoids both overanalyzing and not doing enough research. Take it too far one way and you won’t move quickly enough; take it too far the other way and you will waste time and effort building the wrong thing.

If a single “why” answers a question well enough that you can then move on to more important things, then only ask one why.

If you need to understand the multiple branching paths through a complex and interconnected system of cause and effect, then make sure to actually investigate it all rather than just stopping at 5 whys.

Ultimately, the question of how many whys to ask, or how deep to dig when investigating a root cause, is a matter of balancing opportunity cost. Is a large-scale investigation or a lot of research the best way to spend your time, or would your limited resources be better spent elsewhere?

There are a lot of “whys” floating past us every day. Understanding which ones to pick out and how far to take them is a mark of a good leader and a productive company, and that understanding may very well be what makes the difference in your success.
