Month: February 2021

Understanding the Real Issue using Root Cause Analysis


Too often, people, including consultants, spend time trying to solve the wrong problem because they have incomplete or incorrect information. Once, I was investigating a series of performance problems and unplanned outages that were assumed to be two separate problems. As I gathered information, several people offered anecdotes about anomalous behavior in various systems, speculation about the “real problem,” and accounts of “chasing ghosts” during previous attempts to resolve it.


I remember stating that I was there to solve a real problem that was having a serious negative impact on production, and that I did not intend to chase ghosts or do anything else that would waste time. Next, I outlined the approach I would use to determine the root cause and said that we would reconvene to discuss the real problem and potential solutions. A few people scoffed and felt this was a waste of time and money.

The process I followed was simple, structured, and logical. I took everything that was known to be true and mapped it out, then looked for patterns, commonalities, and intersections of systems and events. Within two days, my team and I had identified a complex root cause involving multiple components, and we demonstrated that it reliably reproduced the symptoms our client was experiencing. From there, we worked with their teams to make minor network changes, system configuration changes, and several small application changes.
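For readers who like to see the idea in code, here is a minimal sketch of the "map out what is known and look for intersections" step. It is not the tooling we used on that engagement; the incident names, component names, and the helper functions `common_components` and `pair_counts` are all hypothetical, made up purely for illustration. It assumes each incident has been reduced to the set of systems confirmed to be involved.

```python
from collections import Counter
from itertools import combinations

# Hypothetical incident records: each lists only the systems confirmed
# (not rumored) to be involved. All names are invented for illustration.
incidents = {
    "outage-1": {"load-balancer", "app-server", "db-primary"},
    "slowdown-1": {"app-server", "db-primary", "batch-job"},
    "outage-2": {"load-balancer", "app-server", "db-primary", "batch-job"},
}


def common_components(incidents):
    """Return the components present in every incident (the intersection)."""
    sets = list(incidents.values())
    return set.intersection(*sets) if sets else set()


def pair_counts(incidents):
    """Count how often each pair of components appears in the same incident."""
    counts = Counter()
    for components in incidents.values():
        # Sort so each pair is always recorded in the same order.
        counts.update(combinations(sorted(components), 2))
    return counts


shared = common_components(incidents)
pairs = pair_counts(incidents)

print("Present in every incident:", sorted(shared))
print("Most frequent pairings:", pairs.most_common(3))
```

Components that show up in every incident, or pairs that keep appearing together, are only candidates for closer inspection. The real test is still whether you can reproduce the symptoms on demand.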

By the end of the second week, they were no longer experiencing major slowdowns or unplanned outages. Each outage cost this company tens of thousands of dollars in lost sales due to the time-sensitive nature of their product. Within one week, they had recovered the cost of hiring me and my team. What stuck with us was how many really smart people “believed in ghosts” and failed to focus on the information they already had.

A few years later, we created a white paper to help others who need a simple, structured approach. Below is a link to that white paper, written by one of the top people on my team. We received very positive feedback at the time, so it seems it could still be useful today. Please take a look and let me know what you think.