Why Organizations Leave Money on the Table

Kevin R. Strader

Don’t Leave Money on the Table!

Why do organizations leave money on the table by not investigating failures that cost them money? It is safe to say that every manufacturing company has failures each year that cut into its profit. The real question is: What do you do when a failure occurs? Do you simply fix the equipment, get back up and running, and return to whatever you were working on at the time? Or do you stop what you are doing, diligently try to understand why the failure occurred and put measures in place to prevent recurrence? Is the culture at your facility one that seeks to understand why something failed, or one that only wants to get back up and running as fast as possible? How about your commercial team and management external to your facility? Have perceived pressure and a lack of understanding driven your organization to a place where failures are not fully understood?

Consider this scenario. Suppose you are working on your taxes on April 14th. To take full advantage of every deduction, you want to make sure you account for all charitable contributions. You log in to your bank account to get a record of these contributions and, to your horror, you notice that $10,000 has been wired out of your account without your consent. What are you going to do? Are you going to continue doing your taxes, or are you going to stop, call the fraud department and get them busy looking into the problem? Most likely the latter, and you would probably spend some time investigating it yourself. Granted, you might return to your taxes to get them done by April 15th and avoid penalties. However, once your taxes were complete, you would return to the issue of the $10,000. You would stay in contact with the bank to learn what happened and how to prevent it from happening again.

This is exactly what workers fail to do when failures cost their companies money. They do not stop what they are doing to investigate these failures and determine the physical, human and systemic root causes. Why not? Why hasn’t anyone articulated to the organization the importance of this issue and the value of learning from failures and preventing recurrence?

Suppose for a minute you did nothing about the lost $10,000. What do you think would happen a few months later? You guessed it. The criminal would likely come back and steal another $10,000. That is exactly what happens at facilities when they don’t fully investigate production failures. When you do not eliminate the defects from your system by getting to the systemic causes, you allow a similar failure to occur down the road.

So, what should be done when a failure occurs?

First, you must preserve evidence. Evidence is key to any investigation. Without evidence, you do not have an investigation.

Second, you must study the evidence. If you are responsible for investigating a failure, it is imperative that you follow up promptly to study the evidence. It is not right to ask operations or maintenance to preserve evidence if you are not prompt in studying it.

Third, you must do your best to understand the physical root cause before putting the equipment back in service. This is hard, as there is always pressure to get the equipment back up and running. It means the culture of the organization must be one where people are prompt in examining the failed equipment. You must have a sense of urgency around analyzing the evidence, thinking through the possible ways the equipment could have failed and ruling each in or out based on the evidence you see. Once you have a good idea of the physical root cause, do your best not to reintroduce the defect when putting the equipment back together. You also need a management philosophy whereby, if you are prompt in responding to a failure, the organization gives you the breathing room to dig into the issue and prevent recurrence. This usually equates to a few extra hours, not days.

Fourth, you must convene a team to investigate the failure. Conducting a failure investigation with just one person is sloppy; it amounts to pencil whipping the investigation to satisfy a requirement rather than taking it seriously. You cannot properly investigate a failure by relying on the reliability engineer to do it alone. The team should include at least an operations representative, a maintenance representative and a reliability engineer.

Fifth, you need to use a process for conducting the investigation. Using a fault tree and 5 Whys is usually sufficient. Again, evidence drives the investigation. Asking “why” or “how” and then using evidence to either rule in or rule out possibilities is a practical way to conduct the investigation.
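As a rough illustration of letting evidence drive the investigation, the sketch below tracks the possible answers to a single “why” question and rules each in or out. It is a minimal sketch, not a prescribed tool: the `Hypothesis` class, the `surviving` helper and the bearing example are all hypothetical, invented here for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    """One possible answer to a 'why' or 'how' question (hypothetical structure)."""
    description: str
    supporting: List[str] = field(default_factory=list)     # evidence that rules it in
    contradicting: List[str] = field(default_factory=list)  # evidence that rules it out

    def status(self) -> str:
        # Contradicting evidence takes priority: one solid contradiction rules it out.
        if self.contradicting:
            return "ruled out"
        if self.supporting:
            return "ruled in"
        return "open"  # no evidence studied yet


def surviving(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    """Return the possibilities that the evidence has not ruled out."""
    return [h for h in hypotheses if h.status() != "ruled out"]


# Illustrative example only: "Why did the bearing fail?"
hypotheses = [
    Hypothesis("Bearing ran without lubrication",
               supporting=["Races are heat discolored"]),
    Hypothesis("Shaft was misaligned",
               contradicting=["Alignment record is within tolerance"]),
    Hypothesis("Wrong bearing was installed"),
]

for h in hypotheses:
    print(f"{h.description}: {h.status()}")
```

Each surviving possibility becomes the next “why” to ask, and any possibility still marked open points to evidence that has yet to be studied.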

Sixth, you must identify the three types of root causes: physical, human and systemic. Investigations often stop at physical root causes. Why? Because it is easy to stop there. The physical root cause identifies the flaw that caused the particular failure. However, simply identifying this cause does not necessarily prevent future failures.

You must identify the human root cause: what someone did to introduce the flaw into the system. This is a hard one since no one wants to place blame on a coworker. That is why it is imperative that you not stop there. Most people do not show up for work to do a bad job. You must understand why this individual introduced a flaw into the equipment. Understanding this leads to the final and most important type of root cause.

You must identify the systemic root cause. This cause explains why an individual made the decision he or she made. Identifying it and putting mitigating actions in place will not only prevent failures in the equipment being investigated, but also prevent failures in other equipment. That is why identifying the systemic root cause has far-reaching positive consequences.
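One way to keep an investigation from stopping at the physical cause is to treat it as unfinished until all three types are recorded. The sketch below assumes a hypothetical `RootCauses` record; the field names and the example entries are illustrative only and do not describe any particular system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RootCauses:
    """Hypothetical record of the three root cause types for one failure."""
    physical: Optional[str] = None  # the flaw that caused the failure
    human: Optional[str] = None     # what someone did to introduce the flaw
    systemic: Optional[str] = None  # why that person made the decision

    def missing(self) -> List[str]:
        """List the root cause types that have not been identified yet."""
        return [name for name in ("physical", "human", "systemic")
                if not getattr(self, name)]

    def is_complete(self) -> bool:
        return not self.missing()


# Illustrative example: an investigation that stopped at the physical cause.
record = RootCauses(physical="Bearing failed from lack of lubrication")
print(record.missing())      # the human and systemic causes are still open
print(record.is_complete())

# Filling in the remaining causes completes the record.
record.human = "Lubrication route was skipped"
record.systemic = "Route was dropped from the schedule when the plan was revised"
print(record.is_complete())
```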

Many companies have become serious about eliminating safety events, whether personal or process. They have done a great job of understanding these events and putting systems in place to prevent future ones. Thanks to this dedication, most industries are much safer.

It is time to bring this same dedication to reliability. It is time to start learning from production losses to prevent future failures. In doing so, companies can become even more profitable through increased reliability.

Just remember: If it were your money that was lost, how would you respond?

Don’t leave money on the table!

Kevin R. Strader, PE, CMRP, CRL, is a Global Petrochemicals Availability Engineer for BP. He is responsible for helping sites improve turnaround (TAR) performance and reliability. This is accomplished through the implementation of systems and processes, as well as ongoing coaching and training in the areas of TAR planning, execution and reliability improvements.
