Production in manufacturing is fraught with breakdowns. The goal of maintenance is to minimize the impacts, which include production losses, poor quality, increased cost and unsafe working conditions, to name a few. Maintenance does this by understanding why breakdowns occur and how best to mitigate their effect now and in the future. Root cause analysis (RCA) is a tool preferred by many maintenance professionals to gain the necessary understanding. RCA can put an organization on the path of continuous improvement. At Ford Motor Company, root cause analysis and the “4-D process” are synonymous terms.
Ford’s MOS (3×3) Process
Ford takes a traditional approach to maintenance: maintenance supports production. Basic inputs include people, machines and materials. The maintenance processes are captured in a 3×3 matrix geared around planned maintenance, proactive maintenance and unplanned maintenance with associated process metrics. Our goal is to increase throughput to potential. We have seven key unifying processes that impact and overlap our maintenance activities and link them to other operating systems. Part of continuous improvement is the feedback loop that is required to get better. Our focus here is on unplanned maintenance, specifically root cause analysis. Other essential components of the root cause analysis process are reaction plans and sharing of information.
Root Cause Analysis Defined
A search of the internet returns many interesting definitions for root cause analysis. The findings can be split into two groups. The first group is solely concerned with fixing the current problem at face value and returning the automation to its pre-existing condition. They look at what is broken and repair or replace failed components. Essentially, they settle for solving the symptom and getting production running again. This group is doomed to repeat history without seeing improvements. The second group seeks the seed from which the problem grew. This seed is the root cause or causal factor that affects the event’s outcome. When they understand how the failure came about, they also gain the knowledge needed to prevent it from happening again. Over time this leads to continued improvement and is the key to a true root cause analysis.
The 4-D process evolved from Ford’s 8-D process. The “D” stands for disciplines. The 8-D is defined as a problem-solving methodology for product and process correction and improvement. It is primarily used to address customer defects, understand deficiencies to regulatory requirements, etc. Our maintenance teams also used it in the past, but the resource commitment and access to data across organizations and sites became overwhelming. We shortened it specifically for maintenance to five disciplines and integrated it as part of our maintenance database. It is called a 4-D because the fourth discipline is “define and verify the root cause.” The purpose of the 4-D process is to add structure, document the process and improve throughput to potential. Everything we do in maintenance always comes back to that one goal.
4-D Process Flow
The 4-D process flow can be broken down into three phases: determining when to do a 4-D based on known failures, conducting the 4-D, and deciding how and with whom to share the 4-D findings.
Root Cause Analysis (4-D)
The process starts with either a production breakdown or an inspection anomaly. Both cases get documented in the maintenance database in the form of a work order. After the immediate issue is addressed and the work order closed, a decision must be made whether further analysis is required. We use a standard set of five questions that will be reviewed later. Upon determining a root cause analysis is required, we use the existing work order to create a new document called the 4-D. All the data is automatically transferred to the new form and the process begins. During the process there are ongoing discussions internally and with the Manufacturing Engineering (ME) organization to ascertain whether the problem impacts other like machines, future programs, preventive maintenance inspection criteria, etc.
Documenting Failure Modes
We believe all breakdowns should be documented. This creates a data set of known failures that assists in Failure Mode and Effects Analysis at multiple levels in the organization. That being said, we also understand capturing all breakdowns is not always practical and can create an administrative burden on the plants depending upon the maturity of their maintenance processes. Consequently, we have set minimum requirements by organization. Over time, as we get better, the minimum requirements are tightened. For instance, in machining departments the minimum started at four hours, then dropped to two hours and is now one hour.
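As an illustration only, a minimum documentation requirement of this kind can be expressed as a lookup by department. The function name, the table and the fallback behavior below are hypothetical; only the machining figure of one hour comes from the text, and treating "minimum" as "at or above the threshold" is an assumption.

```python
# Hypothetical sketch of a minimum documentation rule.
# Only the machining value is taken from the article; it was 4.0, then 2.0.
MIN_DOWNTIME_HOURS = {"machining": 1.0}


def must_document(department: str, downtime_hours: float) -> bool:
    """Return True when a breakdown is long enough that it must be documented."""
    # Assumption: a department with no configured minimum documents everything.
    threshold = MIN_DOWNTIME_HOURS.get(department, 0.0)
    # Assumption: a breakdown exactly at the threshold counts.
    return downtime_hours >= threshold
```

Tightening the requirement then means lowering a single value in the table rather than changing the procedure.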
Creating an Emergency Work Order
Documenting maintenance activities provides the data source for analysis and establishes a path to increased productivity. To capture information reliably, it is important to make the procedure simple, convenient and accurate so that it becomes an embedded part of the culture. Find the means to auto-populate as much data as practical so operators can limit their input to describing the problem and quickly get the work order into the hands of maintenance.
Creating Follow-up Work Orders/4-D
Skilled trades complete work orders and provide inputs that determine the next steps. Did they finish the work, is there follow-up action required, or did they put a Band-Aid in place to get production running? Their role in the data collection process is significantly more important since they can and must detail their actions to get production up and running. Again, simplifying the follow-up process is crucial to getting people to use it. What are the next steps? Close the issue? Create a follow-up work order? Create a 4-D for root cause analysis?
As you can see from the questions above, not all emergency work orders will result in a 4-D. We have established a set of five questions. If the answer to any one of them is “yes,” then a 4-D is required. The questions are: 1) Is the root cause of the event unknown? 2) Has this event or a similar event occurred previously? 3) Should the event have been prevented or predicted by our maintenance activities? 4) Is the event something that should be shared with other teams and plants? 5) Is this related to paint abatement? A sixth question is under consideration: Is this related to any governmentally regulated process? These types of issues already fall under the 8-D process and typically are wider in scope than dealing with maintenance concerns.
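The "any yes triggers a 4-D" rule can be sketched in a few lines. This is a minimal illustration, not Ford's implementation; the class and field names are hypothetical paraphrases of the five questions.

```python
from dataclasses import dataclass, fields


@dataclass
class ScreeningAnswers:
    """One boolean per screening question (field names are hypothetical)."""
    root_cause_unknown: bool = False
    occurred_previously: bool = False
    should_have_been_prevented_or_predicted: bool = False
    should_be_shared: bool = False
    related_to_paint_abatement: bool = False


def requires_4d(answers: ScreeningAnswers) -> bool:
    """A 4-D is required when the answer to any one question is yes."""
    return any(getattr(answers, f.name) for f in fields(answers))
```

Because the check iterates over the dataclass fields, adding a sixth question would only require adding one field.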
Closing an EM Work Order
We talked about making the process simple. When closing an emergency (EM) work order, a pop-up with the five questions above appears. If a user selects any one of these, then the work order closes and simultaneously creates a 4-D pre-populated with the contents of the parent EM work order. Users also have the option to select “none apply,” and then the work order closes per established practices.
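The close-and-create behavior can be pictured with the following sketch. It is a simplified model under stated assumptions: the record types, field names and status values are hypothetical and do not reflect the actual maintenance database schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkOrder:
    # Hypothetical, simplified emergency (EM) work order.
    number: str
    asset: str
    description: str
    status: str = "OPEN"


@dataclass
class FourD:
    # Hypothetical 4-D record pre-populated from its parent work order.
    parent_work_order: str
    asset: str
    description: str


def close_em_work_order(wo: WorkOrder, selected_questions: List[int]) -> Optional[FourD]:
    """Close the work order; create a 4-D if any screening question was selected."""
    wo.status = "CLOSED"
    if not selected_questions:
        # The user chose "none apply": the work order simply closes.
        return None
    return FourD(parent_work_order=wo.number, asset=wo.asset, description=wo.description)
```

Copying the parent's contents at creation time means the team starts the analysis without re-entering data.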
4-D Work Streams
Ford has three work streams for creating a 4-D/root cause analysis. The first and most common work stream is through the emergency work order process as previously described. This is the most common because the event was unplanned and unexpected, and we lacked the appropriate knowledge to prevent it in the first place. The second is to use a parallel process for other work types like preventive or predictive inspections. These work-order types characteristically result in planned maintenance but may still have an element of the unknown. The least used is the one where management makes the decision to directly create a 4-D for something previously documented or undocumented. Again, this is the least common because a root cause analysis revolves around failures/breakdowns.
D1 – Establish Team/Champion
Putting together the right team with the appropriate resources to conduct a root cause analysis determines the success or failure of the process. The team should have a champion to break down barriers, a leader to spearhead the efforts on working the analysis, and a diverse membership with appropriate technical skills. As a rule of thumb, the membership would include an operator, skilled trades and a process engineer. One person does not make a team, nor does it necessarily provide an unbiased investigation. Too many people can stymie the process, generally making it hard to reach an agreement. Limit the size to four to 10 people with complementary skills.
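The team guidelines above lend themselves to a simple checklist. The sketch below is hypothetical: the function name, the role labels and the choice to count only listed members toward team size are assumptions, and the result is a list of concerns to discuss rather than a hard rule.

```python
from typing import Dict, List, Optional

# Rule-of-thumb roles named in the text.
RECOMMENDED_ROLES = {"operator", "skilled trades", "process engineer"}


def team_concerns(
    roles_by_member: Dict[str, str],
    champion: Optional[str],
    leader: Optional[str],
) -> List[str]:
    """Return a list of ways the proposed team departs from the guidelines."""
    concerns = []
    if champion is None:
        concerns.append("no champion")
    if leader is None:
        concerns.append("no leader")
    # Guideline: four to 10 people.
    if not 4 <= len(roles_by_member) <= 10:
        concerns.append("team size outside 4-10")
    missing = RECOMMENDED_ROLES - set(roles_by_member.values())
    for role in sorted(missing):
        concerns.append(f"missing role: {role}")
    return concerns
```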
D2 – Describe the Problem
The problem description should detail what is wrong by answering what, when and where. It should also quantify the size of the problem and the business impact. These details are important in prioritizing the workload and when sharing the potential impact on like operations. It is imperative during this process not to try to solve the problem or make the leap to a conclusion. This will only lead to biasing the outcome. If assumptions are included here or anywhere else in the process, identify them as such so the team recognizes the difference and non-participants in the process don’t later confuse them with facts.
D3 – Interim Containment Actions (ICAs)
An ICA is a temporary stop-gap intended to protect your customers and return automation to a production state as quickly as possible. Any and all of these actions must be verified so they don’t trigger undesirable or unintended effects. In some instances, a risk analysis and continuous monitoring become integral elements of an ICA until the permanent corrective action is available. This is especially critical if you are now producing parts by an off-standard process. Quick ICA solutions are natural byproducts of a manufacturing environment, but don’t jeopardize a quality solution for the sake of expedience. Too often we see interim solutions become the permanent solution because other problems compete for attention, or simply through laziness or forgetfulness. These usually come back with a vengeance until resolved correctly.
D4 – Root Cause Analysis
The RCA is all about finding the seed or causal factor that produced the problem. RCAs do not always require data, but data removes subjectivity. Keep in mind data comes in many forms: human input, software history files, software I/O, equipment age, manufacturer R&M data, metallurgical analysis, etc. The idea is to find the right inputs that point to the root cause. Do not underestimate the value of input from operators. They may not know the technical aspects of a failure. Through their six senses they probably had clues the failure was pending and just did not realize it. This is one example of what we would call an escape point, where the failure could or should have been detected and contained. More often than not, our automation sends audio, visual or other signals before it breaks.
The actual analysis is done using a variety of tools. The most common tool is the “5 Why” process. The intent is to keep asking why, and it may be more than five times, until the base causal factor is determined. Other tools Ford commonly uses are from the Six Sigma tool kit. The data or lack of data will dictate the best tool to use for the analysis. We also recommend incorporating a failure modes and effects analysis (FMEA) combined with a reliability-centered maintenance (RCM) approach to the analysis, as it will apply to the permanent corrective action (PCA).
4-D Report for Tracking Progress
The 4-D is well under way. A team has been put together, the problem has been described, ICAs are in place and the actual analysis has begun. In order to track the assignments and progress, we created a report in our maintenance data management system called the 4-D status report. It contains links to the originating record (the EM that started the process), links to each assignment (follow-up work orders), and the 4-D record. The report can be scheduled at any set frequency to be delivered via email to team members or management. The report shows open and completed 4-Ds, and includes start and due dates. We realistically expect 4-Ds to be completed well within the 120-day timeframe we have set. A 4-D that is still open after 120 days is flagged so management can understand the reasons for the lengthy investigation.
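The 120-day rule amounts to a date comparison. The following is a minimal sketch with a hypothetical function name; it assumes the clock starts on the 4-D start date and that a 4-D is flagged only once more than 120 days have elapsed without completion.

```python
from datetime import date
from typing import Optional

MAX_OPEN_DAYS = 120  # timeframe stated in the text


def is_flagged(start: date, today: date, completed: Optional[date] = None) -> bool:
    """Return True when an open 4-D has exceeded the 120-day timeframe."""
    if completed is not None:
        # Assumption: completed 4-Ds are never flagged.
        return False
    return (today - start).days > MAX_OPEN_DAYS
```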
D5 – Permanent Corrective Actions (PCAs)
Now that the root cause is known, it is time to determine the best action to eliminate/minimize the effect/consequence of that cause without creating undesirable or unintended effects. The preferred solution is an engineered solution that requires no monitoring. Sometimes this is not possible or practical, so we start looking at alternative solutions. The next best solution is using a predictive tool to forecast impending failures, allowing maintenance to plan for the event and schedule actions. If we cannot predict failure, then our next look is to prevent failure.
This activity often encompasses creating or editing a PM to specifically look for the failure mode or a sign the root cause is in play. Lastly, in some instances, the business case exists to allow the event to run to failure. This usually only applies to low impact failures. Regardless of the solution, the PCA should include a holistic view of the total environment, which includes reviewing and modifying the necessary systems, policies, practices and procedures to prevent a recurrence. Also, document all lessons learned including the problem, cause and remedy. The final step is to communicate findings to the appropriate organizations and sites so they don’t have to learn the same hard lessons.
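The order of preference described above (engineered, predictive, preventive, run to failure) can be sketched as a ranked choice. The enum and function names are hypothetical, and deciding which options are actually feasible remains an engineering and business judgment that the code does not model.

```python
from enum import IntEnum
from typing import Iterable


class PcaStrategy(IntEnum):
    # Lower value means more preferred, following the order in the text.
    ENGINEERED = 1      # engineered solution requiring no monitoring
    PREDICTIVE = 2      # forecast impending failure and plan the work
    PREVENTIVE = 3      # create or edit a PM targeting the failure mode
    RUN_TO_FAILURE = 4  # accepted only when the business case supports it


def choose_pca(feasible: Iterable[PcaStrategy]) -> PcaStrategy:
    """Pick the most preferred strategy among those judged feasible."""
    options = list(feasible)
    if not options:
        raise ValueError("no feasible strategy supplied")
    return min(options)
```

For example, if an engineered fix is impractical but both a predictive tool and a PM change are possible, the predictive option is selected.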
What Does a 4-D Look Like at Ford?
Ford has taken Maximo and added data tables to include D1-D5 so the RCAs become part of the maintenance history for our assets. All fields must be completed before a 4-D can be closed. The team makes the final determination on whether the content of the RCA should be shared. It then selects the appropriate distribution list based on a pre-configured dropdown list.
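The "all fields must be completed" rule can be illustrated with a small validation check. This sketch does not use or represent the Maximo API; the dictionary keys and function names are hypothetical stand-ins for the D1-D5 fields.

```python
from typing import Dict, List

REQUIRED_DISCIPLINES = ("D1", "D2", "D3", "D4", "D5")


def missing_disciplines(record: Dict[str, str]) -> List[str]:
    """List the disciplines whose text is absent or blank."""
    return [d for d in REQUIRED_DISCIPLINES if not (record.get(d) or "").strip()]


def can_close(record: Dict[str, str]) -> bool:
    """A 4-D may be closed only when every discipline has been filled in."""
    return not missing_disciplines(record)
```

Returning the list of missing disciplines, rather than only a yes or no, lets the form tell the team exactly what is left to complete.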
Global 4-D Status
In 2016, the 4-D process became the official RCA process for maintenance-related failures. Last year, Ford plants conducted nearly 15,000 root cause analyses. These helped contribute to year-over-year throughput improvements. Of these 15,000 records, 1,500 or 10 percent were communicated to the ME organizations. Not all have led to changes, but many have resulted in improvements in equipment reliability, better preventive maintenance tasks, increased standardization and so on. The total 4-D process is still evolving. It is leading to increased involvement by the ME organizations in plant maintenance activities, changes in how inspection tasks are defined and distributed, and better communications.
4-D Process Flow – Improving Communications
The 4-D process flow is dynamic, and we expect to make changes this year to get ME involved earlier in the process to gain the benefit of their expertise. We will incorporate the use of Email Listener in Maximo so ME has input into the 4-D process from the root cause analysis onward. Some of our organizations already indirectly do this and have achieved numerous benefits.
This article was previously published in the Reliable Plant 2017 Conference Proceedings.
By Gordon Van Dusen, Ford Motor Company