Many organizations struggle with a high level of reactive uncertainty when it comes to asset reliability. This uncertainty affects the equipment availability needed to meet customer demands. The loss of availability results in increased human and monetary costs and, often, safety or environmental compliance issues. When digging deeper to determine the root causes, you often find an MRO storeroom that is not functioning well, a PM program that is poorly designed at best and lacks effective condition-based approaches, no maintenance planning, and a minimal weekly maintenance schedule, if one exists.
Add to that limited partnerships with other stakeholders, such as production, who don’t make the equipment available for maintenance. Often production personnel lack standardized work practices themselves when operating the equipment and induce failures. To establish a solid foundation, the organization must move forward with the basics. There are eight steps to build this foundation, presented in no particular order.
Aligning the Organization for Asset Reliability
In aligning the organization, there are three questions to address.
- Which job positions are needed?
- What are the role expectations?
- How is position performance measured?
Most organizations lack formal work processes that come in the form of graphical process workflows, RACI or RASI charts, and definitions documents. While these may seem trivial to many, and extra work with little return, the reality is that many organizations don’t have clearly defined roles and responsibilities. They may have job descriptions when hiring, but the activities are often different when in the field. Efforts overlap and are often duplicated in the name of getting the job done. The onboarding process for new team members takes longer than necessary. These work processes define the roles required as well as their responsibilities.
In most organizations, I look for the following positions along with the spans of control where applicable:
When looking at the spans of control, think in terms of organizational size and the maturity of the processes. The spans of control are guidelines, and the numbers are intended for multiples of a position, meaning that if 55 technicians exist, then two to three planner schedulers are required. On process maturity, if the organization is new to or not robust in its maintenance planning and scheduling activities, err toward the lower number until the processes are established. For example, with a maintenance planner scheduler, use multiples of 20 technicians. In smaller organizations, don’t assume that you don’t need a role just because the number of technicians falls below the lower number on the span of control. You can easily justify a full-time planner scheduler for eight to 10 technicians.
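As a rough sketch of the span-of-control arithmetic, the helper below rounds the technician count up to the next multiple of the span. The function name is hypothetical, and the upper span of 30 technicians per planner is an assumption inferred only from the 55-technician example; the lower span of 20 comes from the text.

```python
import math

def planners_needed(technicians: int, span: int) -> int:
    """Planner schedulers needed when one planner covers `span` technicians."""
    if technicians <= 0:
        return 0
    # Round up: a partial multiple of the span still needs a planner.
    return math.ceil(technicians / span)

# Lower end of the span (20 technicians per planner, from the text).
immature_process = planners_needed(55, 20)   # 3
# Upper end of the span (30 is an assumed figure, not from the text).
mature_process = planners_needed(55, 30)     # 2
# A small crew still justifies one planner scheduler.
small_site = planners_needed(8, 20)          # 1
```

Erring toward the smaller span while the planning process is still immature gives the higher headcount, which matches the guidance above.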
The position of maintenance or reliability engineer is overlooked in smaller organizations. This position is all about continuous improvement. I encourage organizations to dedicate some level of resourcing to the function, even if it is a technician four hours per week looking at problems and equipment history to improve.
When you place people in positions, please don’t forget the requisite training and coaching that needs to occur. For example, we promote technicians to supervisors but never teach them how to lead. The number one reason for leaving a position is the supervisor or manager. The same goes for the planner scheduler.
Determining Your Maintenance Strategies
Most organizations have little or no basis for their preventive or predictive maintenance program other than the OEM manual. The reality is that unless the OEM has field service technicians, many do not have the data to understand the failure modes of their equipment. The vendors that supply them don’t have it either. You should have the failure data from your CMMS history, but that is typically never shared with them. Oh wait, you don’t have it either, right? Do you have good failure data in your CMMS? Our experience from asking for the information when doing RCM2 or RCM3 and FMEAs across the globe shows that most don’t have the necessary details.
We find that organizations are doing too much time-based preventive maintenance. In many cases, those same organizations are issuing multiple PMs for a given period on the same asset, e.g., on a weekly or monthly basis. Many of these PMs are the result of knee-jerk reactions to past failures and do not come from a root cause perspective. I have seen as many as 10 separate PMs for the same weekly PM frequency on the same machine. Sadly, most of these PM tasks (40-60 percent from RCM2 studies) fail to address any likely failure modes.
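A quick way to spot this pattern is to group the PM list by asset and frequency and flag any group with more than one PM. This is a minimal sketch: the record layout, asset names and PM numbers are made up for illustration, and a real CMMS export will look different.

```python
from collections import defaultdict

# Hypothetical PM records exported from a CMMS: (asset_id, frequency, pm_id)
pm_records = [
    ("PUMP-101", "weekly", "PM-001"),
    ("PUMP-101", "weekly", "PM-002"),
    ("PUMP-101", "monthly", "PM-003"),
    ("FILLER-7", "weekly", "PM-004"),
    ("PUMP-101", "weekly", "PM-005"),
]

def find_overlapping_pms(records):
    """Return {(asset, frequency): [pm_ids]} where more than one PM exists."""
    groups = defaultdict(list)
    for asset_id, frequency, pm_id in records:
        groups[(asset_id, frequency)].append(pm_id)
    # Keep only the asset/frequency pairs that carry multiple PMs.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

overlaps = find_overlapping_pms(pm_records)
for (asset_id, frequency), ids in overlaps.items():
    print(f"{asset_id} has {len(ids)} {frequency} PMs: {', '.join(ids)}")
```

Each flagged group is a candidate for consolidation, and for a check on whether its tasks address a likely failure mode at all.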
Most organizations in the top percentile of performance do only about 20 percent of their physical maintenance based on time. Moreover, intrusive maintenance introduces failure to an otherwise stable system at the rate of 70 percent or more. In his book, Don’t Just Fix It, Improve It, Winston Ledet says that 84 percent of failures are due to poor work behaviors.
The essential point is that to be effective, there must be a basis such as reliability-centered maintenance (RCM2, RCM3, or FMEA) that couples proven methods along with your equipment experience to define a maintenance strategy. These asset reliability strategies will be a combination of condition-based, time-based and predictive technologies. As part of the analyses, you will also determine failure modes that must be addressed from the perspective of training, standardized procedures or re-engineering. Once these strategies are defined, they are implemented in the CMMS and triggered for execution.
When failures occur, a root cause analysis (RCA or RCFA) process should attempt to identify the root causes. The maintenance tasks and strategies should then be reviewed to confirm that they are likely to prevent the failure or mitigate its consequences. If changes are required, follow-through must occur to ensure asset reliability.
The MRO Storeroom
A vital component to ensuring effective work execution and improved asset reliability is a well-managed MRO storeroom or materials management process. The reality is that most storerooms are either models of excellence or very poorly executed. There does not seem to be much middle ground when it comes to an organization’s storeroom practices.
Materials should be identified and acquired in advance for planned work. These materials should be assembled in kits and staged in secure areas for the forthcoming work. The storeroom should have a PM program for spare rotating equipment. The following processes should be continuously executed:
- First in, first out (FIFO) by using date-stamping practices to address shelf-life
- Obsolescence management with “where used” for all stored items
- Bills of materials in the CMMS
- Accurate nameplate data and item masters for both stock and non-stock items
- An effective storeroom layout based on ABC principles
- Adequate security and item transaction processes
- Minimum/maximum and safety stocking levels
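To illustrate the minimum/maximum idea, the sketch below reorders up to the maximum once the on-hand quantity falls to or below the minimum. The class, field names and part numbers are hypothetical, and triggering at "at or below minimum" is one common convention, assumed here rather than taken from the text.

```python
from dataclasses import dataclass

@dataclass
class StockItem:
    part_number: str
    on_hand: int
    minimum: int   # reorder point
    maximum: int   # order-up-to level

def reorder_quantity(item: StockItem) -> int:
    """Quantity to order under a simple min/max rule (0 if no order is due)."""
    if item.on_hand <= item.minimum:
        # Bring the stock back up to the maximum level.
        return item.maximum - item.on_hand
    return 0

bearing = StockItem("BRG-6205", on_hand=2, minimum=4, maximum=10)
belt = StockItem("BELT-A42", on_hand=6, minimum=4, maximum=10)
print(reorder_quantity(bearing))  # 8
print(reorder_quantity(belt))     # 0
```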
When storerooms are poorly managed, it is common to find more than 50 percent of the materials obsolete. Old, removed (worn out) parts litter the storeroom shelves waiting for reuse, only to fail quickly once installed. Drive belts hanging from pegs on the wall are cracked and dry-rotted. Conditions like these are counterproductive to ensuring asset reliability. The storeroom is or becomes a cost burden instead of a profit center when poor practices exist.
Identifying and Prioritizing the Work
In the reactive organization, it is common to find that work, especially emergency or urgent work, remains undocumented in the CMMS. Some sites do better by requiring technicians to complete a work order before they can charge storeroom parts. From a best practices perspective, we look for 90 percent of all work to be planned and scheduled. In many reactive organizations, we find upward of 60-90 percent unplanned work.
Every hour spent planning the work saves three to five hours in execution. But to plan the work, it needs to be identified in the CMMS. By identifying the work, regardless of whether it is corrective, emergency or urgent, the record provides equipment history. We understand how long an asset is down, the asset reliability, and where we are spending our maintenance dollars. The maintenance or reliability engineer can then use that equipment history to improve asset reliability and reduce costs overall.
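Once the work is in the CMMS, the planned-work percentage can be computed from the work order history. The sketch below assumes a made-up record layout with a work type and labor hours per work order, and it measures the split by hours, which is one of several reasonable choices.

```python
# Hypothetical work order history; field names and values are illustrative only.
work_orders = [
    {"id": "WO-1", "type": "planned", "hours": 6.0},
    {"id": "WO-2", "type": "planned", "hours": 3.0},
    {"id": "WO-3", "type": "emergency", "hours": 2.0},
    {"id": "WO-4", "type": "urgent", "hours": 1.0},
]

def planned_work_percent(orders) -> float:
    """Share of labor hours spent on planned work, as a percentage."""
    total = sum(o["hours"] for o in orders)
    if total == 0:
        return 0.0
    planned = sum(o["hours"] for o in orders if o["type"] == "planned")
    return 100.0 * planned / total

print(planned_work_percent(work_orders))  # 75.0, short of the 90 percent target
```

Undocumented emergency work never appears in the total, so the figure only means something when every job is recorded.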
Rather than rushing to the tyranny of the urgent, we need to prioritize the work. To reach the 90 percent planned levels, we can’t treat everything as emergency or urgent work. You only have so many resources in each timeframe or shift. We must establish a priority matrix for execution. I was teaching a maintenance planner scheduler course for a large facility. During the time there, I would spend a few hours each day outside of class on a smaller scale effort to educate the production supervisors.
I asked a group of 10 production supervisors, each responsible for separate lines of production equipment, “When downtime strikes, whose line is most important?” They all raised their hands in the air. As a rule, maintenance resources are finite on a given shift. We need to deploy them based on asset criticality (risk) and priority. Here is an example of a priority matrix.
Notice that there are three planned work priorities beyond the PMs. This approach allows us to segment our planned work. The concept ensures that we are working on all priorities rather than just a single “routine” work class or priority code where less critical work falls off the radar screen and frustrates the requestor. Their response due to lack of action is to re-enter the work request as “safety” to hopefully guarantee the work completion.
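One simple way to combine asset criticality and work priority into a single ranking is to multiply the two scores and work the list from the top. The 1-to-5 scales, the multiplication rule and the work order data below are assumptions for illustration only and do not come from the author's matrix.

```python
# Hypothetical backlog; both scores use an assumed 1 (low) to 5 (high) scale.
jobs = [
    {"wo": "WO-10", "asset_criticality": 5, "work_priority": 2},
    {"wo": "WO-11", "asset_criticality": 2, "work_priority": 4},
    {"wo": "WO-12", "asset_criticality": 4, "work_priority": 4},
]

def rank_jobs(backlog):
    """Sort jobs so the highest criticality x priority score comes first."""
    return sorted(
        backlog,
        key=lambda job: job["asset_criticality"] * job["work_priority"],
        reverse=True,
    )

for job in rank_jobs(jobs):
    score = job["asset_criticality"] * job["work_priority"]
    print(job["wo"], score)
```

With a shared scoring rule, the answer to "whose line is most important?" comes from the asset's criticality rather than from whoever asks loudest.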
Planning to Ensure Reliability in All Your Assets
Planned work avoids delays, ensures materials availability and drives the efficiency of the crafts. When I talk about craft efficiencies, I’m not asking people to work harder. I am simply trying to give them the tools, materials and access to the equipment so they can do their job better. From an asset reliability perspective, job plans developed by the planner are written to a specification, i.e., torque values, gaps, fits, clearances, belt tension settings, alignment tolerances, etc. Enforce the use of the standardized work procedures. The intent is to eliminate poor work behaviors and human error. The human error rate is 40 percent on average. Besides, if everyone is doing it their own way, which way is the right way? When you have failures, which way potentially caused the failure?
Unfortunately, even where organizations have a maintenance planner scheduler position, only about 10 percent of all maintenance planners/schedulers are utilized correctly per best practices. Often these planners/schedulers have received no formal planning and scheduling training or coaching in the position. I sometimes find 10 to 12 planners/schedulers at a site. These individuals have been in the role for years, sometimes eight years or longer, with no understanding of their roles and responsibilities.
Effective planning is a key component to ensuring asset reliability in the short and long term.
Proactive Scheduling for Asset Reliability
Poor maintenance scheduling practices, or lack thereof, are a measure of the reactivity in the organization. In reactive organizations, partnerships between maintenance and production are limited, and many times siloed. As one example, a recent site where I taught a planning and scheduling course had no formal meeting to share production planning schedules, production requests or maintenance requirements to build a maintenance schedule. The technicians themselves were left to negotiate downtime across many different functions to perform maintenance. The organization was very reactive, and maintenance costs were higher than necessary.
At a minimum, there should be a weekly schedule completed in the current week for the following week. Ideally, we prefer better: a forward-looking two-week scheduling horizon. You already know the PMs that will be triggered up to a year or so in advance. These provide the base of the schedule. You also know of engineering projects on the horizon. Add to that the delivery of specific long lead items.
Think of it as a conveyor. At the head of the conveyor, items begin to drop on (PMs). As the conveyor advances to the current week, other items drop on the conveyor such as the material deliveries and engineering work. The conveyor continues to move forward to the current week. It’s like time; it doesn’t stop moving forward. Shorter forecast items start dropping on the conveyor. Think of the conveyor discharge as the current week. These items build the schedule for next week.
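The conveyor can be sketched as a filter over the backlog: anything due in the week after the current one drops onto next week's schedule. The backlog items, field names and dates below are made up, and treating a week as Monday through Sunday is an assumption.

```python
from datetime import date, timedelta

# Hypothetical items riding the conveyor, each with a target date.
backlog = [
    {"id": "PM-220", "kind": "PM", "due": date(2017, 4, 11)},
    {"id": "WO-310", "kind": "corrective", "due": date(2017, 4, 12)},
    {"id": "WO-305", "kind": "long-lead material job", "due": date(2017, 4, 14)},
    {"id": "ENG-07", "kind": "engineering project", "due": date(2017, 4, 24)},
]

def next_week_schedule(items, today):
    """Return the items due in the Monday-to-Sunday week after the current one."""
    # weekday() is 0 for Monday, so this lands on the following Monday.
    start = today + timedelta(days=7 - today.weekday())
    end = start + timedelta(days=6)
    return [item for item in items if start <= item["due"] <= end]

# Built during the current week (here, Wednesday, April 5, 2017).
schedule = next_week_schedule(backlog, today=date(2017, 4, 5))
print([item["id"] for item in schedule])  # ['PM-220', 'WO-310', 'WO-305']
```

Widening `end` by another seven days would give the two-week horizon preferred above.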
An effective schedule ensures that we have the time to do the work right and reduce potential rework from rushed efforts. Coupling effective planning with scheduling is another driver toward increased asset reliability.
Continuous Improvement Loops to Move Forward
On work completion, specifically planned and scheduled work, we must have a continuous improvement loop to improve the processes. Many proactive organizations issue a feedback form with the work order package. Using Deming’s concept of the Plan, Do, Check and Act cycle, this piece is the check portion. We get to address the following questions:
- Were you able to complete the job?
- Was the scope of the job correctly identified?
- Was the time estimate appropriate for the work?
- Did we have the right materials?
- Were the listed tasks and specifications correct?
- Is any follow-up work required?
- Did we identify the correct crafts?
Ask other relevant questions. The point is to improve constantly. We want to improve items such as the job plans, along with the skill and knowledge of all involved. To me, there is no such thing as a perfect job plan. There is always room to get better.
Auditing and Measuring for Success
Much like the continuous improvement loop, auditing gives us a method to ensure our processes are working as intended. If not, we can adjust based on what we have learned. To audit, you should be randomly pulling three or so completed work orders from the stack. Take the plant manager, maintenance manager, planner scheduler, storeroom coordinator and technician(s) who completed the work. Walk down the job and ask:
- Was the work properly scoped?
- Was the asset/component properly identified at the right level (lowest) in the asset hierarchy?
- Was the work planned and the plan correct with tasks, parts and priority as examples?
- Were the parts and materials correctly kitted and staged?
- Was the work completed as scheduled?
- Did the feedback form get properly utilized with adequate closure information?
- Did work order completion and closure occur?
If we find issues from the audit, we need to review the business processes. What is not working well, and how do we improve the process? While poor efforts on behalf of the technician may be evident on the walk down, the point of the audit is continuous improvement. We need to look at the entire process, not the technician alone. There may be valid reasons for an exception, and we need to understand them to improve.
A plant manager at a brewery once told me, “We measure what we treasure.” In addition to the feedback loop and the audit process, we must measure to improve. There are lots of metrics that organizations can review. In the end, focus on what truly adds value and drives behaviors. From a work execution perspective, I would suggest four core metrics for starters.
- PM/PdM Compliance
- Schedule Compliance
- Schedule Break-ins
- Inventory Turns by Month
The first two of these are available on the SMRP website. Depending on the behaviors I am trying to drive, I may measure these in terms of hours in addition to the number of work orders. The third metric is simply a listing and count of the items that took higher priority over items on the schedule. It is OK to break the schedule provided we are breaking it for higher priority work. The fourth metric gives an indication of storeroom performance.
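As a simplified sketch, the four metrics reduce to two ratios, a count and a turnover figure. The formulas below are plain approximations with made-up numbers, not the official SMRP definitions, and the function names are hypothetical.

```python
def percent(completed: float, scheduled: float) -> float:
    """Completed work as a percentage of scheduled work (count or hours)."""
    return 100.0 * completed / scheduled if scheduled else 0.0

def inventory_turns(value_issued: float, average_inventory_value: float) -> float:
    """Value issued in the period divided by the average value held."""
    return value_issued / average_inventory_value if average_inventory_value else 0.0

# Illustrative figures for one period.
pm_compliance = percent(completed=45, scheduled=50)        # 90.0
schedule_compliance = percent(completed=68, scheduled=80)  # 85.0

# Break-ins are simply listed and counted.
break_ins = ["WO-401 (safety)", "WO-407 (line down)"]
break_in_count = len(break_ins)                            # 2

monthly_turns = inventory_turns(25_000, 100_000)           # 0.25
```

Passing hours instead of work order counts to `percent` gives the hours-based view mentioned above.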
If you cherry-pick and implement a few of these steps in isolation, they will improve asset reliability in the short term. However, the gains will not be sustained over time. Think and implement more holistically. The eight steps outlined, when integrated together, form a comprehensive foundation for success. Together, they address the elements of people, processes and profit, which are fundamental to any business transformation. If you don’t know where to begin, seek help either internally or externally. Focus on behaviors first. Change begins with one. Let’s get started.
This article was previously published in the Reliable Plant 2017 Conference Proceedings.
By Jeff Shiver, CMRP, CPMM, CRL, Certified RCM2 and RCM3 Practitioner