Reliability Centered Maintenance: 9 Principles of Modern Maintenance

In this article I provide a brief history of the development of Reliability Centered Maintenance (RCM). And from there we explore 9 Modern Maintenance Principles. As a maintenance & reliability practitioner you should know these principles and live by them.

 

Fix it when it breaks

For most of human history, we’ve had a very simple approach to maintenance: we fixed things as they broke. This served us well from our early days huddled around campfires until about World War II.

In those days industry was not very complex or highly mechanized. Downtime was not a major issue and preventing failures wasn’t a concern. At the same time, most equipment in use was simple and more importantly, it was over-designed. This made equipment reliable and easy to repair. And most plants operated without any preventive maintenance in place. Maybe some cleaning, minor servicing and lubrication, but that was about it.

This simple ‘fix it when it breaks’ approach to maintenance is often referred to as First Generation Maintenance. 1

 

Things changed during World War II.

Wartime increased the demand for many, diverse products. Yet at the same time, the supply of industrial labour dropped. Productivity became a focus. And mechanization increased. By the 1950’s more and more complex machines were in use across almost all industries. Industry as a whole had come to depend on machines.

And as this dependence grew, it became more important to reduce equipment downtime. ‘Fix it when it’s broken’ no longer suited industry. A focus on preventing equipment failures emerged. And the idea took hold that failures could be prevented with the right maintenance at the right time. In other words, the industry moved from breakdown maintenance to time-based preventive maintenance. Fixed interval overhauls or replacements to prevent failures became the norm.

This approach to preventive maintenance is known as Second Generation Maintenance. 2

 

More Maintenance, More Failures

Between the 1950s and 1970s, a third generation of maintenance was born in the aviation industry.

After World War II air travel became widely accessible. And passenger numbers grew fast. By 1958 the Federal Aviation Administration (FAA) had become concerned about reliability. And passenger safety.

At the time the dominant thinking was that components had a specific life. That components would fail after reaching a certain “age”. Replacing components before they reached that age would thus prevent failure. And that was how you ensured reliability and passenger safety.

In the 1950’s and 1960’s the typical aircraft engine overhaul was every 8,000 hours. So when the industry was faced with an increasing number of failures, the conclusion was easy. Obviously component life must be less than the 8,000 hours that was being assumed. So, maintenance was done sooner. The time between overhauls was reduced.

Easy, right?

But, increasing the amount of preventive maintenance had three very unexpected outcomes. Outcomes that eventually turned the maintenance world upside down.

First of all, the occurrence of some failures decreased. That was exactly what everybody expected to happen. All good.

The second outcome was that a larger number of failures occurred just as often as before. That was not expected and slightly confusing.

The third outcome was that most failures occurred more frequently. In other words, more maintenance led to more failures. That was counter-intuitive. And a shock to the system.

 

The Birth of Reliability Centered Maintenance (RCM)

To say that the results frustrated both the FAA and the airlines would be an understatement. The FAA worried that reliability had not improved. And the airlines worried about the ever-increasing maintenance burden.

So during the 1960’s the airlines and the FAA established a joint task force to find out what was going on. After analyzing 12 years of data the task force concluded that overhauls had little or no effect on overall reliability or safety.

For many years engineers had thought that all equipment had some form of wear out pattern. In other words, that as equipment aged the likelihood of failure increased. But the study found this universally accepted concept did not hold true.

Instead, the task force found six patterns describing the relationship between age and failure. And that the majority of failures occur randomly rather than based on age.

The task force findings were used to develop a series of guidelines for airlines and airplane manufacturers on the development of reliable maintenance schedules for airplanes.

The first guideline, titled “Maintenance Evaluation and Program Development”, came out in 1968. The guide is often referred to as MSG-1 and was specifically written for the Boeing 747-100.

The maintenance schedule for the 747-100 was the first to apply Reliability Centered Maintenance concepts using MSG-1. And it achieved a 25% to 35% reduction in maintenance costs compared to prior practices.

As a result, the airlines lobbied to remove all the 747-100 terminology from MSG-1. They wanted the maintenance schedules for all new commercial planes designed using the same process.

The result was MSG-2, released in 1970 titled “Airline/Manufacturer Maintenance Program Planning”.

 

Amazing results from the first applications of Reliability Centered Maintenance (RCM)

The move to 3rd Generation or Reliability Centered Maintenance as outlined in MSG-1 and MSG-2 was dramatic.

The DC-8’s maintenance schedule used traditional, 2nd Generation Maintenance concepts. It required the overhaul of 339 components and called for more than 4,000,000 labour hours before reaching 20,000 operating hours.

Compare that to the maintenance schedule for the Boeing 747-100, developed using MSG-1. It required just 66,000 labour hours before reaching the same 20,000 operating hours! 3

Another interesting comparison is to compare the number of items requiring fixed-time overhauls. The maintenance for the DC-10 was developed using MSG-2 and required the overhaul of just 7 items versus the 339 on the DC-8.

And both the DC-10 and Boeing 747-100 were larger and more complex than the DC-8.

Impressive results. And the US Department of Defense (DoD) thought so too.

 

The US Department of Defense gets involved

So in 1974 the DoD asked United Airlines to write a report on the processes used to write reliable maintenance programs for civilian aircraft. And in 1978 Stan Nowlan and Howard Heap published their report. It was titled “Reliability Centered Maintenance”.

Since then a lot more work was done to progress the cause of Reliability Centered Maintenance. The airline industry has moved to MSG-3. John Moubray published his book RCM2 in the 1990’s introducing Reliability Centered Maintenance concepts to industry at large.

Nowadays, RCM is defined through international standards. But all modern day RCM approaches can be traced back to the work done in the 60’s and 70’s that culminated in the Nowlan & Heap report in 1978.

That’s now more than 40 years ago. So any Maintenance & Reliability professional should be familiar with it by now. It’s been around long enough. It’s well documented. And widely available.

Unfortunately we find that’s not the case. The principles of modern maintenance as developed in the journey to Reliability Centered Maintenance are not always known or understood. Let alone applied.

The rest of this article will outline those principles. They should underpin any sound maintenance program.

One of the best summaries of these principles can be found in the NAVSEA RCM Handbook. 4 I would highly recommend reading it. It is well written and easy to understand. And the following Principles of Modern Maintenance are very much built on the ‘Fundamentals of Maintenance Engineering’ as described in the NAVSEA manual.

 

9 Principles of Modern Maintenance

Whether you are developing a new maintenance program or improving the maintenance program for an existing plant, all reliable maintenance programs should be based on the following Principles of Modern Maintenance:

Principle #1: Accept Failures

Principle #2: Most Failures Are Not Age Related

Principle #3: Some Failures Matter More Than Others

Principle #4: Parts Might Wear Out, But Your Equipment Breaks Down

Principle #5: Hidden Failures Must Be Found

Principle #6: Identical Equipment Does Not Mean Identical Maintenance

Principle #7: “You Can’t Maintain Your Way To Reliability” 5

Principle #8: Good Maintenance Programs Don’t Waste Your Resources

Principle #9: Good Maintenance Programs Become Better Maintenance Programs

As a Maintenance & Reliability professional you must understand these principles.

You must practice them.

You must live by them.

 

Principle #1: Accept Failures

Not all failures can be prevented by maintenance. Some failures are the result of events outside our control. Think lightning strikes or flooding. For events like these, more or better maintenance makes no difference. Instead the consequences of events like these should be mitigated through design.

And maintenance can do little about failures that are the result of poor design, lousy construction or bad procurement decisions.

In other cases the impact of the failure is low so you simply accept a failure (think general area lighting).

So, good maintenance programs do not try to prevent all failures. Good maintenance plans and programs accept some level of failures and are prepared to deal with the failures they accept (and deem credible).

 

Principle #2: Most Failures Are Not Age Related

As explained above the research by the airline industry has shown that 70% – 90% of failure modes are not age-related. Instead, for most failure modes the likelihood of occurrence is random. Later research by the United States Navy and others found very similar results.

This research is summarised in the six different failure patterns shown below: 6 7

 

Figure: Reliability Centered Maintenance failure patterns

 

Apart from showing that most failure modes occur randomly, these failure patterns also highlight that infant mortality is common. And that it typically persists. That means that the probability of failure only becomes constant after a significant amount of time in service.

Don’t interpret Curves D, E, and F to mean that (some) items never degrade or wear out. Everything degrades with time, that’s life. But many items degrade so slowly that wear out is not a practical concern. These items do not reach the wear out zone in their normal operating life.

So what do these patterns tell us about our reliable maintenance programs?

Historically maintenance was done in the belief that the likelihood of failure increased over time (second generation maintenance thinking). It was thought that well timed maintenance could reduce the likelihood of failure. Turns out that for at least 70% of equipment this simply is not the case.

For the 70% of equipment which has a constant probability of failure there is no point in doing time-based life-renewal tasks like servicing or replacement.

It makes no sense to spend maintenance resources to service or replace an item whose reliability has not degraded. Or whose reliability cannot be improved by that maintenance task.

In practice this means that 70% – 90% of equipment would benefit from some form of condition monitoring. And only 10% – 30% can be effectively managed by time based replacement or overhaul.

Yet most of our PM programs are full of time based replacements and overhauls.
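
A minimal Python sketch can make this point concrete. It is an illustration only, not taken from the airline or Navy studies. It assumes a Weibull life model with made-up parameters (a 10,000 hour characteristic life and a 500 hour look-ahead window), and the function names are hypothetical. It compares the chance of failing in the next 500 hours for an item with a constant failure probability (shape = 1) and for an item that wears out (shape = 3):

```python
import math


def weibull_survival(age, shape, scale):
    """Probability that an item survives to `age` under a Weibull life model."""
    return math.exp(-((age / scale) ** shape))


def conditional_failure_probability(age, window, shape, scale):
    """Probability of failing within the next `window` hours, given survival to `age`."""
    surviving_now = weibull_survival(age, shape, scale)
    surviving_later = weibull_survival(age + window, shape, scale)
    return 1.0 - surviving_later / surviving_now


# Assumed values, for illustration only.
SCALE_HOURS = 10_000
WINDOW_HOURS = 500

for label, shape in [("random (shape = 1)", 1.0), ("wear out (shape = 3)", 3.0)]:
    for age in (0, 4_000, 8_000):
        p = conditional_failure_probability(age, WINDOW_HOURS, shape, SCALE_HOURS)
        print(f"{label:22s} age {age:>5d} h -> {p:.4f}")
```

Under these assumptions the random item shows the same probability at every age, so replacing it with a new one changes nothing. Only for the wear out item does the probability climb with age, and only there does a time based replacement buy you anything.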

 

Principle #3: Some Failures Matter More Than Others

When deciding on whether to do a maintenance task consider the consequence of not doing it. What would be the consequence of letting that specific failure mode occur?

Avoiding that consequence is the benefit of your maintenance.

The return on your investment.

And that is exactly how maintenance should be seen: as an investment. You incur a maintenance cost in return for a benefit in sustained safety and reliability. And as with all good investments the benefit should outweigh the original investment.

So, understanding the consequences of failures is key to developing a good maintenance program. One with a good return on investment.

Just as not all failures have the same probability, not all failures have the same consequence.

Even if it relates to the same type of equipment.

Consider a leaking tank. The consequence of a leaking tank is severe if the tank contains a highly flammable liquid. But if the tank is full of potable water the consequence might not be of great concern.

Easy, right?

But what if the water is required for fire fighting?

Same tank, same failure but now we might be more concerned. We would not want to end up in a scenario of not being able to fight a fire because we had an empty tank due to a leak.

Apart from the consequence of a failure you also need to think about the likelihood of the failure actually occurring.

Maintenance tasks should be developed for dominant failure modes only. Those failures that occur frequently and those that have serious consequences but are less frequent to rare. Avoid assigning maintenance to non-credible failure modes. And avoid analyzing non-credible failure modes. It eats up your scarce resources for no return.

A maintenance program should consider both the consequence and the likelihood of failures. And since Risk = Likelihood x Consequence we can conclude that good maintenance programs are risk based.

Good maintenance programs use the concept of risk to assess where to use our scarce resources to get the greatest benefit. The biggest return on our investment.
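
As a rough sketch of what risk based ranking can look like, the Python below applies Risk = Likelihood x Consequence to a handful of failure modes and sorts them. The failure modes echo the examples above, but every likelihood and cost figure is invented for illustration, and the `FailureMode` class is a hypothetical helper rather than part of any standard:

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    likelihood_per_year: float  # expected failures per year (assumed figure)
    consequence_cost: float     # cost of one failure (assumed figure)

    @property
    def risk(self) -> float:
        # Risk = Likelihood x Consequence, here as expected cost per year
        return self.likelihood_per_year * self.consequence_cost


def rank_by_risk(failure_modes):
    """Return failure modes sorted from highest to lowest risk."""
    return sorted(failure_modes, key=lambda fm: fm.risk, reverse=True)


failure_modes = [
    FailureMode("Area lighting lamp fails", 2.0, 200),
    FailureMode("Potable water tank leaks", 0.1, 5_000),
    FailureMode("Fire water tank leaks", 0.1, 2_000_000),
    FailureMode("Flammable liquid tank leaks", 0.05, 10_000_000),
]

for fm in rank_by_risk(failure_modes):
    print(f"{fm.name:30s} risk = {fm.risk:>10,.0f} per year")
```

With these made-up numbers the two water tanks have the same likelihood of leaking, yet they land far apart in the ranking because of what the water is used for. In practice many sites use a risk matrix rather than a single cost figure, but the ranking logic is the same.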

 

Principle #4: Parts Might Wear Out, But Your Equipment Breaks Down

A ‘part’ is usually a simple component, something that has relatively few failure modes. Some examples are the timing belt in a car, the roller bearing on a drive shaft, the cable on a crane.

Simple items often provide early signals of potential failure, if you know where to look. And so we can often design a task to detect potential failure early on and take action prior to failure.

For those simple items which do “wear out” there will be a strong increase in the probability of failure past a certain age. If we know the typical wear out age for a component, we can schedule a time-based task to replace it before failure.

When it comes to complex items made up of many “simple” components, things are different.

All those simple components have their own failure modes, each with its own failure pattern. Because complex items have so many, varied failure modes, they typically do not exhibit a wear out age. Their failures do not tend to be a function of age, but occur randomly. Their probability of failure is generally constant as represented by curves E and F.

Most modern machinery consists of many components and should be treated as complex items. That means no clear wear out age. And without a clear wear out age performing time based overhauls is ineffective. And wasteful of our scarce resources.

Only where we can prove that an item has a wear out age does performing a time based overhaul or component replacement make sense.

 

Principle #5: Hidden Failures Must Be Found

Hidden failures are failures that remain undetected during normal operation. They only become evident when you need the item to work (failure on demand). Or when you conduct a test to reveal the failure – a failure finding task.

Hidden failures are often associated with equipment with protective functions. Something like a high-high pressure trip. Protective functions like these are not normally active. They are only required to function by exception to protect your people from injury or death. To protect the environment from a major impact or protect our assets from major damage. This means we pretty much always conduct failure finding tasks on equipment with protective functions.

To be clear, a failure finding task does not prevent a failure. Instead a failure finding task does exactly what its name implies. It seeks to find a failure. A failure that has already happened, but has not been revealed to us. It has remained hidden.

We must find hidden failures and fix them before the equipment is required to operate.
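
How often should you test? The article does not prescribe a method, but one commonly used simplification can be sketched in a few lines of Python. It assumes a single protective device with a constant failure rate, a test interval that is short compared to the device’s MTBF, and a test that neither damages the device nor takes meaningful time. Under those assumptions the average unavailability is roughly the test interval divided by twice the MTBF. The function names and input values below are hypothetical:

```python
def average_unavailability(test_interval_years, mtbf_years):
    """Approximate fraction of time a hidden failure sits undetected.

    Uses unavailability ~ interval / (2 x MTBF), which assumes a constant
    failure rate and a test interval much shorter than the MTBF.
    """
    return test_interval_years / (2.0 * mtbf_years)


def failure_finding_interval(mtbf_years, target_unavailability):
    """Longest test interval that keeps average unavailability at the target."""
    return 2.0 * target_unavailability * mtbf_years


# Assumed values, for illustration only.
mtbf_years = 50.0   # mean time between failures of the trip
target = 0.01       # we accept the trip being unavailable 1% of the time

interval = failure_finding_interval(mtbf_years, target)
print(f"Test at least every {interval:.2f} years")
print(f"Check: unavailability = {average_unavailability(interval, mtbf_years):.3f}")
```

Treat the result as a starting point for discussion, not a design calculation. Redundant devices, tests that can themselves introduce failures, and regulatory requirements all change the answer.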

 

Principle #6: Identical Equipment Does Not Mean Identical Maintenance

Just because two pieces of equipment are the same doesn’t mean they need the same maintenance. In fact, they may need completely different maintenance tasks.

The classic example is two identical pumps in a duty-standby setup. 8 Same manufacturer, same model. Both pumps process the exact same fluid under the same operating conditions. But Pump A is the duty pump, and Pump B is the standby. Pump A normally runs and Pump B is only used when Pump A fails.

When it comes to failure modes Pump B has an important hidden failure mode: it might not start on demand. In other words, when Pump A fails or is under maintenance you suddenly find that Pump B won’t start. Oops.

Pump B doesn’t normally run so you wouldn’t know it couldn’t start until you came to start it. That’s the classic definition of a hidden failure mode. And hidden failure modes like this require a failure finding task i.e. you go and test to see if Pump B will start. But you don’t need to do this for Pump A because it’s always running (unless it’s off or has failed).

So when building a maintenance program you must consider the operating context.

A difference in criticality can also lead to different maintenance needs. Safety or production critical equipment will need more monitoring and testing than the same equipment in low criticality service.

It’s important to reinforce that identical equipment may need different maintenance requirements. This is far too often forgotten or simply ignored for convenience. But you could find yourself facing critical failures by ignoring this basic concept. Especially if you use a library of preventive maintenance tasks.

 

Principle #7: “You Can’t Maintain Your Way to Reliability”

I love this quote from Terrence O’Hanlon and it’s so very true. Maintenance can only preserve your equipment’s inherent design reliability and performance.

If the equipment’s inherent reliability or performance is poor, doing more maintenance will not help.

No amount of maintenance can raise the inherent reliability of a design.

To improve poor reliability or performance that’s due to a poor design, you need to change the design. Simple.

When you encounter failures – defects – that relate to design issues you need to eliminate them.

Sure, the more proactive and more efficient approach is to ensure that the design is right to begin with. But all plants start up with design defects. Even proactive plants. And that’s why the most reliable plants in the world have an effective defect elimination program in place.

 

Principle #8: Good Maintenance Programs Don’t Waste Your Resources

This seems obvious, right? But when we review PM programs we often find tasks that add no value. Tasks that waste resources and actually reduce reliability and availability.

It’s so common for people to say “whilst we do this, let’s also check this. It only takes 5 minutes.”

But 5 minutes here and there, every week or every month and we’ve suddenly wasted a lot of time. And potentially introduced a lot of defects that can impact equipment reliability down the line.

Another source of waste in our PM programs is trying to maintain a level of performance and functionality that we don’t actually need.

Equipment is often designed to do more than what it is required to do in its actual operating conditions. As maintainers we should be very careful about maintaining to design capabilities. Instead, in most cases we should maintain our equipment to deliver to operating requirements. Maintenance done to ensure equipment capacity greater than actually needed is a waste of resources.

Similarly, avoid assigning multiple tasks to a single failure mode. It’s wasteful and it makes it hard to determine which task is actually effective. Stick to the rule of a single, effective task per failure mode as much as you can. Only for very high consequence failure modes should you consider having multiple, diverse tasks for a single failure mode.

Most organisations have more maintenance to do than resources to do it with. Use resources on unnecessary maintenance, and you risk not completing necessary maintenance. And not completing necessary maintenance, or completing it late, increases the risk of failures.

And when that unnecessary maintenance is intrusive it gets worse. Experience shows that intrusive maintenance leads to increased failures because of human error. This could be simple mistakes. Or because of defective materials or parts, or errors in technical documentation.

A lot of maintenance is done with the equipment off-line. So doing unnecessary maintenance can also increase production losses.

So make sure you remove unnecessary maintenance from your system. Make sure you have a clear and legitimate reason for every task in your maintenance program. Make sure you link all tasks to a dominant failure mode. And have clear priorities for all maintenance tasks. That allows you to prioritise tasks. In the real world, we are all resource constrained.
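
A first-pass check along these lines is easy to automate. The Python sketch below assumes your PM tasks can be exported as simple records with a task name and a linked failure mode. The field names and sample tasks are hypothetical, not taken from any particular CMMS. It flags tasks with no failure mode and failure modes covered by more than one task:

```python
from collections import defaultdict


def audit_pm_program(tasks):
    """Return (tasks with no linked failure mode, failure modes with several tasks)."""
    unlinked = []
    tasks_by_mode = defaultdict(list)
    for task in tasks:
        mode = task.get("failure_mode")
        if mode:
            tasks_by_mode[mode].append(task["name"])
        else:
            unlinked.append(task["name"])
    # Keep only the failure modes that have more than one task assigned
    overlapping = {m: names for m, names in tasks_by_mode.items() if len(names) > 1}
    return unlinked, overlapping


# Hypothetical export from a PM program, for illustration only.
pm_tasks = [
    {"name": "Vibration check on pump bearing", "failure_mode": "Bearing seizes"},
    {"name": "Replace pump bearing every 2 years", "failure_mode": "Bearing seizes"},
    {"name": "Function test standby pump", "failure_mode": "Standby pump fails to start"},
    {"name": "General visual check", "failure_mode": None},
]

unlinked, overlapping = audit_pm_program(pm_tasks)
print("Tasks with no failure mode:", unlinked)
print("Failure modes with more than one task:", overlapping)
```

Anything the check flags is a candidate for review, not automatic deletion. A very high consequence failure mode may well justify more than one task.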

 

Principle #9: Good Maintenance Programs Become Better Maintenance Programs

The most effective maintenance programs are dynamic. They are changing and improving continuously. Always making better use of our scarce resources. Always becoming more effective at preventing those failures that matter to our business.

When improving your maintenance program you need to understand that not all improvements have the same leverage:

First, focus on eliminating unnecessary maintenance tasks. This eliminates the direct maintenance labour and materials. But it also removes the effort required to plan, schedule, manage, and report on this work.

Second, change time based overhaul or replacement tasks into condition based tasks. Instead of replacing a component every so many hours, use a condition monitoring technique to assess how much life the component has left. And only replace the component when actually required.

And third, extend task intervals. Do this based on data analysis, operator and maintainer experience. Or simply on good engineering judgment. Remember to observe the results.

The shorter the current interval, the greater the impact when extending that interval. For example adjusting a daily task to weekly reduces the required PM workload for that task by more than 80%.

This is often the simplest and one of the most effective improvements you can make.
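
The arithmetic behind interval extension is simple enough to sketch in Python. The helper names and the example intervals below are illustrative assumptions. The only figure taken from the text is the daily to weekly case:

```python
def workload_reduction(old_interval_days, new_interval_days):
    """Fraction of task executions saved by extending the interval."""
    if old_interval_days <= 0 or new_interval_days < old_interval_days:
        raise ValueError("new interval must be at least as long as a positive old interval")
    return 1.0 - old_interval_days / new_interval_days


def executions_saved_per_year(old_interval_days, new_interval_days):
    """Number of task executions avoided per year by extending the interval."""
    return 365.0 / old_interval_days - 365.0 / new_interval_days


# Illustrative interval changes, in days.
changes = [
    ("daily -> weekly", 1, 7),
    ("weekly -> fortnightly", 7, 14),
    ("monthly -> quarterly", 30, 90),
]

for label, old, new in changes:
    pct = workload_reduction(old, new) * 100
    saved = executions_saved_per_year(old, new)
    print(f"{label:22s} {pct:5.1f}% fewer executions, {saved:6.1f} saved per year")
```

Going from daily to weekly removes six out of every seven executions, which is where the “more than 80%” comes from. The absolute number of executions saved per year is also far larger for short-interval tasks than for tasks that already run monthly or less often.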

 

References

I wrote this article based on a number of key sources listed below (and throughout the article). I strongly recommend getting yourself a copy of Moubray’s book if you don’t already own a copy. And I’d definitely get the NAVSEA RCM manual as it’s well written and easy to understand:

 

Have you implemented these principles?

If so let me know how it went by leaving a comment below. Or if you’ve struggled with some of this, feel free to ask a question or share your experience:

72 Comments

  1. Jude on 6th Mar, 2018 at 12:20 AM

    Good one Erik. Very much pleased with the simple and concise manner of the article. My area of challenge is on developing a good maintenance programme. I observed that most of the maintenance programme being carried out is not really value adding, as it doesn’t prevent failure. It was because most of the programme was lifted from OEM manuals without doing a proper RCM study on that asset. Most of the maintenance tasks were not preventing any failure mode and yet they are being carried out. The big question now becomes, “why are we still having so many equipment failures when we carry out preventive maintenance activities”?
    I suggested carrying out PMO to remove non value adding tasks and probably replace with value added ones to improve equipment reliability. Gradually, things are changing.

    • Erik Hupjé on 6th Mar, 2018 at 11:51 AM

      Thanks for your comment Jude – great to hear that you initiated preventive maintenance optimisation in your plant and that you’re starting to see the results come through.

  2. Carlos Romero on 12th Mar, 2018 at 7:58 PM

    Hi Erik. Great, yet simple way to help explain the basics of good, proactive maintenance programs to many non-maintenance professionals (excellent to be used amongst project engineers with very limited maintenance understanding).
    One minor view: On PRINCIPLE 5 I suggest you clarify that Hidden failures must be found as long as these are Critical. Agree that most hidden failures are critical but not 100% of these are, hence the effort to find ALL, including the not-so-critical, might hinder the PM program, e.g. manual valve fails to close due to debris (most manual valves are not critical to the process design).
    Thanks for the time taken to compile this article. Wise PM thoughts!
    Cheers, Carlos

    • Erik Hupjé on 12th Mar, 2018 at 8:29 PM

      Hi Carlos, thanks for your comment. You’re 100% right: we don’t want to be looking for hidden failures that don’t really matter to our business. If we did we would be wasting our valuable resources on PM tasks that don’t really matter. In doing so we’d be violating Principle #8. Thanks again, I’ll update the article to make this more clear.

  3. Kader MELLEL on 24th Mar, 2018 at 5:34 AM

    Thanks sir for the very important article and the manner that you give us the history of maintenance used and developed in the other domains aeronautical engineering and marine corps, thanks a lot.

    • Erik Hupjé on 24th Mar, 2018 at 6:55 AM

      You’re very welcome. I hope you found the article useful.

  4. Mike Cook on 24th Mar, 2018 at 7:34 AM

    Excellent article. I wish more maintenance managers (and their managers) could be made to understand this. Over my career I have seen many wasted efforts and non-evidence based approaches to maintenance. A lot of these have been a result of senior management who don’t understand what maintenance is all about and maintenance managers who can’t communicate this effectively.

    • Erik Hupjé on 24th Mar, 2018 at 12:12 PM

      Hi Mike, thanks for your feedback. What you raise is exactly one of the goals of Road to Reliability: to influence how senior management see maintenance and help maintenance managers communicate more effectively to their management.

      • mak on 12th Jun, 2018 at 1:05 PM

        I agree with Mike. Sometimes it’s difficult to influence senior management due to the long hierarchy of approvals and the involvement of multiple departments for a small change in a complex organization.
        But it’s always good to put self effort to achieve self satisfaction and if it goes run then it will be great for organization and for self learning.

        • Agnes on 4th Sep, 2018 at 4:49 PM

          This is true. Unfortunately in most organisations the decision makers are non technical so they do not value or prioritise maintenance. Some even ask you ‘why fix it if it’s not broken’. There is a real need to find a way to make the non technical managers understand the value of maintenance and the benefits of moving with technological trends in maintenance.
          One seemingly minor defect picked during maintenance can save a whole drive train and plant at large.

      • Benson Palijah on 16th Sep, 2018 at 7:32 PM

        This is excellent. It will make a lot of difference in the results achieved.

  5. Waseem Ghani Waraich on 4th Apr, 2018 at 2:55 PM

    Hi Erik! I must say it was informative and concise article on this very demanding subject. When it comes to take senior management on board, I found cost based analytical approach, to establish if a maintenance task or a design change worth doing, an effective tool.

    • Erik Hupjé on 4th Apr, 2018 at 3:24 PM

      Thanks Waseem. You’re absolutely right, if we can express the benefits of what we do in money (either as a cost saving or as production increase) it is much easier to get senior management on board.

      • Dr Edwin Browne PhD on 5th Apr, 2018 at 3:34 PM

        Hello Erik
        Nice to note your efforts to improve the awareness for increased productivity through RCM.
        The article has covered all the major area where normally people do err in the strategy.
        I heard RCM performed in PDO yields the expected results.
        Any project in RCM – please feel free to contact.
        Dr Edwin Browne at [email protected]

  6. Waleed AL-Riyami on 4th Apr, 2018 at 10:26 PM

    Very interesting and useful article Erik and helped me a lot to figure out how to optimize the Preventive Maintenance tasks in my company since most of them are wasting for resources.
    Thanks a lot.

    • Erik Hupjé on 5th Apr, 2018 at 3:23 PM

      Hi Waleed, thanks for your feedback. Many PM programs are wasteful of resources, yours is definitely not the only one. Let us know how you go with improving it.

  7. DEELIP PRABHUDESAI on 8th Apr, 2018 at 4:44 AM

    Excellent article, very well compiled. Especially the 9 principles made a very logical and interesting reading. Worth reading for every maintenance professional.

    • Erik Hupjé on 9th May, 2018 at 12:46 PM

      Thank you Deelip, glad you enjoyed it

  8. Mutunga on 11th Apr, 2018 at 4:41 PM

    I did enjoy the article. Very informative.

  9. Jeremiah Jack on 24th Apr, 2018 at 10:29 PM

    Excellent article. I total agree with Mike Cook when it comes to management decision on equipment maintenance. I wish a lot of maintenance manager and their ups managers understand this excellent article. I have seen many wasted efforts even up till now. I need to secure my career opportunity in the future where I can utilize my diverse experience in the industry. Provide the energy people need in a reliable and sustainable method in an environment where I will be more expose and get more people to Join me in driving the change in maintenance organization.
    Excellent article. Well Done Sir

    • Erik Hupjé on 25th Apr, 2018 at 6:55 AM

      Thank you – please feel free to share the article

  10. David Trocel on 9th May, 2018 at 4:49 AM

    Thanks for sharing your vision. This is a very good article, with historical references. I would like your permission to translate it into Spanish and publish it in the 21st edition of the Confiabilidad Industrial magazine. (www.confiabilidad.com.ve) Again thanks.

    • Erik Hupjé on 9th May, 2018 at 6:21 AM

      Sure David, please drop me an email at [email protected] so we can discuss practicalities and any help you might need.

      • Promise on 9th May, 2018 at 8:36 AM

        I enjoy reading every line of this article,very informative and I am hoping that someday soon I will have the opportunity to develop and roll out a properly structured maintenance plan.thanks once again

        • Erik Hupjé on 9th May, 2018 at 8:47 AM

          Glad you enjoyed it! When the time comes feel free to reach out if you need help

  11. Hesam on 9th May, 2018 at 12:10 PM

    Thanks Erik for sharing this article. It includes a great history of RCM as well as 9 key principles of modern maintenance (Principles of RCM). It is worth publishing in a good journal. Thanks Erik.

  12. Aviel First on 10th May, 2018 at 12:24 AM

    Thanks Erik,
    You should also mention Predictive Maintenance software, which is becoming popular today.
    see for example http://www.precog.co

    • Erik Hupjé on 10th May, 2018 at 9:04 PM

      Thanks for your comment Aviel. I think any predictive maintenance solution like your Precognize needs to be built on these key principles. But I agree that we are probably on the brink of a new Generation of Maintenance (the 5th depending on how you look at it) that will be heavily influenced by IIoT and AI… but the Big Data approach will just lead to Big Problems if we lose sight of the basics that underpin maintenance.

  13. GP Mishra on 11th May, 2018 at 10:45 AM

    Very nice and informative article. One basic principle of maintenance is not considered here, i.e. believe in your maintenance program and stay focused. It is very common for maintenance programs to be hijacked by the production team, whose priorities are entirely different and are often more influential. This pushes maintenance to divert resources to non-critical and non-important tasks.

  14. Mohammed tawili on 12th May, 2018 at 3:08 AM

    Thanks Erik, this is a really useful history of maintenance. I realized many people are still ignorant about it.

    • Erik Hupjé on 17th May, 2018 at 9:07 PM

      Thank you Mohammed, we need to make a conscious effort to help those around us understand these basic principles of maintenance.

  15. Geert Paul Weeda on 13th May, 2018 at 6:41 PM

    Very interesting article. When configurations are as accurate as possible you gain some more profit. I think (and I am not the only one) that your basics have to be right (accurate), which starts with very accurate configurations.

  16. Deepak Kumar Rathore on 15th May, 2018 at 5:17 PM

    Nicely explained, how to save scarce resources.

  17. Dario on 4th Jun, 2018 at 6:33 AM

    Hi
    I worked mainly in turnaround projects but I would like to know the opinion of maintenance experts on the 5 points below.
    1) Maintenance plans in the CMMS shall be complete workpacks for preventive and predictive tasks (only scheduling, no planning)
    2) Avoid unnecessary or too frequent preventive maintenance tasks. Identify, categorize and focus in the CMMS on SCM (safety critical maint.) and OCM (operation critical maint.)
    3) Schedule efficiently (directly extract the WO tasks to schedule from SAP) and with a sufficient level of detail in work order operation tasks in order to assess progress for each work centre and identify simile
    4) Populate and update the CMMS correctly in order to plan work orders and assure a database for maintenance engineering. Move from OREDA data to company data for RAM and RCM analysis (each company is unique)
    5) Assure competent resources for maintenance planning and execution and have good contracts (e.g. include KPIs, exhaustive SOW, etc.)

    • Erik Hupjé on 6th Jun, 2018 at 7:43 AM

      Hi Dario, thanks for your comment, probably better suited with one of the planning & scheduling articles but that’s ok. I can’t answer all the queries here in a single comment, but for number (1) I agree that most of the work in CMMS should be fully planned, PM, PDM and CM. But even some PM’s can’t be fully scoped. For example in a complex petrochemical plant, you could end up doing major overhauls or inspections (during a shutdown) based on condition assessments which determine the final scope of work. This can never be fully finished in your CMMS and will require the planner to fine-tune the scope of work before the workorder & workpack is completed and issued.

  18. Redouane on 5th Jun, 2018 at 11:02 PM

    very interesting and precious book
    Many thanks MR Erik Hupjé

  19. Narender Kumar on 6th Jun, 2018 at 5:52 AM

    Dear Erik
    A great article. I am very impressed with your write-ups. Since I started following you on LinkedIn, I have agreed with you on almost every occasion. This is one of those occasions. The article is just great; there are no other words to describe it.
    In fact, in my few years of experience, I have faced the problems mentioned above, mainly the last one, i.e. “it will take only 5 minutes more”. Those 5 minutes become 5 hours without anyone noticing.
    People have been taught since their education that PM is a necessity. We buy a car and they ask us to replace the oil every XXXX km, and we do it religiously. We buy an AC and they ask us to get it serviced before and after summer. So the PM mentality makes a home in our minds, and slowly it goes from necessary to a NECESSARY EVIL. In my few years in the field, I have understood that the more you maintain, the greater the chance of equipment failure. Do not unnecessarily stop the equipment just to do a PM. Rather, do PdM to avoid start/stop cycles (the biggest source of stress on equipment, in my opinion).

    Once again, thanks Erik for increasing our knowledge.

    Hope to see you some place

    Regards

    • Erik Hupjé on 6th Jun, 2018 at 7:36 AM

      Thanks! Glad to hear you enjoy the articles and that you recognise the issues based on your own experience. It’s a small world so who knows, we might one day meet indeed!

  20. Volker on 11th Jun, 2018 at 7:47 PM

    Hi Erik,

    I enjoyed reading your summary article.
    Reliability basics are all about so-called “enablers” that build and motivate health.
    We build ever more complexity into our machines and systems, designed by highly competent individuals who look at various operational requirements and/or windows of improvement opportunity. That is a great approach, but the technicians, engineers, operators or even product support staff are mostly not privy to the knowledge and/or skills needed to operate or support the “new” product, and only learn with time. Failure prevention starts on day ONE, NOT once a failure happens.
    This comes down to having not only an effective operational readiness program, but also an enabling “health program” (where all inputs and outputs need to be coordinated) understood at all levels of work, top to bottom.

    • Wilson Mwanza on 12th Jun, 2018 at 8:05 PM

      Thanks for those educational articles, I have really learned a lot.

  21. Rudi Frederix on 26th Jun, 2018 at 8:01 PM

    Erik,

    Very useful article, certainly for young reliability engineers who need this information simply explained. As you stated in your text above, we tend to do way too much maintenance. First define the criticality of your equipment before you define the type of maintenance to perform, or even consider whether it is necessary at all.

    So again a great article.

  22. Mark on 25th Jul, 2018 at 5:51 AM

    Great article Erik. I was interested to hear the history behind it.
    I moved to a company 3.5 years ago and after implementing the techniques mentioned above we have drastically improved reliability, although there is lots more to do.
    Changing the mindset is often the biggest challenge.

  23. Simon Daly on 25th Jul, 2018 at 8:06 AM

    Very useful article, even in my field, which deals with non-mechanical items and their failure. Some of the similarities and terminology were very educational.

  24. Mike Hobbs on 25th Jul, 2018 at 6:14 PM

    Hi, a nice clear read.
    Principle #6 is all about context: maintain for the consequences driven by the context the equipment is placed in. A good example: would you maintain the brakes on a van or an ambulance any differently? Or would you check the starting system on them any differently? Clearly in the first case the safety consequences are the same regardless, whereas in the second the consequences are different and thus may call for different preventative maintenance. In both cases the equipment could be made from identical components.

    • Erik Hupjé on 25th Jul, 2018 at 8:42 PM

      Thanks Mike, indeed Principle 6 is all about operating context. I like your comparison of a normal van vs an ambulance.

  25. Simon Makoni on 26th Jul, 2018 at 1:43 AM

    Thanks Erik, it’s my first time reading your article and I found that some of the principles you mentioned save the company’s scarce resources and minimise downtime. I’m surely going to implement some of these principles.

    Thanks again

  26. Hans Lutsch on 26th Jul, 2018 at 7:37 AM

    Great stuff! It nicely sums up the Art of Maintenance and Reliability. I would be cautious against generalizing with the following quote

    “Similarly, avoid assigning multiple tasks to a single failure mode. It’s wasteful and it makes it hard to determine which task is actually effective. Stick to the rule of a single, effective task per failure mode.”

    After more than thirty years of aviation maintenance and reliability experience I know of plenty of cases where doing more than one maintenance task per failure mode is required to obtain the best operational reliability and life cycle cost for the asset. It all boils down to analyzing the failure mode and its maintenance cost vs the benefits. I have too often seen quotes like these steer management into false assumptions about how this all works.

    • Erik Hupjé on 26th Jul, 2018 at 7:52 AM

      Hi Hans, thanks for your comment and feedback. I’ll tweak the text to make it clear that it should be seen as good practice to stick to one task per failure mode, but that there can be exceptions, especially where the consequences of the failure mode are very significant. It’s a fine balance between keeping things as simple as possible and generalizing too much.

      • Theuns Koekemoer on 4th Aug, 2018 at 9:11 PM

        Erik, Aladon RCM2 and now RCM3 have always catered for this specific situation through the specific logic in the decision diagrams.

  27. IrWCSoh on 28th Jul, 2018 at 7:22 AM

    I focus on the overhaul and repair of industrial rotodynamic machinery as a vendor.
    Many of my customers carry out diagnostic checks and produce fantastic reports. As the overhauler, we can often see a direct connection between the defects and wear we find and the data from these diagnostic checks, and see the weaknesses, “sickness” or “diseases” of each particular piece of equipment. These findings and experiences normally do not reach to those maintaining the equipment or managing the RCM system.
    Recently some of my customers have tapped my knowledge and experience to eliminate chronic issues that they had had for years.
    Many of these failures can actually be traced to vendors (often the cheapest) who focus only on changing parts and take few measurements to check components for imperfections.
    Some equipment failures result from the systems to which the equipment is connected.

    • Erik Hupjé on 28th Jul, 2018 at 8:46 AM

      “These findings and experiences normally do not reach to those maintaining the equipment or managing the RCM system.”

      Indeed a common problem but a clear example of a broken process – what is the point of doing these checks if the people who own the strategy don’t see the outcomes and results of their strategy? Maintenance needs a continuous improvement process and that means closing the loop.

  28. Essam Ghonem on 11th Aug, 2018 at 8:16 PM

    Hello,
    I work in water and sewage stations covering 100 acres and more than 1000 units.
    First of all, a great article. If I may, I would add some additional items:
    1. Preventive maintenance is not a constitution and can be changed as needed
    2. Poor operation is the major problem, along with failure to take action before, during and after operation
    3. Senior management and how it correlates the cost of maintenance with production

  29. Augustine Isodje on 31st Aug, 2018 at 1:33 AM

    Great Article! Worth Reading…

  30. Nigel Maponga on 4th Sep, 2018 at 9:10 PM

    Thank you Erik for this informative article. I am more into Reliability Engineering; the best approach for me has been to carry out an equipment criticality analysis and FMEA, develop maintenance tactics, analyze these to see which failure modes they address and take it from there. Failures will occur, but they can be mitigated. I also have a very useful manual: Rules of Thumb for Maintenance and Reliability Engineers.

    • Erik Hupjé on 5th Sep, 2018 at 7:12 AM

      Hi Nigel, thanks for your comment. Sounds like you have an effective approach in place and indeed “Rules of Thumb for Maintenance and Reliability Engineers” is a great book by Ricky Smith!

  31. Amin on 7th Sep, 2018 at 12:35 AM

    Good article explaining modern concepts of engineering management and the effect of using modern maintenance to address engineering problems.
    Thanks @erik

    Best regards

  32. Jayson Jadraque on 11th Sep, 2018 at 6:31 PM

    Hi Sir Erik,

    Great article. I learned a lot from the history of maintenance up to the present, and I can now say that we are still practicing 2nd generation maintenance. I hope someday I can fully introduce RCM and a proactive approach into our maintenance practices.

    Keep going, you are helping us a lot in the plant maintenance industry

    I remain,

  33. Paul moore on 28th Sep, 2018 at 3:30 AM

    Absolutely fantastic read. We had just started to look at this when our terminal was taken over. I must get my new employer to get back on board.

    Thanks

  34. Manorom Chiewpanich on 29th Sep, 2018 at 5:53 PM

    It’s funny. I have lived with RCM and facilitated RCM workshops for many years. But this is the first time I have learned the history of RCM, and I realized I had been mistaken in thinking that RCM originated from the military. Actually, I also developed RCM software based on FMECA with a quantitative approach, using Weibull to automatically determine the interval and maintenance practice. But I got stuck on the automatic report, so I stopped for a while. Now I have data science people, and I will continue with my RCM as soon as possible.

  35. Markus on 6th Oct, 2018 at 2:30 AM

    Thanks a lot Erik for your research and your article on modern maintenance. I appreciate it and it will be forwarded to my maintenance managers.

  36. Alok Shrivastava on 9th Oct, 2018 at 9:50 PM

    Hi Erik, it is a good article. I have often experienced OEMs using a component on equipment that is designed to fail (shaft diameter, bearing type and size). There can be two theories: either the component failure prevents major damage, or it is meant to increase the sale of components. Since most OEMs do not give specifications, it becomes quite a task to redesign, even knowing full well that the component used is of the wrong design.

  37. Rashed Jafari on 27th Oct, 2018 at 6:18 PM

    Hi Erik,

    Thanks for this concise and applicable article. I wonder if you could tell me how we can optimize the PM tasks, and especially the scheduled main overhaul, on gas turbines.

    BR
    Rashed

  38. Mauricio Cisneros on 9th Nov, 2018 at 1:04 PM

    Yes, for the last four years I was working in an open-pit copper mine. Developing maintenance was very important for improving all maintenance tasks on any machine or equipment, but this is not the end, because we apply the same principles in other segments like civil construction, steel construction and hydraulic systems.

    Working on the basis of a planned maintenance program and using other technical maintenance activities, like predictive maintenance, can be of great help.

    And if we look further, we can use the quality management system to measure and improve maintenance even more.

  39. Askari Syed Hazoor on 14th Nov, 2018 at 7:26 PM

    Please share the references where you have implemented this RCM

    • Erik Hupjé on 15th Nov, 2018 at 6:08 PM

      Thanks for your comment Askari, I have sent you an email.

  40. Jean-Louis Pérée on 16th Nov, 2018 at 2:34 PM

    Hi Erik, I am so glad you released this “walkthrough” of the benefits of RCM. The Q/A you received demonstrates that its use is still limited and that support is still required for it to be understood by more and more individuals. I was and still am a supporter of this methodology, having been in charge of various Maintenance Organizations involved in manufacturing, design and integration. I was even proud to have met John MOUBRAY when we both considered an extended version to be called RCM II. At that time, I was working on a mathematical model called the Reliability Growth Model, which later became MIL-HDBK-189C (SPLAN & SSPLAN). Our intent was to “plug-in” a “simple .xls software” called NRG II. The phonetic version of this name was in fact “to”. Our idea was to promote “RCM II improve Direct Maintenance Cost, Total Cost of Ownership, Dispatch Reliability, … you name it”. But John passed away before we could finalize the idea. John and I were in fact taking advantage of the fact that I was the Director of Maintenance Engineering of BOMBARDIER Aerospace and later on Transportation… Very good memories of success stories for all my colleagues and customers. Please keep on promoting this concept, which protects us against multiple AI initiatives that are progressively transforming us into “simple” actors replacing components because the “AI application” told us to do so. Our “Brain” is our tool; we must preserve it and train it to control our destiny. Jean-Louis PEREE

    • Erik Hupjé on 29th Nov, 2018 at 1:49 PM

      Thank you for the comment Jean-Louis, I totally agree with you that even in a future full of AI and big data we will be still required to understand and apply the fundamentals.

  41. Rio Amor Tiampong on 27th Nov, 2018 at 3:21 PM

    Great. It’s really a blog worth reading. I will surely take this into consideration and use it as a guide in our maintenance program. Thanks Erik.
