Let’s look at a root cause analysis example for a pump. Rather than focusing on the process of the analysis, we focus on the data that is collected during this process. The pump in question is part of a water pumping system used to draw water from a reservoir at a hydraulic power generation plant. Work orders for two failures were recently created and closed against this pump, and it was decided to perform a root cause analysis for both.
For the first failure it was observed that the pump stopped pumping completely.
In the second failure the pump was still pumping, but with far less output, and it was also leaking badly.
First Failure – Strainer Missing
Operators entered the work order with a description indicating the pump stopped pumping and entered ‘Stopped’ in the Symptom field. Maintenance found that the power had tripped, but resetting it did not fix the issue: it tripped again. They opened the pump and removed a tree branch that had become lodged in it. After that the pump system was reset and restarted, and it is now back in operation.
For this example we did not consider the divers needed for this specific job.
In the root cause analysis started for this failure, the following data could be entered for the Problem Description and the 5 whys:
-
Problem Description: Not pumping any water.
-
Why 1: Pump not running.
-
Why 2: The pump high-current overload protection tripped out the pump motor.
-
Why 3: A large branch in the water entered the system and got stuck in the pump case to jam the impeller.
-
Why 4: The water strainer screen(s) that should prevent trash and other foreign matter from entering the system was not installed.
-
Why 5: Pump suction strainer screen(s) forgotten after previous work performed.
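The chain above can also be viewed as a small data structure. The following is a minimal Python sketch, not part of the EAM software: the class name `FiveWhys` and its methods are hypothetical. It holds the problem description and up to five answers, and it assumes that the last why recorded is the root cause.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FiveWhys:
    """Hypothetical container for a problem description and its chain of whys."""
    problem: str
    whys: List[str] = field(default_factory=list)

    def add_why(self, answer: str) -> None:
        # The chain is capped at five answers; stopping earlier is allowed.
        if len(self.whys) >= 5:
            raise ValueError("a 5 whys chain holds at most five answers")
        self.whys.append(answer)

    def root_cause(self) -> str:
        # Assumption: the last why recorded is treated as the root cause.
        if not self.whys:
            raise ValueError("no whys recorded yet")
        return self.whys[-1]


analysis = FiveWhys(problem="Not pumping any water.")
for answer in [
    "Pump not running.",
    "The pump high-current overload protection tripped out the pump motor.",
    "A large branch in the water entered the system and got stuck in the pump case to jam the impeller.",
    "The water strainer screen(s) that should prevent trash and other foreign matter from entering the system was not installed.",
    "Pump suction strainer screen(s) forgotten after previous work performed.",
]:
    analysis.add_why(answer)

print(analysis.root_cause())
```

Capping the list at five but allowing fewer mirrors how the technique is used in practice: you stop asking why as soon as an actionable root cause is reached.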
After the 5 whys it should be easy to fill in What Happened. For example: This failure is due to human error, as personnel forgot to install the pump suction strainer(s). This will likely damage the pump internals and stop the pumping system operation, causing lost time and production. It will also result in a tricky and dangerous (unsafe) repair job, partly in a confined space. It may also require contracting underwater divers and special rigging/rental equipment at a significant and unexpected expense.
And a high-level Solution, for example: Consider fitting chains to the pump strainers so they always remain attached to the structure. Additionally, schedule refresher training for pump maintenance personnel on replacing strainers after pump work.
And depending on how you use work order closing codes you could end up with the following set of information:
-
Symptom: Stopped
-
Problem Code: Not applicable
-
Failure Code: No output
-
Cause Code: External influence/blockage/plugged
-
Action Code: Serviced
-
Tactical Cause: Human Factor
-
Human Factor: Workmanship
-
Workmanship: Assembly error
-
Human Oversight: Procedure Inaccurate
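To make the shape of this code set concrete, here is a hedged Python sketch of the closing codes as a single record. The `ClosingCodes` class, its field names, and the `applicable_codes` helper are hypothetical and do not reflect how any particular EAM product stores these values; only the code values themselves come from the list above.

```python
from dataclasses import dataclass, asdict
from typing import Dict

NOT_APPLICABLE = "Not applicable"


@dataclass(frozen=True)
class ClosingCodes:
    """Hypothetical record of the closing codes entered on a work order."""
    symptom: str
    problem_code: str
    failure_code: str
    cause_code: str
    action_code: str
    tactical_cause: str
    human_factor: str
    workmanship: str
    human_oversight: str


def applicable_codes(codes: ClosingCodes) -> Dict[str, str]:
    # Keep only the fields that carry a real code, skipping 'Not applicable'.
    return {
        name: value
        for name, value in asdict(codes).items()
        if value != NOT_APPLICABLE
    }


strainer_missing = ClosingCodes(
    symptom="Stopped",
    problem_code=NOT_APPLICABLE,
    failure_code="No output",
    cause_code="External influence/blockage/plugged",
    action_code="Serviced",
    tactical_cause="Human Factor",
    human_factor="Workmanship",
    workmanship="Assembly error",
    human_oversight="Procedure Inaccurate",
)

for name, value in applicable_codes(strainer_missing).items():
    print(f"{name}: {value}")
```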
Since there is only one root cause there is no need to enter data on the Root Causes tab. You could leave it at this and close the case or you can use the advanced features and add this information to the RCM data you have already collected for this pump. For that purpose, on the Failure Modes tab, you could add the following:
-
Function: To transfer at least 10,000 GPM of water from Reservoir 1 to by-pass A.
-
Functional Failure: Unable to transfer water from Reservoir 1 to by-pass A at all.
-
Failure Mode: Pump suction strainer screen(s) forgotten after intervention
-
Noun: Pump suction strainer screen(s)
-
Verb: forgotten
-
Complement: after intervention
-
Effect Indicator: Not applicable
-
Effect Consequence: As a result of personnel forgetting to install the pump suction strainer(s), trash and other foreign debris will damage the pump internals, stop the pumping system operation causing lost time and no water being transferred to the reservoir by-pass spillway. The reservoir water levels will rise and, if no backup water pumping out of the reservoir can be arranged, the reservoir will spill over its banks onto the native reservation located below the reservoir valley. Should this occur, the result ranges from nuisance flooding of native lands to loss of wildlife or even loss of human life, depending on how much overflow occurs. This will also result in a tricky and dangerous (unsafe) repair job partly in a confined space. It may also require contracting underwater divers and special rigging/rental equipment at a significant and unexpected expense.
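The Failure Mode text above is simply the Noun, Verb, and Complement joined together. A minimal Python sketch of that composition follows; the function name `compose_failure_mode` is hypothetical, and treating a blank complement as "omit it" is an assumption made for illustration.

```python
def compose_failure_mode(noun: str, verb: str, complement: str = "") -> str:
    """Join noun, verb, and optional complement into a failure mode description."""
    parts = [noun.strip(), verb.strip(), complement.strip()]
    # Assumption: empty parts (for example a blank complement) are left out.
    return " ".join(part for part in parts if part)


failure_mode = compose_failure_mode(
    noun="Pump suction strainer screen(s)",
    verb="forgotten",
    complement="after intervention",
)
print(failure_mode)
```

Keeping the three parts separate makes it easier to search and group failure modes by component (noun) or by mechanism (verb) later on.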
And as part of the RCM data, you can also add mitigation strategies against the defined failure modes on the Failure Mitigation tab of the case. Mitigation strategies could be:
-
Consider fitting chains to the pump strainers so they always remain attached to the pump casing.
-
Schedule refresher training for pump maintenance personnel to replace strainers after pump work.
-
Review that the pump work SOP includes replacement steps for strainers after pump work is done.
The selected Mitigation Type for all these could be ‘Redesign’.
Second Failure – Seal Abraded
Operators entered the work order with a description indicating the pump stopped pumping and entered ‘Stopped’ in the Symptom field. They added that there was a lot of water around the pump. Maintenance found that the pump vibration sensor had tripped the power to the motor. After they reset it, the motor started, and they observed a very noisy, shaking, leaking pump that barely pumped any water. The pump was cavitating. They observed that the leak was at the shaft. They switched off the system and replaced the pump seal. After that the pump system was reset and restarted, and it is now back in operation.
In the root cause analysis started for this failure, the following data could be entered for the Problem Description and the 5 whys:
-
Problem Description: Leaking badly and not pumping enough water. Pump is cavitating.
-
Why 1: The pump high-vibration sensor tripped out the pump motor as the pump had low output, and was shaking, noisy, and vibrating (i.e. cavitating).
-
Why 2: Pump seal has failed. The pump was leaking water, creating an environmental breach, and water was not being transferred downstream.
-
Why 3: Pump seal is worn. The leak was observed where the water pump seal meets the shaft.
-
Why 4: Pump seal abraded by poor quality water containing solid particles and other chemicals that form abrasive matter.
After the 4 whys (note that the fifth why was not needed in this case) it should be easy to fill in What Happened. For example: Over a period of time, regular pump operation with poor quality water containing solid particles and other chemicals that form abrasive matter wears down the surface material of the seal. This eventually causes the seal to leak when the seal material gets too thin. Continued use makes the leak progress from shaft wetness to water dripping out, followed by water streaming or gushing out of the pump seal area.
And a high-level Solution, for example: Consider inspecting the seal on a regular basis.
And depending on how you use work order closing codes you could end up with the following set of information:
-
Symptom: Leaking
-
Problem Code: Pump unit - Seals
-
Failure Code: External leakage - process medium
-
Cause Code: Material failure / erosion
-
Action Code: Replaced
-
Tactical Cause: Normal Wear
-
Human Factor: O&M Procedure Incorrect
-
Workmanship: Not applicable
-
Human Oversight: Procedure Inaccurate
Since there is only one root cause there is no need to enter data on the Root Causes tab. You can also add this information to the RCM data for this pump. For that purpose, on the Failure Modes tab, you could add the following:
-
Function: Since you have already defined a function for this pump (see the example above where the strainer was missing), there is no need to create it again. Instead of filling in Function, enter Existing Function and select: To transfer at least 10,000 GPM of water from Reservoir 1 to by-pass A.
-
Functional Failure: The same is true for the functional failure, which was also defined previously and can be reused here. Leave Functional Failure blank and instead fill in Existing Functional Failure: Unable to transfer water from Reservoir 1 to by-pass A at all.
-
Failure Mode: Pump seal abraded
-
Noun: Pump seal
-
Verb: abraded
-
Complement: <blank>
-
Effect Indicator: Over a period of time, regular pump operation with poor quality water containing solid particles and other chemicals that form abrasive matter wears down the surface material of the seal. This eventually causes the seal to leak when it gets too thin. Continued use makes the leak progress from shaft wetness to water dripping out, followed by water streaming or gushing out of the pump seal area. For example, use the following codes to identify the current situation:
-
N = water pump seals no wetness or leaks
-
W = water pump seals wet or seeping
-
A = water pump seals dripping
-
C = water pump seals streaming, gushing, or spraying water
-
Effect Consequence: This will lead to no water being transferred to the reservoir by-pass spillway. The reservoir water levels will rise and, if no backup water pumping out of the reservoir can be arranged, the reservoir will eventually spill over its banks onto the native reservation located below the reservoir valley. Should this occur, the result ranges from nuisance flooding of native lands to loss of wildlife or even loss of human life, depending on how much overflow occurs. If not acted upon before the pump shuts down, this may also result in a tricky and dangerous (unsafe) repair job partly in a confined space. It may also require contracting underwater divers and special rigging/rental equipment at a significant and unexpected expense.
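The N/W/A/C codes listed under Effect Indicator form an ordered scale, which a short Python sketch can make explicit. The names `SealCondition`, `worst_condition`, and `needs_work_order` are hypothetical. The default threshold of W follows the seal inspection task in this example, which schedules a work order when seepage is found; treat it as an assumption you would adjust to your own procedures.

```python
from enum import IntEnum
from typing import Iterable


class SealCondition(IntEnum):
    """Condition codes from the Effect Indicator, ordered by severity."""
    N = 0  # no wetness or leaks
    W = 1  # wet or seeping
    A = 2  # dripping
    C = 3  # streaming, gushing, or spraying water


def worst_condition(readings: Iterable[SealCondition]) -> SealCondition:
    # With several seals inspected, the most severe reading drives the decision.
    return max(readings)


def needs_work_order(
    readings: Iterable[SealCondition],
    threshold: SealCondition = SealCondition.W,  # assumed trigger level
) -> bool:
    return worst_condition(readings) >= threshold


# Example: two seals inspected, one dry and one seeping.
inspection = [SealCondition["N"], SealCondition["W"]]
print(worst_condition(inspection).name, needs_work_order(inspection))
```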
And as part of the RCM data, you can also add mitigation strategies against the defined failure modes on the Failure Mitigation tab of the case. Mitigation strategies could be:
-
Visually inspect water pump seals (qty 2) as per Maintenance Procedure MP 16.1a. When water seepage is found, schedule a Work Order to follow Maintenance Procedure MP 16.1b to replace water pump seals (qty 2).
-
N = water pump seals no wetness or leaks
-
W = water pump seals wet or seeping
-
A = water pump seals dripping
-
C = water pump seals streaming, gushing, or spraying water
Done by: Mechanic
(P-F)/2 Interval: 6 months
-
Consider adding a water flow sensor to the pump so that it can alert you when forward water flow is less than expected. In a well-designed system the water transfer pump should not trip out abruptly and cause unscheduled downtime. Instead, an alarm should be raised before the pump shuts down completely, so that you can plan a scheduled outage to replace the seals at your convenience. Besides avoiding the serious consequences, this may also prevent secondary damage to the pump that occurs as the seal damage progresses, such as during the lead-up to pump cavitation.
-
Schedule refresher training for pump operators and maintenance personnel to understand how to operate and maintain the system after adding the flow sensor to the pump.
-
Review the pump operating and maintenance procedures and make sure they include the new steps that account for the water flow sensors on the pump once this work is done. Note that this must cover any new failure modes that the flow sensors may introduce, possibly in different functions (e.g. leaking -> environmental) and functional failures (e.g. low-flow -> partial), which were not originally considered because the sensors were not there.
The selected Mitigation Type for the PM inspection could be ‘PM - On-condition task’ and for the remaining three it could be ‘Redesign’.
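The seal inspection task above lists a (P-F)/2 interval of 6 months. Assuming this means the inspection interval is set to half of the P-F interval (the time between a potential failure becoming detectable and the functional failure), the arithmetic can be sketched as follows. The function names are hypothetical, and the 12-month P-F interval is derived from that assumption, not stated in the example.

```python
def task_interval_months(pf_interval_months: float) -> float:
    """Inspection interval under the (P-F)/2 rule."""
    if pf_interval_months <= 0:
        raise ValueError("P-F interval must be positive")
    return pf_interval_months / 2


def implied_pf_interval_months(interval_months: float) -> float:
    """P-F interval implied by an inspection interval set to (P-F)/2."""
    if interval_months <= 0:
        raise ValueError("inspection interval must be positive")
    return interval_months * 2


# A 6-month inspection interval implies a 12-month P-F interval.
pf_months = implied_pf_interval_months(6)
print(pf_months, task_interval_months(pf_months))
```

Inspecting at half the P-F interval means the seepage stage should be caught at least once, with roughly half the interval left to plan the seal replacement before functional failure.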
The root cause analysis and RCM data of the two examples above can now easily be copied to the RCM tab and the RCM Failure Mitigation tab of selected equipment and RCM Templates. Use the Corrective Action tab on the Case Management screen, select your target equipment and templates, and then click Apply Corrective Action. Remember that you may have to do some mapping before you click that button if the target equipment or template already has data on the RCM tab.
Also consider using one case for both failures. This may not be the obvious choice for these examples, but if you find multiple root causes and failure modes, you can enter as many root causes as you found on the Root Causes tab, and as many records as required on the Failure Modes tab and Failure Mitigation tab. You could even use the case to build the complete RCM and mitigation data for a new piece of equipment or any equipment you would like to include in your RCM Strategy.
All suggested codes in these two examples are examples themselves. The actual codes used vary widely and depend on your configuration and implementation of the EAM software.