Mastering Root Cause Analysis with XMPro: Capture, Value, Impact

Unlock the full potential of Root Cause Analysis in our in-depth webinar with Nicole Scheinbach, Engineering Consultant at XMPro. This session is a treasure trove for professionals eager to streamline their RCA processes using XMPro's sophisticated application. 📊 Nicole takes you through the intricacies of RCA, teaching you to not only capture recommendations but also to assess their impact and value comprehensively.

Imagine having a centralized system where every recommendation is immediately accessible, leading to quick, informed decisions that translate into measurable financial results. XMPro's application blueprints exemplify the platform's capability to transform insights into action. These blueprints are customizable, ensuring they fit a wide array of business needs, from those new to RCA to those seeking to enhance their existing condition monitoring applications. Ready to enhance your RCA proficiency? Join us to navigate the nuances of XMPro's RCA application and discover how to make it your own.

🔗 Blueprint Download: This RCA blueprint will soon be available for download on our blueprints, accelerators, and patterns page. Stay tuned! https://xmpro.github.io/Blueprints-Accelerators-Patterns/

🕒 Key Webinar Segments:
[2:16] What is Root Cause Analysis (RCA)?
[6:00] Exploring the Benefits of Conducting an RCA
[11:20] Introducing an XMPro-based Root Cause Analysis Application
[28:55] A Guide to XMPro's Blueprints, Accelerators & Patterns

Don't forget to like, share, and subscribe to our channel for more insightful webinars and tutorials on leveraging XMPro for your business success!

#XMPro #RootCauseAnalysis #BusinessProcessManagement #RCAApplication #BlueprintsAndPatterns #CustomizableSolutions #EngineeringExcellence #Webinar #XMProTutorial

Transcript

hello everybody and welcome to our last

webinar for 2023 um today I have the uh

pleasure of Nicole's company who's going

to run us through root cause analysis

application uh going through some

terminology um and then jumping into

some of the XMPro-specific uh

pieces can you drop to the next slide

please

ni you can just pull the whole slide um

some of the areas that we're going to

cover is um she's going to run you

through what is a root cause analysis

some types some benefits um an XMPro

blueprint um around root cause analysis

and then we're going to touch on at the

end there just our general blueprints

accelerators and patterns as well um

next slide please so it's my great

pleasure to introduce to you our one of

our resident Engineers uh Nicole so

Nicole if you could just give everyone a

brief introduction and the floor is

yours sure thank you uh so my name is

Nicole Scheinbach uh my background is in

mechanical engineering so a little bit

just about my previous roles uh before

being an engagement lead at XMPro I was

a reliability engineer in a polymer

facility uh primary responsibilities

included PM improvements PM reviews uh

equipment upgrades or modifications

based off process changes as well as uh

root cause

analysis uh I also was a remote process

engineer that specialized in asset

condition monitoring this was primarily

through the use of iot sensors that

track vibration across um personally I

was uh over 15 different building

products and paper mill facilities trying

to prevent unplanned downtime on their

equipment my current role is an

engagement lead I've had a number of

clients um including we nutrient and so

primary

focus of um to create use cases to

basically capture and codify knowledge

you know to ensure it's not lost

especially when people retire as well as

to enhance and streamline current

workflows to solve any sort of problems

um that clients may

have so some basic terminology alignment

so first of all what is a root cause

analysis most people are familiar with

this term but uh just to clarify so root

cause analysis is the pro uh sorry root

cause analysis aims to identify the

causes of a problem in order to identify

actions to help solve the issues so um

at a high level right you want to you

want to be identifying the actual causes

of a failure not necessarily the

symptoms as shown on the diagram on the

left right the symptoms are

typically uh what you you know you

actually see i.e. a pump bearing has

locked up but why has the pump bearing

locked up and so through the root cause

analysis process the aim is to address

these causes and not necessarily just

the

symptoms uh next going into the

different types of root cause

analysis so there are many many types of

root cause analysis um we're going to

focus primarily on the ones that are of

the you know cause and effect type

analysis I listed here three that are

quite popular the first is the 5 Whys

approach this is what our solution is

loosely based off of um and we'll go

into that once I launch into the demo uh

at a high level 5 Whys is an iterative

technique to explore the causes and

effects underlying uh a certain

issue so usually people describe this as

when a child basically asks you know why

something has happened and they

continually ask why eventually they give

up or they get the answer they want um

in the same fashion right if you keep

asking why you're going to get more and

more into the detailed uh

the details of a problem until you

actually reach the root

cause um next we have the fishbone

diagram also called the Ishikawa diagram

so this is a visual method to organize

the cause and effect relationship into

categories there is a little fish on

that diagram to the right that is

typically what the structure does look

like the head is typically the actual um

problem and then the associated bones

are the different categories of uh

causes so this is used across multiple

Industries and there's different uh

mnemonics that people typically use for

the categories one common one is the 6M

for manufacturing there's also I believe

a 4M there's different um there's

different M categories based on how many

categories you want to go through the 6M

is basically manpower method machine

materials mother nature and measurements
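
As a rough illustration of the 6M grouping just described, the sketch below files brainstormed causes under the six category "bones" of a fishbone diagram. The function name `build_fishbone` and the sample causes are hypothetical and chosen for illustration only; this is not how the XMPro blueprint stores its data.

```python
# The 6M categories as listed in the talk.
SIX_M = ("Manpower", "Method", "Machine", "Materials",
         "Mother Nature", "Measurements")


def build_fishbone(problem, causes):
    """Group (category, cause) pairs under the 6M bones of a fishbone diagram."""
    bones = {category: [] for category in SIX_M}
    for category, cause in causes:
        if category not in bones:
            raise ValueError(f"Unknown 6M category: {category}")
        bones[category].append(cause)
    # The "head" of the fish is the problem; the bones hold the causes.
    return {"problem": problem, "bones": bones}


# Hypothetical brainstormed causes, for illustration only.
diagram = build_fishbone(
    "Pump bearing locked up",
    [
        ("Method", "Lubrication PM skipped"),
        ("Measurements", "Vibration route only every few months"),
        ("Machine", "Rebuilt assembly installed"),
    ],
)

for category, items in diagram["bones"].items():
    print(f"{category}: {', '.join(items) if items else '-'}")
```

Starting every diagram with all six bones present, even the empty ones, is a deliberate choice here: an empty category prompts the team to ask whether anything was overlooked.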

so that's a good starting point when

your team wants to start putting

Associated causes underneath a larger

category um finally there is the Pareto

chart so the Pareto chart aims to identify

the frequency and impact of a problem it

sometimes follows the 80/20 rule where

most um most problems 80% of the

problems are caused by 20% of the causes
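
To make the 80/20 idea concrete, here is a minimal sketch that ranks causes by downtime and computes the cumulative share that a Pareto chart plots. The downtime figures and the helper names `pareto_table` and `vital_few` are assumptions made up for this example, not data from the webinar.

```python
def pareto_table(downtime_by_cause):
    """Rank causes by impact and attach the cumulative share of the total."""
    total = sum(downtime_by_cause.values())
    if total <= 0:
        raise ValueError("Total downtime must be positive")
    ranked = sorted(downtime_by_cause.items(),
                    key=lambda item: item[1], reverse=True)
    rows = []
    running = 0.0
    for cause, hours in ranked:
        running += hours
        rows.append((cause, hours, 100.0 * running / total))
    return rows


def vital_few(rows, threshold_pct=80.0):
    """Return the leftmost causes needed to reach the threshold share."""
    selected = []
    for cause, _hours, cumulative_pct in rows:
        selected.append(cause)
        if cumulative_pct >= threshold_pct:
            break
    return selected


# Hypothetical yearly downtime hours per cause, for illustration only.
downtime = {
    "Bearing failure": 120,
    "Seal leak": 40,
    "Motor trip": 25,
    "Coupling misalignment": 10,
    "Operator error": 5,
}

rows = pareto_table(downtime)
for cause, hours, cumulative_pct in rows:
    print(f"{cause:<24}{hours:>5} h  {cumulative_pct:5.1f} %")
print("Focus on:", vital_few(rows))
```

With these sample numbers the first two of five causes account for 80% of the downtime, which is the kind of split the rule of thumb describes; real data will not always divide so neatly.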

and so in a Pareto chart you know you're

looking at the leftmost of the chart

basically which is you know what uh what

issues are happening at a high

frequency as well as causing the most

downtime so typically uh my experience

with this is that at the end of the year

it's a good look back of a cumulative

impact my reliability engineer would

usually sit down with our team right go

through basically look this is the

frequency and impact of some issues and

you know try to assign work accordingly

for the next year to address

those next uh benefits of doing a root

cause analysis so why would you want to

do this right the main I guess the main

benefit to doing this is to prevent

reoccurrence of issues in the future

right the whole goal is basically X Y

and Z happen causing a failure right so

you're wanting to create action items to

resolve the root cause of a specific

failure versus the symptoms again

referencing the tree right the symptoms

are what you visually see or visually

happens and the root cause you know

that's what you need to identify

and

resolve additionally um improved team

communication so you know working with

your colleagues there's multiple

disciplines there's you know a cross

functional team that you know provides

support for the whole process right so

you need to ensure first of all that the

full picture is captured right so you

want to involve as many relevant people

as possible in your RCA process so

everything that is relevant is captured

understood and addressed via team

collaboration you also want

everyone's you know um input to ensure

everyone is on the same page and agrees

what the steps are to ensure this

problem does not happen again right

everyone needs to agree finally in terms

of documentation right this

documentation is important you first

want to you know validate that you have

done your due diligence you've captured

all the associated data and evidence uh

as well as Associated action items to

you know close the loop and ensure this

doesn't happen again and this is also a

great way to um share across your

organization so if you're in

manufacturing typically you're going to

have you know Associated sister sites

that are doing a similar process to you

which might have the identical equipment

so ideally you're wanting to share your

experience with them obviously not a

great experience but to ensure that this

doesn't happen to them again right or

doesn't happen to them in the same way

that the failure has happened at your

site now in terms of the right diagram

so this is sort of how the different um

steps and uh pieces that you need to

fully complete an RCA uh the first is

the problem identification so basically

what has happened um and how you're

going to you know capture that so you

know X Y and Z failed at this time and

this impact the next is the data

collection portion so this is an

important portion where you know again

utilizing your cross functional team you

want to be gathering all the necessary

data so you know PM plans operations routes

process data anything of relevance to

your failure you need to capture in a

timeline so you can see you know

potentially when the actual issue

cropped up next cause mapping and

identifying the root cause right you're

actually performing the the root cause

analysis and identifying what the issue

is finally closing the loop you need to

create actions that are addressing the

root cause or causes and ensure that

they are implemented so this issue does

not happen

again so how does this process integrate

into our existing XMPro process so

typically um you know depending on the

client there may be a very uh very

specific use case or problem that they

want to address or it might be something

more broad for example reducing

unplanned downtime right that's quite a

broad statement and that's you know

something obviously very common amongst um

sites you want to ensure that um we're

trying to identify the right items to

address this issue so typically we go

through this process of we first

identify you know the Bad actors right

you can do that eventually via Pareto

chart um you want to identify ones that

are you know either frequently failing

or causing a massive impact on

production um operations next you go

into the failure modes right what is

actually causing this bad actor to

fail you need to identify that to

properly address it um and then you know

coming into the root causes so again uh

identifying the root causes is the most

important part because if you don't

identify the proper root causes you're

not going to be addressing the correct

problem now items to the right basically

you know is the rest of our process here

you know we we identify now that we know

the root causes any of the leading

indicators what data sources we need to

integrate with and then you know

Associated recommendations with that so

um today though we are going to be

focusing on uh the root cause review so

we wanted to bring this uh blueprint

essentially to make available to people

because uh basically we have you know

implemented solutions for clients we now

are enabling you know clients and you

know even new clients right to utilize

our um root cause analysis application

to identify you know what kind of root

causes are creating potential you

know downtime availability losses

anything like that you can utilize our

platform to create solutions to address

these

issues the XMPro-based root cause

analysis

application so now I'm going to take you

through the

actual um the actual demo uh and you can

see kind of what we've provided in terms

of a a

blueprint second I

bring all right so we have here um the

demo so I'll quickly go through the

basic pieces of this landing page so

you'll land here the first part is uh

the left the number of failures per asset

type for the last 12 months so

this is based off um ISO 14224 which is

actually the structure that we've

utilized uh to create our um our

variation of the

5 Whys uh the ISO code basically goes

over how to capture data in a quote

unquote like reliability format you know

to ensure when you're doing an analysis

you know later in the year everything's

captured into you know appropriate

categories so you can analyze in the

future uh this is is used across

multiple uh

manufacturing uh facilities as well as

you know different equipment types so

here for our sample right we have a

centrifugal pumps we have the broad

categories for failure mechanisms as

electrical failures external influences

material failures and mechanical

failures now coming on to the right card

action items due soon so this is at a

high level everything that you or all

the rcas that you've created

all the actions that are due so you know

this is great if you need to look up you

know potentially one that you have been

assigned to you know double checking

which ones you have and when they're due

or at a high level perhaps you know

maintenance manager looking at all the

associated action items that need to be

due and you doing any sort of necessary

followup finally we get to the bottom

card here the all root cause analysis card

so this uh this card allows you to

actually go through and um look back at

your existing root cause analyses

look at any pending action items just

look at any of the timelines anything

like that once they're completed they

will be stored here so again for

documentation purposes you can reference

them in the

future so right now we're going to go

through and create uh a new RCA so you

can just see the general

process so first uh as mentioned in the

PowerPoint the failure details so we

just want to capture at a high level first

of all what has happened and the

associated Financial impact right that's

the most important thing um you know you

typically do root cause analysis for you

know extremely high impact things you

know that you need to address um so for

this demo we have uh centrifugal pumps

uh as the as the asset type now going

forward you can add any Associated asset

types that you want so if you've got

Heat exchangers fans anything of that

nature you can add a structure in there

uh to add them just to your um RCA

application I'm going to go ahead and

just copy and paste some of this data in

so you don't have to watch me uh watch

me type here so we have an asset ID

equipment ID and what basically happened

so there was a pump and it was shut down

due to a high overall vibration in the

DE bearing so this happened uh the

beginning of the month we're now trying

to evaluate it while everything is fresh

facility so uh this client is based out

of this fictional client is based out of

Texas and they have basically two areas

um of their facility in terms of safety

impact there was no safety impact a bit

of operational impact and a large uh

larger production impact

here so after completion of all these

fields it will automatically sum up

double check all your items here and you

can click save and continue oh apologies

I did not add a zero there there is some

validation on these fields these fields

are required so you do need to fill them

all in this is all necessary

information going on to our next phase

um timeline so this is the data you know

collection portion that's really you

know vital to your uh your cause mapping

right you need to identify all the

associated events that could have uh

that could have culminated in

your failure now if you also notice up

here there are associated breadcrumbs

this provides additional navigation

between the pages as well as let I mean

let your team know essentially that this

is how many parts you still have to do

to complete your

RCA so coming back to um you have your

cross-functional team basically available

and they're digging through and they've

noticed

that uh way back in June right we we

installed a new

assembly um and this was part of normal um

normal maintenance there was an overhaul

and we just uh we installed a new

rebuilt assembly

here um digging through your cmms

records you notice that

unfortunately

um unfortunately here there was a

failure and it was all hands on deck the

failure of a fan and unfortunately a

scheduled PM was not completed so this

was for lubrication of that uh of that

pump uh now a couple days later there is

also a scheduled uh vibration route

that's done um it does not pick up any

sort of anomalous uh overall vibration

yet right maybe you know the bearing is

still okay at this point now

unfortunately

these these um lubrication PMs and vibe

routes they only happen every couple of

months right so everything looks to be

okay until an operator basically comes

up um and he's doing his normal routes

and he can he can hear something wrong

with this bearing um at that point it's

too late right uh your your bearing's

probably your bearing's probably done

what he does is he he tells his manager

um and his manager basically calls up um

the reliability and maintenance team

and they take another reading um

basically before the scheduled reading

basically on the day that it's taken

down the vibe tech comes and says look this

is a stage four bearing failure at this

point you need to shut it down I have no

idea when it's going to fail and we

don't want to just have a random

unplanned uh downtime in the middle of

the night when there's no

support so um your cross-functional team

has basically gone through and put

together the series of events uh they

think you know this is good enough but

we think we have an idea now of what

could be what would have caused this

issue now we still need to capture

everyone that has participated so you

know obviously for documentation

purposes you want to capture everyone

that um is part of this first of all we

have uh Bob Costa Bob Costa is a process

engineer we also want to

cap

well um you want to capture this for

documentation purposes but you also want

to ensure um that everyone here is

captured because when

you uh assign the action items uh you

can only assign the action items for

people that were captured here so again

you want to make sure everyone is

captured here so we next have Jill Smith

she's a reliability engineer she also

works at

company last but not least we have uh

Max berson he is the maintenance team

lead so he has provided his input into

this RCA and he is

Robson

comp.com okay so now we have our

participants now we have our timelines

um you know we want to go ahead and uh

save and

continue um as we go to this part

someone says oh you know I think I need

to revise part of my timeline okay so we

go back uh via the breadcrumbs here and

he says you know I want to make it clear

that my operator he informed me ASAP and

we tried to get this done as as soon as

possible so we want to add a an

additional note here it says um you know

operator

notified manager

immediately

well

okay so uh there is a couple additional

functions here one of them is the save

button so you can see the save button is

to the top right of each card if you do

need to make modifications you can go go

ahead and do so after a certain point in

the RCA you'll no longer be able to make

modifications right for documentation

purposes people can't just come back and

continually make modifications but at

this point right you haven't done the cause

map you can make modifications you can

also delete and I don't necessarily want

to delete here but if you click here and

click delete you can you know delete

anything maybe um you're doing a

revision with your team and you decide

oh this event actually didn't happen or

you know potentially we need to shift

around some things you can go ahead and

delete and upload you know the necessary

information but we're going to go ahead

and save and

continue but coming onto the failure

analysis part right so this is the most

important part right ensuring that you

you capture the correct failure analysis

as well as you know identifying the root

cause so you can have corrective actions

to take so what failure mechanism so

again uh utilizing

14224 the ISO code right there are high-level

buckets that we want to place

again for documentation purposes in the

future so this uh overall vibration you

know causing uh bearing failure that

would typically be considered a

mechanical

failure what kind of mechanical

failure was this well you know it was

related to specifically vibration and

why did this happen so what caused um

you know what did the vibration do

essentially to kill the pump and in this

case uh it created a bearing failure and

um the bearing failure was eventually

going to um freeze up the pump and the

pump was going to stop stop rotating so

after this part you're you're saying

okay so the failure mechanism which is

you know at the highest level what you

you you visibly see is that the bearing

failure due to vibration caused the pump

to fail now what kind of uh comments or

additional information can you

provide you go through and look at um

your system maybe your Vibe system and

you know you analyze that and analyze

that data and you find that um you know

like in your timeline that these these

bearing uh readings indicated a stage

four failure and then um there was

indication of BPFO so basically at this

point your vibe tech has recommended

please you know shut down again we don't

want unplanned um

outages again this is kind of like the

higher level what you actually see now

we come down to the actual um failure

causes right so you know the bearing you

know um unless there was a manufacturing

defect right the bearing just doesn't

fail by itself it has it has some help

here right so in terms of um what we've

dug in through the timeline basically it

looks like a PM was missed and um don't

know if it's like within the system if

there's some way to you know ensure the

PM is done but essentially the PM is

missed right so that is kind of a failure

of like the management the workflow

system right something something is not

aligning this is a critical piece of

equipment and when the PM is missed we

want to ensure that it is done again

right um we can't just be missing

PMs so in terms of that um it's it's sort

of a you know CMMS potentially or

documentation error right basically or

you know potentially a management error

depending on you know which one

your team um goes for basically so um

you know the management of the PMs needs

to be re-evaluated you need to look at

basically uh how we can ensure that

critical PMs are are completed or

rescheduled you know um if the PM you know

potentially was done the day after this

may not have been an issue right so um

after you've evaluated with your team

you leave some comments

basically um and apologies um so the

definitely because of this uh unplanned

failure of another piece of equipment all

hands were on deck this PM was missed

and unfortunately because this PM only

happens uh not on a high frequency uh it

was not known until basically the

bearing was in a stage four failure

that was going to um that was going to

cause an unexpected

failure so this is your uh cause map

once you're happy with this you can go

to save and continue
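
The cause map built in this part of the demo can be sketched as a small record: a high-level failure mechanism category, a more specific mechanism, its effect, and then a 5 Whys style chain of answers ending at the root cause. The class name `CauseMap` and its fields are hypothetical, the category list is taken from the four shown on the demo landing page, and the "why" answers paraphrase the timeline. Treat it as one possible reading of the structure, not the blueprint's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Broad failure-mechanism categories shown on the demo landing page.
MECHANISM_CATEGORIES = (
    "Electrical failure",
    "External influence",
    "Material failure",
    "Mechanical failure",
)


@dataclass
class CauseMap:
    mechanism_category: str   # high-level bucket, e.g. "Mechanical failure"
    mechanism_detail: str     # what kind of failure within that bucket
    effect: str               # what the mechanism did to the equipment
    whys: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.mechanism_category not in MECHANISM_CATEGORIES:
            raise ValueError(
                f"Unknown failure mechanism: {self.mechanism_category}")

    def ask_why(self, answer: str) -> "CauseMap":
        """Record the answer to the next 'why?' and allow chaining."""
        self.whys.append(answer)
        return self

    @property
    def root_cause(self) -> Optional[str]:
        # The last answer in the chain is treated as the root cause.
        return self.whys[-1] if self.whys else None


cause_map = CauseMap("Mechanical failure", "Vibration", "Bearing failure")
cause_map.ask_why("Scheduled lubrication PM was missed") \
         .ask_why("All hands were on deck for an unplanned fan failure") \
         .ask_why("Missed critical PMs are closed rather than rescheduled")

print(cause_map.root_cause)
```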

and this is the final stage this is the

action portion so you can see here this

is the cause map that you just created now if

you do need to modify anything you can

go back to the failure analysis tab uh

via the breadcrumbs but right now you

can basically take a look at this um and

identify what sort of actions you need

to take to address this so in this case

um one of the things is that basically

the PM was missed right so we want to

identify why the PM was missed

um and some some notes here

basically because this was missed um we

need to discuss with maintenance how to

mitigate this in the future I guess

the current practice is basically the PM

is closed and we wait for the next one

to come along but for certain PMs right

especially ones like this that can't be

the current

practice um max berson he is the

maintenance um he is in charge of kind

of assigning the different PMs he can

take a look at how we can potentially

address this in the future we should

probably give him some time right uh

probably at least a month or two right

he needs to go back through double check

double check what the current practice

is and communicate that to his

team next uh we're noticing that basically

because there was no continuous

vibration monitoring on this piece of

equipment um it was only picked up

because an operator heard the sounds

right at which point it's not uh

it's not

savable and so you know we've been

hearing about all these iot sensors

right they they uh they return

real-time data this could have

potentially caught it before it became a

stage four uh bearing failure why don't

we look into some of these iot sensors

there's a lot on the market right but um

you know based off our process and you

know needed temp requirements and things

like that we can probably find something

that we can install on there and ensure

that we are seeing the vibration data in

real

time so Jill Smith she is a reliability

engineer she's going to go through she's

going to take a look at any sort of

associated um iot sensor companies that

look like they could be a good fit for

our

application all right new

actions so the team has deliberated okay we

feel that these actions are are good um

and are addressing the issues now if we

do need to come back and add actions in

the future we can go ahead um if we you

know think about something else or you

know X Y and Z otherwise we feel we feel

confident that this is going to address

this issue so now that you've created

your actions you're going to go ahead

and return

home and now you can see that um these

are um by RCI um you can see that this

one was created

um was created and so now you can go

back and reference that in the

future
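
The workflow shown in the demo (required failure details with an automatic impact total, a timeline that locks after a certain point, and action items that can only go to captured participants) can be sketched as below. This is a hedged illustration: the class `Rca`, the stage names, the field names, the asset ID, the dates, and the impact amounts are all assumptions, and the exact point at which the real application locks edits is not stated in the talk.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple

STAGES = ("failure_details", "timeline", "failure_analysis", "actions")
REQUIRED = ("asset_type", "asset_id", "description",
            "safety_impact", "operational_impact", "production_impact")
IMPACT_FIELDS = REQUIRED[3:]


@dataclass
class Rca:
    details: Dict[str, object]
    participants: List[str] = field(default_factory=list)
    timeline: List[Tuple[date, str]] = field(default_factory=list)
    actions: List[dict] = field(default_factory=list)
    stage: str = STAGES[0]

    def __post_init__(self) -> None:
        # Every failure-detail field is required, as in the demo form.
        missing = [name for name in REQUIRED
                   if self.details.get(name) is None]
        if missing:
            raise ValueError(f"Missing required fields: {missing}")

    @property
    def total_impact(self):
        # The form sums the individual impacts automatically.
        return sum(self.details[name] for name in IMPACT_FIELDS)

    def advance(self) -> None:
        """Save and continue to the next stage, if there is one."""
        position = STAGES.index(self.stage)
        self.stage = STAGES[min(position + 1, len(STAGES) - 1)]

    def add_event(self, when: date, note: str) -> None:
        # Assumed lock point: timeline edits stop once the cause map is saved.
        if self.stage == "actions":
            raise PermissionError("Timeline is locked after the cause map")
        self.timeline.append((when, note))
        self.timeline.sort(key=lambda event: event[0])

    def assign_action(self, owner: str, task: str, due: date) -> None:
        # Only people captured as participants can own an action item.
        if owner not in self.participants:
            raise ValueError(f"{owner} is not a participant in this RCA")
        self.actions.append({"owner": owner, "task": task, "due": due})


# Hypothetical values throughout; only the names and asset type echo the demo.
rca = Rca(
    details={
        "asset_type": "Centrifugal pump",
        "asset_id": "P-101",
        "description": "Shut down due to high overall vibration in the DE bearing",
        "safety_impact": 0,
        "operational_impact": 5_000,
        "production_impact": 50_000,
    },
    participants=["Bob Costa", "Jill Smith"],
)
rca.add_event(date(2023, 12, 1), "Pump shut down")
rca.add_event(date(2023, 6, 15), "New rebuilt assembly installed")
for _ in range(3):
    rca.advance()
rca.assign_action("Jill Smith", "Evaluate IoT vibration sensors",
                  date(2024, 2, 1))
print(rca.total_impact, rca.stage, rca.timeline[0][1])
```

Keeping the participant check inside `assign_action` mirrors the point made during the demo: anyone who may need to own an action has to be captured as a participant first.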

all right um so that is the

demonstration again um a couple quick

items to note here so you can navigate

to the RCAs via here um so for example

if you want to see

um see this RCA another sample RCA you

see oh I want to reference like what was

discussed here it might have been a

while you can go back and see based off

the action item the RCA similarly you

can go back and reference the RCA

here oh

is not created it

is let's see this now you can basically

see what the information is that was

captured in that

RCA okay now I'm going to go back over

to Gavin um and he's going to finish out

this uh this

presentation thank you very much uh

Nicole the the one thing I will mention

is this is a blueprint so what that

means is if there's anything extra you

want to change you want to add to you

can adapt this to your own processes you

can change this to to your own way of

doing things um all the data is um

captured if you can go to the next slide

please Nicole all the data is captured

and the the other thing is even if you

flag something as deleted it's not

deleted from the database it's flagged

so you can't actually bring it up
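
The "flagged, not deleted" behavior described here is commonly called a soft delete. The sketch below shows the general pattern with an in-memory store; the names `TimelineEvent`, `EventStore`, and `is_deleted` are hypothetical, and the blueprint's real database schema is not shown in the webinar.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TimelineEvent:
    event_id: int
    note: str
    is_deleted: bool = False  # soft-delete flag; the row itself is kept


class EventStore:
    def __init__(self) -> None:
        self._rows: Dict[int, TimelineEvent] = {}

    def add(self, event: TimelineEvent) -> None:
        self._rows[event.event_id] = event

    def delete(self, event_id: int) -> None:
        # Flag the row instead of removing it from storage.
        self._rows[event_id].is_deleted = True

    def visible(self) -> List[TimelineEvent]:
        # What the application shows: flagged rows are filtered out.
        return [e for e in self._rows.values() if not e.is_deleted]

    def deleted_count(self) -> int:
        # Flagged rows remain available for metrics and reporting.
        return sum(1 for e in self._rows.values() if e.is_deleted)


store = EventStore()
store.add(TimelineEvent(1, "New rebuilt assembly installed"))
store.add(TimelineEvent(2, "Event entered by mistake"))
store.delete(2)

print([e.note for e in store.visible()], store.deleted_count())
```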

however you can put a ton of metrics on

top of that and actually bring a lot of

that up we will be expanding on the

blueprints as well um and adding a lot

of different feedback options for the

reporting Etc so where can you find it

um it is part of our blueprints

accelerators and patterns um we covered

that in a prior webinar what that is how

to access it um on that page if you hit

the landing page bottom right there's an

RSS feed if you click that it'll

actually um give you the RSS feed you

can load into your Outlook Etc and then

you can be informed whenever we publish

new ones um out this particular one uh

should be out um just after these

webinars just before the holidays um and

you'll be able to access it uh and and

go from there you can also contribute to

these as well so if you have anything

you feel that you need to contribute uh

please don't be shy next slide

please and with that um thank you Nicole

for uh running us all through that um

this will be uh it for the webinars for

the year so we'll be taking a short

break uh for all the holidays that

everyone's going to go on and we'll be

seeing you in February

2024 um so be safe thank you for the

Fantastic um 2023 for attending feedback

and and comments um and we will pick

this up in February of next year thank

you all thank you

everyone


