THE REPUBLIC OF TURKEY
BAHCESEHIR UNIVERSITY

PROACTIVE MAINTENANCE OF THERMAL POWER PLANTS UNDER LIMITED OBSERVATIONS

Master’s Thesis

MO`TASEM ABUSHANAP

ISTANBUL, 2013

GRADUATE SCHOOL OF NATURAL AND APPLIED SCIENCES
INDUSTRIAL ENGINEERING

Supervisor: ASSIST. PROF. DEMET ÖZGÜR ÜNLÜAKIN

Name of the thesis: Proactive Maintenance of Thermal Power Plants Under Limited Observations
Name/Last Name of the Student: Mo`tasem Abushanap
Date of the Defense of Thesis: 5/6/2013

The thesis has been approved by the Graduate School of Natural and Applied Sciences.

Assoc. Prof. Tunç BOZBURA
Graduate School Director

I certify that this thesis meets all the requirements as a thesis for the degree of Master of Science.

Assist. Prof. Barış SELÇUK
Program Coordinator

This is to certify that we have read this thesis and we find it fully adequate in scope, quality and content, as a thesis for the degree of Master of Science.

Examining Committee Members
Thesis Supervisor: Assist. Prof. Demet ÖZGÜR ÜNLÜAKIN
Member: Prof. Taner BİLGİÇ
Member: Assist. Prof. Ethem ÇANAKOĞLU

ACKNOWLEDGEMENT

First of all I would like to thank the mighty Allah for everything that I achieved. Then I want to thank my supervisor Assist. Prof. Demet Özgür Ünlüakın for being patient and for all the help she provided. Also I would like to thank Prof. Taner Bilgiç and Assist. Prof. Ethem Çanakoğlu for all the advice and notes they gave. And I appreciate the visit of Prof. Mehmet Barış Özerdem and the advice he provided. Many thanks to my father and my mother for all the support they gave.
Thanks to my fiancée for standing by and supporting me. And finally thanks to everyone who contributed in any way to this work.

ABSTRACT

PROACTIVE MAINTENANCE OF THERMAL POWER PLANTS UNDER LIMITED OBSERVATIONS

MO`TASEM ABUSHANAP
INDUSTRIAL ENGINEERING
THESIS SUPERVISOR: DEMET ÖZGÜR ÜNLÜAKIN
JUNE 2013, 103 pages

In the past few decades, technological advances have made the interactions between the components of a system more sophisticated, which makes maintenance decision making difficult. At this point, effective problem diagnosis plays an important role in determining the right maintenance decisions. The objective of diagnosis is to analyze and determine the most likely causes of a problem; hence, the data or observations gathered so far help the decision maker diagnose effectively. On the other hand, monitoring and predicting system health is also important, especially for dynamic systems where proactive maintenance is preferred to reactive maintenance. Improving system reliability by performing maintenance activities based on early diagnosis, before a serious problem arises, is essential in proactive maintenance. In this study, we consider the proactive maintenance decisions of thermal power plants consisting of interacting components under limited observations over a planning horizon. Maintenance activities are performed at any time by replacing either aging components or gauges in the system. The objective is to determine an optimum proactive maintenance plan over a discrete planning horizon. We use dynamic Bayesian networks (DBNs) for representation and fast inference. We propose two proactive maintenance methodologies and present their predicted maintenance plans. Whenever a replacement decision is made, three different diagnosis techniques are used, two of which are from the literature. Computational analyses show that there is no significant difference among the performances of these methods.
The proposed methodologies can be easily adapted for proactive maintenance planning of other complex dynamic systems.

Keywords: Thermal Power Plants, Dynamic Bayesian Network (DBN), Proactive Maintenance, Reliability

ÖZET (Turkish abstract, given here in English)

PROACTIVE MAINTENANCE OF THERMAL POWER PLANTS UNDER LIMITED OBSERVATIONS

MO`TASEM ABUSHANAP
INDUSTRIAL ENGINEERING
THESIS SUPERVISOR: DEMET ÖZGÜR ÜNLÜAKIN
JUNE 2013, 103 pages

In recent years, with the technological revolution, the interactions between the components of a system have become increasingly complex, and this makes maintenance decisions harder to take. At this point, effective problem diagnosis plays an important role in determining the right maintenance decisions. The aim of diagnosis is to analyze and determine the most likely causes of a problem. For this reason, the data and observations collected up to that moment help the decision maker to diagnose effectively. On the other hand, observing and predicting the health of the system is also very important, especially for dynamic systems where proactive maintenance is preferred to reactive maintenance. Increasing system reliability by carrying out maintenance activities based on early diagnosis, without waiting for a serious problem to occur, is vital in proactive maintenance. In this study, we address the proactive maintenance decisions of thermal power plants with interacting components under limited observations over a planning horizon. Maintenance activities take place by renewing the aging components or the measurement instruments in the system at any time. The aim is to determine the best proactive maintenance plan over a discrete-time planning horizon. We used dynamic Bayesian networks (DBNs) to represent the problem and to perform fast inference. We proposed two proactive maintenance methods and presented their predicted maintenance plans. When a maintenance decision is taken, three different diagnosis methods are used, two of which are from the literature. Computational analyses showed that there is no significant difference between the performances of these methods. The proposed methods can easily be adapted to the proactive maintenance planning of other complex dynamic systems.

Keywords (Anahtar Kelimeler): Thermal Power Plants, Dynamic Bayesian Networks (DBN), Proactive Maintenance, Reliability

CONTENTS

LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS
1. INTRODUCTION
    1.1 STEPS OF THE THESIS
    1.2 OUTLINE OF THE THESIS
2. LITERATURE REVIEW
    2.1 RELIABILITY AND MAINTENANCE
    2.2 BAYESIAN NETWORKS
    2.3 DYNAMIC BAYESIAN NETWORKS
    2.4 RELIABILITY AND MAINTAINABILITY OF THERMAL POWER PLANTS
3. BACKGROUND
    3.1 RELIABILITY AND MAINTENANCE
        3.1.1 Reliability
        3.1.2 Maintenance
            3.1.2.1 Reactive Maintenance
            3.1.2.2 Preventive Maintenance
            3.1.2.3 Predictive Maintenance and Inspection
            3.1.2.4 Proactive Maintenance (Root Cause Maintenance)
            3.1.2.5 Reliability Centered Maintenance
            3.1.2.6 Diagnosis and Prognosis
    3.2 PROBABILISTIC GRAPHICAL MODELS
        3.2.1 Basic Terminologies
            3.2.1.1 Conditional Probability
            3.2.1.2 Joint Probability
            3.2.1.3 Marginal Probability
            3.2.1.4 Independent Events
            3.2.1.5 Transition Probability
            3.2.1.6 Bayes Rule
        3.2.2 Decision Diagrams
            3.2.2.1 Influence Diagrams
            3.2.2.2 Dynamic Decision Networks
        3.2.3 Bayesian Networks
            3.2.3.1 Static Bayesian Networks
            3.2.3.2 Dynamic Bayesian Networks
            3.2.3.3 Inference in Dynamic Bayesian Networks
4. PROBLEM DEFINITION AND MODELING
    4.1 THERMAL POWER PLANTS
        4.1.1 Rankine Cycle
        4.1.2 Components in Rankine Cycle
        4.1.3 Systems of Thermal Power Plants
    4.2 MODELING WITH DYNAMIC BAYESIAN NETWORKS
        4.2.1 Replaceable Nodes
        4.2.2 Process Nodes
        4.2.3 Observable Nodes
        4.2.4 Non-aging (non-degrading) Nodes
    4.3 ESTABLISHING THE MODEL
        4.3.1 Main Replaceable Components
        4.3.2 Gauge Components
        4.3.3 Process Nodes
            4.3.3.1 Hidden Processes
            4.3.3.2 Observable Processes
        4.3.4 Transition Probabilities
        4.3.5 Independent Variables
        4.3.6 Domain of Variables
5. METHODOLOGY
    5.1 ONE CRITICAL PROCESS ALGORITHM
        5.1.1 Notation of SCP
        5.1.2 Algorithm of SCP
    5.2 MULTIPLE CRITICAL PROCESSES ALGORITHM
        5.2.1 Notation of MCP
        5.2.2 Algorithm of MCP
6. RESULTS AND EVALUATION
    6.1 DESIGN OF EXPERIMENTS
    6.2 ANALYSIS OF EXPERIMENTS
    6.3 PREDICTED MAINTENANCE PLAN
7. CONCLUSION
REFERENCES

LIST OF FIGURES

Figure 3.1: Taxonomy of maintenance strategies
Figure 3.2: Categories of reactive maintenance
Figure 3.3: Elements of preventive maintenance
Figure 3.4: Proactive maintenance methods used to extend the life of components
Figure 3.5: Components of RCM
Figure 3.6: Probabilistic graphical representations
Figure 3.7: Transition probability
Figure 3.8: Influence diagram
Figure 3.9: Dynamic decision networks
Figure 3.10: Bayesian network
Figure 3.11: a- Diagnosis reasoning, b- Prognosis (predictive) reasoning
Figure 3.12: Dynamic Bayesian networks (DBNs)
Figure 4.1: Schematic representation of thermal power plant
Figure 4.2: Simple Rankine cycle
Figure 4.3: Process flow in thermal power plant using Rankine cycle
Figure 4.4: Components of Rankine cycle
Figure 4.5: Bayesian network for thermal power plant
Figure 4.6: Types of gauges used in the model
Figure 4.7: Types of process nodes
Figure 4.8: Dynamic Bayesian network for thermal power plant
Figure 5.1: Outline of the algorithms
Figure 5.2: Groups of MCP critical nodes

LIST OF TABLES

Table 1.1: Power plant types and their properties
Table 3.1: Applications of maintenance strategies
Table 3.2: Differences between diagnosis and prognosis
Table 3.3: Transition probability table
Table 4.1: Types of nodes in the Bayesian network of thermal power plant
Table 4.2: Initial probabilities of main components
Table 4.3: Initial probabilities of gauges
Table 4.4: Causal probabilities of cold water
Table 4.5: Causal probabilities of plate rotation
Table 4.6: Causal probabilities of pressurized water
Table 4.7: Causal probabilities of ignition
Table 4.8: Causal probabilities of heat
Table 4.9: Causal probabilities of pressurized steam
Table 4.10: Causal probabilities of turbine rotation
Table 4.11: Causal probabilities of pressure drop
Table 4.12: Causal probabilities of electricity
Table 4.13: Causal probabilities of high voltage
Table 4.14: Causal probabilities of temp1
Table 4.15: Causal probabilities of temp2
Table 4.16: Causal probabilities of press1
Table 4.17: Causal probabilities of press2
Table 4.18: Causal probabilities of press3
Table 4.19: Causal probabilities of RPMM1
Table 4.20: Causal probabilities of RPMM2
Table 4.21: Causal probabilities of voltage
Table 4.22: MTTF of aging components and their reliabilities
Table 4.23: Evaluating the working probabilities of variables under various evidences
Table 4.24: Domain of variables
Table 5.1: Observable nodes and their related gauges
Table 6.1: SCP evaluation data at 0.5 threshold
Table 6.2: FEM evaluation data at 0.5 threshold
Table 6.3: FEL evaluation data at 0.5 threshold
Table 6.4: MCP evaluation data at 0.5 threshold
Table 6.5: SCP evaluation data at 0.75 threshold
Table 6.6: FEM evaluation data at 0.75 threshold
Table 6.7: FEL evaluation data at 0.75 threshold
Table 6.8: MCP evaluation data at 0.75 threshold
Table 6.9: Summary of data evaluated by SCP, FEM and FEL at threshold of 0.5 and 0.75
Table 6.10: ANOVA test results for main replaceable components at threshold of 0.5
Table 6.11: ANOVA test results for main replaceable components at threshold of 0.75
Table 6.12: ANOVA test results for gauges at threshold of 0.5
Table 6.13: ANOVA test results for gauges at threshold of 0.75
Table 6.14: Summary of MCP data evaluation at threshold 0.5 and 0.75
Table 6.15: Predicted maintenance schedule using SCP, FEM and FEL at threshold of 0.5
Table 6.16: Predicted maintenance schedule using SCP, FEM and FEL at threshold of 0.75
Table 6.14: Predicted maintenance schedule using MCP at threshold of 0.5 and 0.75
ABBREVIATIONS

BN : Bayesian Network
HBN : Hybrid Bayesian Network
FT : Fault Tree
DD : Decision Diagram
DAG : Directed Acyclic Graph
DBN : Dynamic Bayesian Network
DFT : Dynamic Fault Tree
DDD : Dynamic Decision Diagram
GMM : Gaussian Mixture Models
TTF : Time To Failure
MDP : Markov Decision Process
FMEA : Failure Mode and Effect Analysis
MTTF : Mean Time to Failure
RCM : Reliability Centered Maintenance
TBN : Temporal Bayesian Network
CTBN : Continuous Temporal Bayesian Network
RBN : Recursive Bayesian Network
DDN : Dynamic Decision Network
HMM : Hidden Markov Model
MC : Markov Chain
DOOBN : Dynamic Object Oriented Bayesian Network
HDBN : Hybrid Dynamic Bayesian Network
PM : Preventive Maintenance
PT and I : Predictive Maintenance and Inspection
PGM : Probabilistic Graphical Model
HVAC : Heating, Ventilation and Air Conditioning
US : United States
NASA : National Aeronautics and Space Administration
BDD : Binary Decision Diagram
DDD : Data Decision Diagram
ID : Influence Diagram
Dis : Distributor
SP : Spark Plug
BAT : Battery
R : Radio
ES : Engine Start
AE : Age Exploration
PG : Pressure Gauge
TempG : Temperature Gauge
Press : Pressure
Temp : Temperature
RPM : Tachometer (Revolutions per Minute)
RPMM : Tachometer reading
RAMS : Reliability, Availability, Maintainability and Supportability
CBM : Condition Based Maintenance
RBM : Risk Based Maintenance
CNC : Computerized Numerical Control
RTF : Run To Failure
RCFA : Root Cause Failure Analysis
SCP : Single Critical Process
MCP : Multiple Critical Process
FEM : Fault Effect Myopic
FEL : Fault Effect Look-ahead
ε : Evidence
Pr : Pressure
F : Force
A : Area
Lc : Critical component threshold
Lr : Main component threshold
P : Probability
O_it : State of observable node i at time t
G_jt : State of gauge component j at time t
S_t : State of critical process at time t
C_kt : State of main component k at time t
X' : Average
σ : Standard deviation

1. INTRODUCTION

In the past decades, as systems have grown more complex, methodologies for controlling them have become a main field of study, so that decision makers can analyze situations accurately and manage them. Many methodologies have been applied to static systems, such as Bayesian networks (BNs), fault trees (FTs) and decision diagrams (DDs). Many of these methodologies combine probability theory with graph theory. Generally, directed acyclic graphs (DAGs) are used to represent the system by nodes and arrows, where nodes represent the variables in the system and arrows define the relations between the nodes via conditional and transition probabilities.

In real life, most systems are dynamic, so the mentioned methodologies have been extended to take time into consideration. The extended methodologies are dynamic Bayesian networks (DBNs), dynamic decision networks (DDNs) and dynamic fault trees (DFTs). The main difference between static and dynamic systems is that in a dynamic system components generally age or degrade. In other words, the variables may move from one state to another over time according to a probability distribution. The working state of a car battery and the working state of a pressure gauge sensor in an air conditioning system are examples of such variables.

Maintenance is a very important aspect of controlling a system, and it receives more attention as systems become more complex. Many methodologies and strategies have been suggested and applied in many fields. The main maintenance activities are categorized by the way the action is taken. For example, reactive maintenance is applied only if a problem happens.
Preventive maintenance is applied according to a schedule provided by the designer so that the component reaches its design life (planned maintenance). Predictive maintenance and inspection (PT and I) collects observations and analyzes them to determine the states of the other variables. Proactive maintenance is a strategy that combines time-based and condition-based maintenance to minimize the probability of system breakdown. Reliability centered maintenance combines the above activities in order to maximize the reliability of the system and to minimize the total downtime by scheduling maintenance activities earlier.

Energy conversion plants produce electricity by converting heat or potential energy. Energy conversion plants, or electricity production plants, are very complex systems involving many components that interact with and affect each other. In such plants, since satisfying the demand at the right time is critical, downtime must be kept at a minimum; in other words, the availability of the plant must be at its maximum. All types of maintenance activities (replacement, repair, inspection) must be used together to achieve a minimum level of plant downtime.

There are many types of power generation plants. They can be classified according to the energy conversion principle used as follows:

1- Thermal power plants or fossil fuel power plants (e.g. steam power plants, gas turbines).
2- Nuclear power plants (use nuclear energy generated in the reactor instead of fuel).
3- Geothermal power plants (use geothermal energy instead of fuel).
4- Hydraulic power plants (e.g. dams that use the potential energy of water).

The main characteristics of each of these four types of power plants are shown in Table 1.1.

Thermal power plants are very old power generation plants.
They first appeared during the industrial revolution in Europe in the late 18th century. At that time, thermal power plants used reciprocating engines. After a few years, turbines were invented, and they were considered as an alternative to the reciprocating engine for the following reasons: turbines provide higher speeds, they are compact machines, and they provide stable speed regulation, which allows turbine operation to be synchronized with the operation of generators.

Thermal power plants are complex systems, with thousands of components and subsystems that are related to and affect each other. Thermal power plants usually use reactive, preventive and inspection maintenance activities to control the processes and systems, and to determine the maintenance schedule of the plant. These schedules help to increase the availability of the plant. Normally the availability of thermal power plants ranges from 70 to 90%. Reliability improvement mainly depends on preventive maintenance by providing better designs. Most of the work done on thermal power plants uses preventive and reactive maintenance strategies.

Table 1.1: Power plant types and their properties

| Property | Thermal Power Plant | Hydraulic Power Plant | Geothermal Power Plant | Nuclear Power Plant |
|---|---|---|---|---|
| Fuel | Coal, gas, diesel | Flowing water | Heat from lava gaps | Radioactive elements (U, Pa, etc.) |
| Transportation of fuel | Trucks (easy) | Hydraulic gates | Heat transfer to water in the pipes | Hard and needs complex technology |
| Location | Could be placed anywhere | On rivers/dams | Next to areas that contain lava gaps | Near water sources |
| Environment | Greenhouse emission, global warming | Disturbs fish habitat | Greenhouse emission, global warming (but less than thermal power plants) | Nuclear waste |
| Efficiency | 30-55% (depends on the cycle) | 80-94% (turbine efficiency) | 20-40% (depends on the temperature) | 33% |
| Energy share | 70% | 7% | - | 19% |

Table 1.1 compares the different types of power plants with respect to fuel, transportation of the fuel, location, environmental effect, efficiency and energy share. The energy shares and the efficiency of nuclear power plants are taken from the US Energy Information Administration data sheets. The efficiency of the turbine in hydraulic power plants is taken from Saadat (2011). Lastly, the efficiency of geothermal power plants is taken from REPP-CREST: geothermal resources.

Thermal power plants are the most widely used in power generation. This can be attributed to many reasons, such as:

1- Relatively cheap fuel costs.
2- High efficiency (36-40 % for thermal power plants and 30 % for nuclear).
3- Fewer location constraints.
4- Availability of fuel is higher than for other plants.

In this thesis, we study the scheduling of maintenance activities of thermal power plants over a given planning horizon. A thermal power plant is a facility that produces electrical energy from thermal energy. Normally it operates 24 hours a day. To deal with this situation, we have to use predictive and preventive maintenance methodologies to prevent failure and minimize downtime. We propose a proactive maintenance methodology under limited observations in time with a predefined threshold policy.
We use dynamic Bayesian networks (DBNs) to perform inference on the observations and to calculate the reliability of the components of the system, which is then compared with a predefined threshold.

1.1 STEPS OF THE THESIS

The thesis proceeded in the following steps:

1- Problem selection and definition. Maintenance scheduling of a thermal power plant is selected as the problem.
2- Problem representation. The problem is represented using Bayesian networks and dynamic Bayesian networks.
3- Data collection. Initial reliabilities of components are collected.
4- Model development. The model is developed using the Bayes Net Toolbox (Murphy 2002), and relations between components are defined.
5- Model verification. The model is verified using GeNIe.
6- Development of solution methodologies. Proactive maintenance scheduling plans are developed.
7- Computational study. Results are collected using the proposed solutions.
8- Results and evaluation. The collected results are discussed and compared to other methodologies.

1.2 OUTLINE OF THE THESIS

In Chapter 2, the literature on maintenance, reliability, Bayesian networks and dynamic Bayesian networks is reviewed, and the main aims and methodologies are mentioned. In Chapter 3, definitions of the main concepts used in the thesis are provided: reliability, maintenance, maintenance types, probabilistic graphical models, conditional probability, joint and marginal probabilities, reasoning types, Bayes rule, Bayesian networks, dynamic Bayesian networks and dynamic decision networks. In Chapter 4, the problem is defined: the thermal power plant, its main components, the assumptions made, the interrelations between components and a full description of each are provided. In Chapter 5, the proposed methodology is presented, including an explanation of the method, the steps used in the solution and the algorithms used.
In Chapter 6, the experimental design and computational results of the proposed method are presented, and comments on the performance of the method are provided.
Chapter 7 gives the conclusions and future study directions of the thesis.

2. LITERATURE REVIEW
The last decades have witnessed a very wide interest in maintenance and troubleshooting of complex systems, because industry has introduced very complex systems with highly interacting components. Measures such as reliability and availability are defined to assess the level of success of complex systems. Reliability is the probability that a system or component works in its desired way. Availability is the period of time during which the system works in the desired way. This chapter reviews the literature on the following subjects: reliability and maintenance strategies, Bayesian networks (BNs), dynamic Bayesian networks (DBNs), and reliability and maintenance of thermal power plants.

2.1 RELIABILITY AND MAINTENANCE
Reliability and maintenance are two essential concepts for all types of industry, and many strategies and approaches have been suggested. The following review mentions some of the work done on reliability and maintenance.

Chockie and Bjorkelo (1992) examine four organizations to evaluate system and component aging degradation. They identify four key elements of an effective maintenance program: the selection of critical components in the system, the development of an understanding of system aging through analysis of performance, the development of suitable preventive (time-based) and predictive (condition-based) maintenance tasks to manage aging, and a feedback mechanism for continuous improvement.

Lapa et al. (2000) aim to improve the availability of a nuclear power plant by optimizing the preventive maintenance plan. The optimization uses genetic algorithm and probabilistic safety.
The genetic model uses unconstrained optimization, which permits variation in the maintenance plans.

Koca and Bilgic (2004) propose a new approach to perform troubleshooting with dependent actions by defining dependent sets. The performance of this heuristic approach is tested against other approaches, and they show that an optimal troubleshooting sequence can be achieved. The methodology is built on the fact that "efficiency of action is its priority", but it only observes 'dependent sets', not the whole system.

Jesus et al. (2009) propose a method for evaluating the reliability and availability of gas turbines in electrical power stations. The method appears suitable for systems with interrelated components. Since the method is based on reliability concepts, it permits the identification of critical components for maintenance and defines the system reliability and availability quantitatively. They propose to use reliability centered maintenance to improve the reliability of the system and decrease unexpected failures of critical components. Failure mode and effect analysis (FMEA), mean time to failure (MTTF) and a functional tree are used and applied to two gas turbines of type F series.

Gomes and Gudwin (no date) present aspects related to the design and implementation of an intelligent system for predictive maintenance (condition based maintenance) of a hydro-electric power plant. The presented system (SIMPREAL) is based on reliability centered maintenance (RCM). The system contains two parts. The first is the server, which collects data from sensors. The second is the client, where a Java applet runs; it is responsible for presenting a synoptic of the system to the user.

Gupta (2011) develops a simulation model for evaluating the performance of a standard feed water system of a thermal power plant using birth/death Markov processes and a probabilistic approach.
The model is built by drawing the transition diagram, which is translated into differential equations. These differential equations can be solved recursively using a probabilistic approach to predict the steady-state availability. The simulation model is then built, after which an availability matrix is constructed and the failure/repair rates of all subsystems are plotted.

Özgür-Ünlüakın and Bilgiç (2012) propose a methodology for reliability centered maintenance scheduling. They aim to minimize the number of replacements while maintaining the system reliability over a predetermined value. The problem is represented with DBNs. Four approaches are used for selecting the components to replace: the fault effect myopic approach, the fault effect look-ahead approach, the replacement effect myopic approach and the replacement effect look-ahead approach.

2.2 BAYESIAN NETWORKS
Many methods have been developed to measure the reliability of a system, such as fault trees (FTs) and Bayesian networks (BNs). BNs have gained interest because of their ability to represent complex systems in a compact and easy way. Variables are represented with nodes, and the interrelation between two variables is represented with an arc between them. BNs combine graph theory and probability theory. The arcs between the variables carry the conditional probabilities between the connected variables. Some of the work done on BNs is reviewed below.

Heckerman et al. (1995) introduce a new decision-theoretic approach to troubleshooting. They aim to determine troubleshooting plans under uncertainty that consist of observations and repair actions. The approach is tested on printing machines, automobile startup problems, a photocopier feeder system and gas turbines. The results show that the generated plans are close to the optimal ones. A Monte Carlo technique is used to estimate troubleshooting costs, and the interrelations between system components are represented by BNs.
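To make the node-and-arc description of BNs given at the start of this section concrete, the following is a minimal sketch of a three-node network in which two component nodes are parents of a system node. The component names, the probability values and the series-system conditional probability table are hypothetical and chosen only for illustration; they are not taken from the thesis model.

```python
from itertools import product

# Hypothetical prior probabilities that each component is working (illustrative values).
p_working = {"pump": 0.95, "valve": 0.90}


def p_system_given(pump_ok: bool, valve_ok: bool) -> float:
    """Assumed CPT of the system node: a series system works only if both parents work."""
    return 1.0 if (pump_ok and valve_ok) else 0.0


def system_reliability() -> float:
    """Marginalize the joint P(pump) P(valve) P(system | pump, valve) over the parents."""
    total = 0.0
    for pump_ok, valve_ok in product([True, False], repeat=2):
        p_pump = p_working["pump"] if pump_ok else 1.0 - p_working["pump"]
        p_valve = p_working["valve"] if valve_ok else 1.0 - p_working["valve"]
        total += p_pump * p_valve * p_system_given(pump_ok, valve_ok)
    return total


if __name__ == "__main__":
    print(f"System reliability: {system_reliability():.3f}")
```

Enumeration over all parent states is used here only because the network is tiny; for larger networks an inference engine such as the one in a BN toolbox would normally be used instead.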
Richiardi et al. (2005) use BNs to estimate the probability of verification errors (fails). Given the output of a speaker verification system based on Gaussian mixture models (GMM) and additional information about acoustic noise, the reliability of the system can be evaluated.

Bai (2005) develops an extended Markov Bayesian network model and proposes an algorithm for solving complex software systems. The development process focuses on discrete time failure data. The model is developed in order to deal with the fact that many software systems often depend on the specific operation applied. Operational profile is defined as: "The profile that consists of all operations in the system, that the system is designed to perform and their probability of occurrence. It provides quantities characterization of how the system will operate in the field".

Kryszczuk et al. (2005) use a Bayesian network to represent the decisions and information in a multimodal biometric system. The Bayesian network provides a framework to predict and correct decision errors. Prediction and correction of decision errors are based on modality reliability measures.

Boudali and Dugan (2006) aim to find an alternative formalism for modeling and analyzing large dynamic systems, and to identify the problems and issues related to the available techniques (dynamic fault trees (DFTs), Bayesian networks (BNs), temporal Bayesian networks (TBNs)). The proposed method is continuous time Bayesian networks (CTBNs). The techniques are compared according to their:
1. Modeling power.
2. Ease of determining the model.
3. Computational efficiency.

Marquez et al. (2007) aim to solve any configuration of static and dynamic gates. They propose a methodology using general parametric or empirical time to failure (TTF) distributions. The method uses a hybrid Bayesian network (HBN) model to solve the configuration between the gates without using numerical integration or simulation methods.

Casini et al. (2011) introduce recursive Bayesian networks (RBNs) and apply them to mechanisms. The RBN formalism provides an integrated modeling formalization for prediction, explanation and control. The formalization can be applied to model cancer mechanisms, where hierarchy is ubiquitous and a vast amount of data is available.

Murphy (2001) describes the Bayes Net Toolbox (BNT) as an open source package that works in the Matlab environment. It is used to deal with BNs and DBNs. BNT supports many probability distribution functions. It also supports both exact and approximate inference.

2.3 DYNAMIC BAYESIAN NETWORKS
In real life, not all systems are static. In general, we are interested not only in the current state of the system, but also in its future state. Dynamic systems are handled using the principles of age exploration, prognosis, inspection and prediction. Many methodologies have been proposed, such as dynamic fault trees (DFTs), dynamic decision networks (DDNs), hidden Markov models (HMMs), and dynamic Bayesian networks (DBNs). In DBNs, Bayesian networks (BNs) are extended to a dynamic system by taking time into consideration through transition probabilities that define the future reliability of the system or its components.

Kuenzer et al. (2001) evaluate 6 models to understand which one fits best for modeling the predictions of rule based interaction behavior in a real domain. These models are: hidden Markov models, auto regressive hidden Markov model, factorial hidden Markov model, simple hierarchical hidden Markov model, Markov chain of order and tree structure hidden Markov model.

Weber and Jouffe (2003) propose a method based on dynamic Bayesian networks (DBNs) that allows easy construction of the DBN structure.
For modeling the temporal evolution of complex systems, the correspondence between Markov chains (MCs), fault trees (FTs) and dynamic Bayesian networks (DBNs) is presented and applied in order to estimate system reliability.

Weber et al. (2004) use DBNs to model dependability in complex systems with degradations and failure modes, countered by exogenous constraints. They propose a method that allows DBN structures to be used in modeling the temporal evolution of complex systems.

Montami et al. (2005) propose to describe dynamic gates within a dynamic Bayesian network (DBN) by translating all the basic dynamic gates into a corresponding DBN model. The approach is tested on a complex example taken from the literature. The experimental results show how a DBN can be safely used if a quantitative analysis of the system is required. The approach is able to improve both the modeling and the analysis capabilities of classical fault tree (FT) approaches, by representing more general simplified dependencies and by performing general inference on the resulting dynamic model.

BenSalem et al. (2006) present a method, based on DBNs, for modeling and analyzing the reliability of interrelated components in a system. The influence of time or exogenous variables on the failure (degradation) of the system can be considered. The method allows designing a DBN structure for modeling the temporal evolution of a complex system. The correspondence between Markov chains (MCs) and DBNs is presented and applied to system reliability.

Weber and Jouffe (2006) present a methodology that can be used to develop dynamic object oriented Bayesian networks (DOOBNs). The methodology aims to obtain a general reliability evaluation in manufacturing systems. The presented method allows DOOBN structures to be built easily in order to model the temporal behavior of the state probabilities of complex systems.
Özgür-hQODNÕQ and Bilgiç (2006) try to optimize maintenance activity of a system where components age with constant failure rate together with a budget constraint, and try to optimize it by developing a predictive maintenance plan using DBNs. They first represent the as an optimization problem where mathematical model is used. Because of the complexity of the solution, representation they propose DBN for fast inference under some simplifying assumptions. The objective is to minimize total maintenance cost in a planning horizon such that the reliability of the system never goes down a predetermined threshold value, and maintenance budget is never exceeded. For replacement decision making they propose two approaches: myopic and look ahead. Neil et al. (2009) describe the use of hybrid dynamic Bayesian networks (HDBNs) in order to model risk of operations in the financial institutions in terms of economic capital. Methodology models the losses resulting from international events or sudden accidental events, and characterizes them by their ability to evade controls and which ultimately lead to increasingly severe financial consequences. Their Model focuses on cause and effect of losses of events using DBNs. It is argued that BNs are natural choice and powerful tool for modeling operational problems. They present generalized approach using HDBNs that successfully represent and model dependencies. Özgür-hQODNÕQ and Bilgiç (2010) study dynamic systems having several components, where components can be partially observed via indirect signals. They model the problem as POMDP. The POMDP problem is aggregated in terms of states and actions so it can be 12 optimally solved. The policy of the aggregated POMDP is disaggregated by simulating with DBNs. Varuttamaseni et al. (2011) proposes dynamic Bayesian networks as an alternative of Markov chain. conditional dependencies in DBN simplifies the factorization function of joint probability. 
They analyze feed and bleed in nuclear power systems using a DBN. The analysis helps to understand and evaluate the risks and their effects on the nuclear power system.

2.4 RELIABILITY AND MAINTAINABILITY OF THERMAL POWER PLANTS
Because of the important role that thermal power plants play in people's daily lives, much work has been done on them, and many approaches and strategies have been proposed. In this review, some of these works are mentioned.

Sergaki and Kalaitzakis (2002) propose a database model that ensures both the representation and the handling of fuzzy information. The model additionally allows the user to set the degree of accuracy for the conditions involved in the operation (situation), and then to prepare the maintenance plan of the system. The model is applied in a case study of a power plant.

Krishnasamy et al. (2005) propose a risk based maintenance (RBM) strategy to analyze and determine the maintenance plan. The strategy uses four principles:
1- Scope identification (dividing the big system into smaller systems and subsystems, then analyzing each system and subsystem alone, while collecting data from each and defining potential failure scenarios).
2- Risk assessment (defines the results of each possible failure mode).
3- Risk evaluation (used to decide whether the risk is in the acceptable range or not).
4- Maintenance plan (if the risk evaluation decides not to accept the risk, a maintenance plan is scheduled).
The methodology is applied successfully to the Holyrood thermal power plant.

Eti et al. (2007) try to integrate reliability, availability, maintainability and supportability (RAMS) with risk analysis in order to increase plant availability and improve the reliability of the plant. They determine the maintenance schedule using preventive maintenance and inspection. Risk analysis is used to decide about the action needed.
If the risk is not acceptable, maintenance activities are scheduled.

Ibargüengoytia and Flores (2009) propose three methods using intelligent probabilistic techniques to help operators of thermal power plants, because a thermal power plant is a large and complex system that is influenced by unexpected events and by uncertainty. They present a planning system, a decision support system, and a diagnosis system. The diagnosis is based on qualitative probabilistic methods. The decision support system uses influence diagrams to show the relations between components. The three methods are:
1- A probabilistic model over qualitative changes of variable applications, which shows how probabilistic diagnosis can be carried out utilizing common expert criteria.
2- A method that utilizes influence diagrams, which use probabilistic reasoning and decision theory techniques.
3- A planning system that uses the formalism of Markov decision processes (MDPs), which provides a powerful framework for solving sequential decision problems under uncertainty.

Rusin and Wajacazek (2012) propose a mathematical model for selecting the time between maintenance activities that considers the economic effects and the risk levels related to the operation of components. The model is used to estimate the time between overhauls of steam turbines, which have been identified as one of the main components in the power system. The model is evaluated under four scenarios.

3. BACKGROUND
In this chapter, definitions and explanations of the principles used in this thesis are given. These include Bayesian networks (BNs), dynamic Bayesian networks (DBNs), maintenance, reliability, Bayes theory, reliability centered maintenance (RCM), reactive maintenance, preventive maintenance (PM), predictive maintenance and inspection (PT and I), proactive maintenance, dynamic decision networks, and probabilistic graphical methods (PGM).
3.1 RELIABILITY AND MAINTENANCE
Reliability and maintenance are very important concepts for system availability and for keeping the system in its working state. In the following sections, definitions and characteristics related to reliability and maintenance are given.

3.1.1 Reliability
There are many definitions of reliability in the literature, and they are nearly identical. We provide the basic ones. Reliability is the ability of a device to perform its required operations successfully and in the intended way without failure (Ushakov 2012). Reliability engineering provides the theoretical and practical tools required to determine the probability and capability of systems/components to perform their intended jobs in the intended way, for a specified period of time, in a specified work environment, without failure. The reliability of the systems/components can then be specified and predicted (Milutinović and Lučanin 2005).

Reliability is an inherent feature of the design, which is concerned with the performance of the system or its components in the field. In recent years, many methodologies and suggestions based on the reliability principle have been developed. The importance of reliability can be expressed as follows: as long as the system reliability is within the acceptable range, there is no problem, but when it falls below the acceptable range, a corrective action is required. Since reliability can be predicted, it has become a key factor in predicting the states of dynamic systems.

3.1.2 Maintenance
Maintenance is the total of activities required to retain the system in, or restore it to, the state necessary for fulfillment of the production function (C.W. Gits 1992). Maintenance and its principles have been a challenge in the field since the industrial revolution. It is applied impressively in the field, but it is still a big challenge due to many reasons and factors, such as size, cost and complexity (Dhillon 2002).
Maintenance strategies change along with the underlying philosophies used. For example, the time of acting is an important factor: shall we act after the problem occurs or before? The following strategies are applied in the field:
1- Reactive maintenance (run to failure, RTF).
2- Preventive maintenance (PM) or time based maintenance.
3- Predictive maintenance and inspection (PT and I) or condition based maintenance (CBM).
4- Proactive maintenance or root cause elimination maintenance.
5- Reliability centered maintenance (RCM).

Engineering maintenance is the activity of system or component maintenance that develops principles, concepts and technical data in a conceptual way. These data and concepts can then be used and maintained in the current operating state to ensure the most effective maintenance support. Engineering maintenance uses a maintenance plan, which is a statement of the management and technical procedures to be applied to maintain the system or component. Maintainability is then defined as the probability that a failed item will be restored to its proper working condition.

The taxonomy of maintenance strategies is shown in Figure 3.1. In the following sections, detailed definitions are provided for each of the strategies mentioned above.

Figure 3.1: Taxonomy of maintenance strategies
Source: Taxonomy of maintenance concepts (redrawn after Kothamasu et al. (2006)).

3.1.2.1 Reactive Maintenance
Reactive maintenance is run to failure (RTF) maintenance. It is the traditional maintenance type, in which no action is taken to maintain the system unless a failure has occurred. In some of the literature it is called corrective maintenance. It is used to correct a system or component that does not perform in its desired way.
[Figure 3.1 node labels: Maintenance strategies, Proactive, Planned, Predictive, RCM, Condition Based, Preventive, Constant Interval, Age Based, Imperfect, Reactive/corrective, Unplanned, Corrective, Emergency]

There are many types of reactive maintenance, as shown in Figure 3.2. They are categorized as:
a) Fault-repair: the system/component is restored to its operational state after the failure has occurred.
b) Salvage: disposal and use of salvage material from a non-repairable component/system.
c) Rebuild: restores the component/system as close as possible to its original state.
d) Overhaul: uses 'inspect and repair only as appropriate' and restores the component/system to its total serviceable state.
e) Servicing: provides services to components/systems after repairs, such as welding or recharging.

Figure 3.2: Categories of reactive maintenance
Source: Categories of Reactive Maintenance (Dhillon 2002).

In order to apply the reactive maintenance strategy, the following steps are used:
1- Fault recognition.
2- Localization of the failure.
3- Diagnosis and evaluation of the failure.
4- Repair and take the required actions.
5- Check out.

This strategy has a few advantages, which are as follows:
a) Low cost for maintenance actions.
b) Minimal need for experienced staff.
Some disadvantages are also related to this maintenance strategy:
a) Increased cost due to unplanned downtime.
b) Increased labor cost (overtime when downtime occurs).
c) Cost involved with repair or replacement.
d) Possible secondary system or process damage from system failure.
e) Inefficient use of staff resources.

Reactive maintenance can be applied in every industry. In other words, it can be applied whenever an unplanned downtime occurs in any system.
3.1.2.2 Preventive Maintenance
Preventive maintenance (PM), or time based maintenance, is prescheduled maintenance that is used to keep the system in its working state and to help the system reach its design life. In other words, it consists of the actions performed on a system, based on a schedule, that detect aging (degradation) of a component or system, with the aim of maintaining or extending the useful life of the system by keeping component aging (degradation) at an acceptable level.

The objectives of PM concentrate on enhancing the productive life of components. PM reduces breakdowns of critical components, allows better maintenance schedules and plans that reduce downtime, and minimizes production losses due to component failures (less downtime means more production).

The PM strategy consists of seven elements that are integrated with each other to develop the maintenance plan, as shown in Figure 3.3. These elements are:
a) Inspection: periodic observation to check and compare serviceability to its standard.
b) Servicing: providing services to the component/system, such as lubrication, cleaning, etc., that will help to prevent failure occurrences.
c) Calibration: comparing component/system characteristics to their standards.
d) Testing: periodic tests that are used to determine the degradation of the component/system.
e) Alignment: changes made to the component/system in order to optimize performance.
f) Adjustment: adjusting the component/system in order to optimize performance.
g) Installation: replacement of limited-life components.

Figure 3.3: Elements of preventive maintenance
Source: Elements of PM (Dhillon 2002).

To apply the PM strategy, the following steps must be followed:
1- Identify and choose the areas where PM is going to be applied.
2- Identify PM needs and requirements.
3- Set up the frequency of the assignment.
4- Prepare PM assignment.
5- Schedule PM on an annual basis.
6- Expand the PM program as necessary.

Preventive maintenance has many advantages, which make it an important maintenance strategy. These advantages are as follows:
i. Cost effective.
ii. Component/system performs in a convenient environment.
iii. Balanced work load.
iv. Increase in production revenue.
v. Flexibility allows for adjustment of the maintenance schedule.
vi. Consistency in quality.
vii. Increased component life time.
viii. Energy saving.
ix. Reduced system failure or downtime.
x. Estimated cost savings over reactive maintenance.

The disadvantages of preventive maintenance are:
i. Failures are still likely to occur.
ii. Labor intensive.
iii. Includes performance of unneeded maintenance.
iv. Potential for incidental damage to components in conducting unneeded maintenance.

Preventive maintenance is applied in many areas, such as:
i. US Marine.
ii. Cars and automobiles.
iii. Lubricated systems.
iv. Calibrated systems.
v. Everywhere the failure pattern is known.
vi. Battery inspections.

3.1.2.3 Predictive Maintenance and Inspection
Predictive maintenance and inspection (PT&I), or condition based maintenance, consists of the maintenance actions taken by applying inspection and diagnosis to the system and analyzing the system condition to decide on the actions needed. The diagnosis is done by collecting measurements that detect system degradation (a lower functional state). PT&I allows failures due to aging to be eliminated, and allows the system to be controlled prior to significant deterioration in the component state. The results indicate current and future functional capabilities.

The advantages of predictive maintenance and inspection are as follows:
i. Increases component availability.
ii. Increases component life.
iii. Allows for preemptive corrective actions.
iv. Decreases system failures.
v. Decreases the cost of labor and components.
vi. Better final product quality.
vii. Improves the safety of workers and the environment.
viii. Improves worker morale.
ix. Energy saving.
x. Cost savings over preventive maintenance.

The disadvantages of predictive maintenance and inspection are:
i. Increased investment in diagnosis (observation) components.
ii. Increased investment in staff training.
iii. Savings potential not readily seen by the management.
iv. Increased need for expert staff.

Predictive maintenance and inspection has been applied in different areas where complex systems appear, such as:
1. US navy.
2. NASA technologies.
3. Electrical surveys.
4. Mechanical surveys.
5. HVAC systems.
6. Building surveys.
7. Energy industry.

3.1.2.4 Proactive Maintenance (Root Cause Maintenance)
Proactive maintenance, or root cause maintenance, combines predictive maintenance and inspection with preventive maintenance. It improves maintenance by providing a better design, improving workmanship, improving the installation procedure, providing a maintenance schedule, and improving the maintenance procedure.

Proactive maintenance relies on several methods to extend the life of components/systems, as shown in Figure 3.4. These methods are:
a) Reliability engineering: contains redesign, modification and improvement of components/systems, and improvement of the repairs.
b) Failed item analysis: inspecting and testing the failed component/system in order to define the causes of the failure.
c) Root cause failure analysis (RCFA): determining the basic (root) cause of the failure of the component/system.
d) Specifications for new/rebuilt component/system: recording the historical data involved in the functioning of the components/systems.
e) Age exploration (AE): tries to determine the degradation of components/systems using three factors: technical content, performance interval and task categorizing.
f) Rebuild certification/authentication: verifies that the whole installation works properly.
g) Recurrence control: deals with repetitive failures.
h) Precision rebuild and installation: improves the system by improving the design and the installation procedure.

Figure 3.4: Proactive maintenance methods used to extend the life of components
Source: Elements of Proactive maintenance (Dhillon 2002).

Advantages of proactive maintenance are:
i. Better component/system design by the use of root cause data.
ii. High availability.
iii. High reliability.
iv. Database generated from feedback.
v. Optimized maintenance plan.
vi. Continuous improvements.
vii. Long component/system life.

Disadvantages of proactive maintenance are:
i. Cost savings are not clearly seen by the management.
ii. High demand for trained personnel.
iii. Establishing cost.

Proactive maintenance is applied in many areas, such as:
a) Controlling lubricant fluids.
b) Controlling hydraulic fluids.
c) Controlling coolants.
d) Controlling air.
e) Controlling fuel.
f) HVAC systems.

3.1.2.5 Reliability Centered Maintenance
Reliability centered maintenance (RCM) integrates preventive maintenance, predictive testing and inspection, reactive maintenance, and proactive maintenance to increase the probability that a machine or component will function in the required manner over its design life cycle, with a minimum amount of maintenance and downtime.
RCM is defined as the process used to determine the maintenance requirements that keep the physical states of the system variables operating in the desired way, by integrating PT&I actions with PM actions. It focuses PM activities on the most likely failure modes.

The goals and objectives of RCM are as follows:
1- To set up the system with associated priorities that help PM decide where to focus its actions.
2- To collect feedback that will be used in future improvement of the system.
3- To set up PM tasks in such a way that they can bring the system components back to their working states when deterioration is detected.
4- To achieve all of these goals at minimum cost.

RCM is a combination of many principles that are integrated with each other. RCM is driven by safety and economics. RCM is a function based strategy that focuses on the improvement of system functions. Design limitations are covered by RCM. RCM is a live system in which the age of the system is an important factor. An unsatisfactory condition is defined as a failure in RCM. RCM contains three maintenance tasks (failure finding, time based maintenance, condition based maintenance). These tasks must be both applicable and effective.

Steps of applying RCM:
1- Identify the important components with respect to maintenance.
2- Collect accurate failure data.
3- Set up fault tree analysis data.
4- Use decision methods to decide about critical failure modes.
5- Classify maintenance demands and requirements.
6- Apply RCM decisions.
7- Collect feedback and apply sustaining engineering to the system.

RCM is made up of four components, as shown in Figure 3.5. Reactive maintenance (corrective maintenance) is applied whenever a failure takes place in the system. Preventive maintenance (time based maintenance) is applied according to a predefined schedule.
Predictive testing and inspection (condition based) is applied by inspecting the system components and collecting observations to decide about the condition of the system. Proactive maintenance is applied by providing better designs and installations, using failure and root cause data.

Figure 3.5: Components of RCM (preventive maintenance, proactive maintenance, predictive testing and inspection, reactive maintenance)
Source: (Dhillon 2002)

Advantages of reliability centered maintenance are:
i. The most efficient maintenance program.
ii. Lower costs by eliminating unnecessary maintenance or overhauls.
iii. Minimizes the frequency of overhauls.
iv. Reduces the probability of sudden system failure.
v. Able to focus maintenance activities on critical components.
vi. Increases component/system reliability.
vii. Incorporates root cause analysis.
viii. Improves safety.
ix. Improves production quality.
x. Extends the life of costly components/systems.
xi. Increases component/system availability.
xii. Generates a database from feedback collection.

Disadvantages of reliability centered maintenance are:
i. Can have a significant startup cost.
ii. Trained staff is needed.
iii. Savings potential is not readily seen by management.

Reliability centered maintenance has been applied to many industrial areas such as:
a) Aviation industry.
b) Spacecraft industry.
c) Nuclear industry.
d) US military.
e) Medical services.

Application fields of the mentioned maintenance strategies are summarized in Table 3.1.

Table 3.1: Applications of maintenance strategies
Strategy | Applications
Reactive maintenance | Industry; factories; everywhere unplanned failure takes place.
Preventive maintenance | US Marine; cars and automobiles; lubricated systems; calibrated systems; everywhere the failure pattern is known; battery inspections.
Predictive maintenance | US Marine; NASA technologies; electrical surveys; mechanical surveys; HVAC systems.
Predictive maintenance (continued) | Building surveys; energy industry.
Proactive maintenance | Controlling lubricant fluids; controlling hydraulic fluids; controlling coolants; controlling air; controlling fuel; HVAC systems.
Reliability centered maintenance | Aviation industry; spacecraft industry; nuclear industry; US military; medical services.

3.1.2.6 Diagnosis and Prognosis

This principle comes from health monitoring activities. The main difference between the two concepts is the time they target: diagnosis aims to determine the states at the present moment, while prognosis tries to identify the states in the future. Table 3.2 states the differences between diagnosis and prognosis.

Diagnosis is the process of monitoring the system, collecting observations, and determining the current states of the related variables using the available observations. It is an important action that helps to detect a fault once it happens. The required actions can then be taken to prevent or decrease downtime and to keep the system reliability in the acceptable range. It is used within static systems: when an event happens, its causes are searched for.

Prognosis is the process of estimating and predicting the states of the variables in future time slices, depending on the transition probabilities that describe the aging (degradation) of the variables (components) in the system. It is an important action that helps to predict the maintenance schedule that will minimize downtime, and it also helps to maintain the reliability of the system. It is used within dynamic systems, where future states are predicted from the present ones.

Table 3.2: Differences between Diagnosis and Prognosis
Aspect | Diagnosis | Prognosis
Time | Current. | Future.
Method | Uses observations to determine the current states of the variables. | Uses aging of the variables to estimate the future states of the variables.
Use | Within static systems (same time slice). | With dynamic systems (different time slices).
3.2 PROBABILISTIC GRAPHICAL MODELS

Probabilistic Graphical Models (PGMs) combine graph theory and probability theory. A PGM is a graphical representation made of nodes and arrows. Nodes represent the variables. Arrows (links) represent the relationship between the linked nodes and carry the information. A PGM represents the joint probability of many variables via the use of conditional probabilities (Figure 3.6). PGMs are compact and user friendly tools for representing complex systems and for making complex decisions.

Figure 3.6: Probabilistic Graphical representations
Source: (Koller and Friedman)

PGMs gain importance in engineering because they combine graph theory and probability theory. Graph theory is used for representation, while probability theory is used for decision making.

Murphy (2002) classifies PGMs as:
- Directed graphs: contain directions (arrows). Bayesian Networks (BNs) are directed graphs that express the causal relations between variables. They are popular in statistics.
- Undirected graphs: contain only links with no directions. A Markov random field is an undirected graph used to express soft constraints between the variables in the system. They are popular in physics and vision.

3.2.1 Basic Terminologies

PGMs draw on various concepts from probability theory such as conditional probability, independence and Bayes' theorem. In this section, the basic terminology is described.

3.2.1.1 Conditional Probability

Conditional probability, in brief, is the probability that event A occurs given that event B has occurred. Mathematically:

$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}, \quad P(B) > 0$ (3.1)

Conditional probability gives us a way to reason about the result of an experiment based on partial information (evidence). It is especially useful in sequential experiments, where the probability of an event depends on another.
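As a small illustration of Equation 3.1, the following Python sketch computes a conditional probability from a joint table. The event names (a component that fails or works, a gauge that shows an alarm or a normal reading) and all numbers are invented for illustration and are not taken from the thesis model.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(A = a, B = b); the numbers are illustrative only.
JOINT = {
    ("fail", "alarm"): Fraction(8, 100),
    ("fail", "normal"): Fraction(2, 100),
    ("work", "alarm"): Fraction(9, 100),
    ("work", "normal"): Fraction(81, 100),
}


def probability_of_b(joint, b):
    """P(B = b), obtained by summing the joint table over every value of A."""
    return sum(p for (_, b_value), p in joint.items() if b_value == b)


def conditional(joint, a, b):
    """P(A = a | B = b) = P(A = a, B = b) / P(B = b), as in Equation 3.1."""
    p_b = probability_of_b(joint, b)
    if p_b == 0:
        raise ValueError("P(B = b) must be positive")
    return joint[(a, b)] / p_b


p_fail_given_alarm = conditional(JOINT, "fail", "alarm")
print(p_fail_given_alarm)  # 8/17
```

Exact fractions are used here only so that the result can be checked by hand: 0.08 / (0.08 + 0.09) = 8/17.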
3.2.1.2 Joint probability

Joint probability is the probability that two events A and B happen at the same time (simultaneously). Suppose X and Y are two random variables on the sample space $\Omega$ of an experiment; then the joint probability of X and Y is $P(X = x, Y = y)$. Joint probability is also expressed as $P(X, Y)$ or $P(X \cap Y)$. Joint probability is used to find out how likely it is that two or more events will happen at the same time.

3.2.1.3 Marginal probability

Marginal probability is the probability of occurrence of a single event. It is calculated from the joint probability as:

$P(X = x) = \sum_{y} P(X = x, Y = y)$ for each possible value of x (3.2)

$P(Y = y) = \sum_{x} P(X = x, Y = y)$ for each possible value of y (3.3)

3.2.1.4 Independent Events

Two events are independent if the occurrence probability of one event does not relate to the other event. Events A and B are said to be independent if and only if

$P(A \mid B) = P(A)$ (3.4)

$P(B \mid A) = P(B)$ (3.5)

3.2.1.5 Transition Probability

Transition probability is the probability that a variable changes from one state to another when time moves from one time slice to the next. It is used in dynamic graphical models, such as dynamic Bayesian networks and dynamic decision networks. For a random variable $X_n$, where n denotes the time slice, the transition probability can be expressed as:

$P(X_{n+1} = j \mid X_n = i)$ (3.9)

for the different possible state values i and j that $X_n$ can take. Transition probability is a main concept in dynamic systems since it explains the relation between a variable and itself in the next time slices.

Figure 3.7: Transition probability

Table 3.4: Transition Probability Table
 | X1-Fail | X1-Work
X2-Fail | 1 | 0.3
X2-Work | 0 | 0.7

For the dynamic system given in Figure 3.7, the transition probabilities are given in Table 3.4. In Figure 3.7, X1 is the variable X at time slice 1, while X2 is the same variable X at time slice 2. The link between X1 and X2 is the information link that carries the transition probabilities between X1 and X2.
Table 3.4 says that if the state of X1 is Fail, then X2 will be Fail with probability 1, and if the state of X1 is Work, then X2 will be Work with probability 0.7.

3.2.1.6 Bayes Rule

Bayes rule provides a mathematical method for updating an existing expectation using new (updated) evidence. It is mainly used for inference, combining the total probability theorem with the reverse conditional relation between events. Let $A_1, A_2, \ldots, A_n$ be disjoint events that form a partition of the sample space $\Omega$. Also assume that $P(A_i) > 0$ for all i. Then for any event B with $P(B) > 0$ we get:

$P(A_i \mid B) = \dfrac{P(A_i)\, P(B \mid A_i)}{P(B)}$ (3.10)

$P(A_i \mid B) = \dfrac{P(A_i)\, P(B \mid A_i)}{P(A_1)\, P(B \mid A_1) + \cdots + P(A_n)\, P(B \mid A_n)}$ (3.11)

where the first equality comes from the definition of conditional probability and the second equality comes from the total probability theorem.

In experiments, we are interested in a number of causes that result in an effect; we observe the effect and use it to infer the cause. Let the events $A_i$ denote the causes and the event B denote the effect. Then $P(B \mid A_i)$ is the conditional probability that the effect B will be observed given the cause $A_i$. Given that the effect B has been observed, we can use this information as evidence to calculate (infer) the probability $P(A_i \mid B)$ of the cause $A_i$. Here $P(A_i \mid B)$ is called the posterior probability and $P(A_i)$ is called the prior probability.

3.2.2 Decision Diagrams

Decision diagrams (DDs) are a generalized form of the decision tree. A DD is a directed acyclic graph (DAG) that represents the data structure (state spaces) of a complex system. It contains two node types:
- Chance nodes, which describe "AND".
- Choice nodes, which describe "OR".
Each graph has only one root. The two main types of these graphs are Binary Decision Diagrams (BDDs) and Data Decision Diagrams (DDDs).

3.2.2.1 Influence Diagrams

Influence Diagrams (IDs) are decision diagrams. In influence diagrams three types of nodes are used and two types of arrows are combined.

Node types:
1- Decision node, represented by a square (■).
2- Chance event, represented by a circle (○).
3- Value node, represented by a rounded rectangle.

Arrow types:
1- Solid arrow: if it points to a value or chance node.
2- Dashed arrow: if it points to a decision node.

A typical ID is given in Figure 3.8. The node at the beginning of an arrow is called a predecessor. The node at the end of an arrow is called a successor. In IDs, no cycles are allowed since they are directed acyclic graphs (DAGs). IDs are not flow charts.

Figure 3.8: Influence Diagram (legend: decision node, chance node, value node)

3.2.2.2 Dynamic Decision Networks

A dynamic decision network (DDN) is an extension of the decision network that represents how the system changes over time and models general sequential decision making. The dynamic sequencing of the decisions is represented by the links. The single chance node X is the process node and determines the utility. The node O is the observation node, to which evidence is added before each subsequent decision; the sequencing is represented by the information link. The network is used to maximize the utility U at the end of the time steps. A DDN example is given in Figure 3.9, where D is the decision node at each decision epoch.

Figure 3.9: Dynamic decision networks
Source: (Korb & Nicholson, 2004)

3.2.3 Bayesian Networks

Bayesian networks (BNs) and dynamic Bayesian networks (DBNs) are very helpful for representing the dependencies between system components. In the following sections, the definitions and characteristics of BNs and DBNs, and how to perform inference in DBNs, are given.

3.2.3.1 Static Bayesian Networks

Bayesian networks (BNs) are directed acyclic graphs in which the nodes represent the variables of interest or components (e.g. pressure of a device, gender of a patient, occurrence of an event), and the arrows (links) represent the informational or causal dependencies between the variables.
The strength of dependency between the nodes (variables of interest) is represented by the conditional probabilities attached to each link between parent and child nodes in the network. BNs gain importance since they are a compact representation of complex systems that shows the relations between the components of the system in a clear way (Pearl 1985).

A Bayesian network example is given in Figure 3.10. This is a simplified BN of the well-known automobile diagnosis problem.

Figure 3.10: Bayesian Network

In a Bayesian network, nodes are conveniently classified as parent nodes or child nodes. A parent node is a node with one or more arcs (links) originating from it and directed towards another node (the child node). A child node is a node with one or more arcs (links) directed towards it from other nodes (its parents). A node without a child is called a leaf node. The links in Bayesian networks carry the conditional probabilities that define the relation between the linked nodes. In Figure 3.10, Distributor, Spark plug and Battery are all parents, while Radio and Engine start are both children and, at the same time, leaf nodes.

BNs are directed acyclic graphs. They take the joint probability into consideration. They can deal with both discrete and continuous random variables.

BNs have the following properties:
1- Evidential reasoning, using the given data as evidence. The rest of the nodes are evaluated according to the available information using the joint probability in Equation 3.12:

$P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid Pa_i)$ (3.12)

where
i. $P(x_i \mid Pa_i)$ is the conditional probability of the child $X_i$ given its parents.
ii. $Pa_i$ is the set of predecessors (parents) of $X_i$ among $x_1, x_2, \ldots, x_n$.
2- Reasoning about actions, which is one of the most powerful properties of BNs. When any change or modification happens in the network, the network can be easily modified, and reasoning about the modification can be carried out.
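The factorization in Equation 3.12 and evidential reasoning can be sketched in a few lines of Python. The example below is only loosely inspired by the automobile problem of Figure 3.10: the reduced structure (Battery and Spark plug as parents of Engine start, Battery as parent of Radio), the variable names and every probability are assumptions made for illustration, and the posterior is obtained by brute-force enumeration rather than by any method used in the thesis.

```python
from itertools import product

# Hypothetical priors and CPTs for a reduced car-start network (illustrative numbers).
P_BATTERY_OK = 0.90
P_PLUG_OK = 0.95
P_RADIO_ON = {True: 0.99, False: 0.0}  # P(radio works | battery state)
P_ENGINE_STARTS = {                    # P(engine starts | battery, plug)
    (True, True): 0.98,
    (True, False): 0.0,
    (False, True): 0.0,
    (False, False): 0.0,
}


def bernoulli(p_true, value):
    """Probability of a binary outcome given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(battery, plug, radio, engine):
    """Joint probability as a product of P(node | parents), as in Equation 3.12."""
    return (bernoulli(P_BATTERY_OK, battery)
            * bernoulli(P_PLUG_OK, plug)
            * bernoulli(P_RADIO_ON[battery], radio)
            * bernoulli(P_ENGINE_STARTS[(battery, plug)], engine))


def posterior_battery_ok(evidence):
    """P(battery works | evidence) by summing the joint over all consistent worlds."""
    numerator = 0.0
    normalizer = 0.0
    for battery, plug, radio, engine in product([True, False], repeat=4):
        world = {"battery": battery, "plug": plug, "radio": radio, "engine": engine}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # skip worlds that contradict the observations
        p = joint(battery, plug, radio, engine)
        normalizer += p
        if battery:
            numerator += p
    return numerator / normalizer


print(posterior_battery_ok({"engine": False}))
print(posterior_battery_ok({"engine": False, "radio": True}))
```

Under these assumed numbers, observing that the radio works rules out a dead battery, so the second query shifts the blame for the failed start entirely to the spark plug. Enumeration is exponential in the number of nodes and is shown only because it makes the use of Equation 3.12 explicit.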
There are several ways of reasoning in Bayesian networks (BNs), see Figure 3.11:
1- Diagnosis: collecting observations to determine the current states.
2- Prognosis (predictive): using the current states as evidence to determine (predict) the future states.

Figure 3.11: a- Diagnosis reasoning, b- Prognosis (predictive) reasoning

3.2.3.2 Dynamic Bayesian Networks

A dynamic Bayesian network (DBN) is an extension of a Bayesian network in which the system states change through time. It is a compact representation that describes the system states and their changes through time. It reduces the effect of combinatorial explosion when measuring the reliability of the system. DBNs are helpful tools for dealing with variable (component) aging (degradation) by using transition probabilities (Murphy 2002, Athi 2011, Weber 2003).

An example of a DBN is given in Figure 3.12. This is a dynamic system generated from the static system given in Figure 3.10, where the components Distributor, Spark plug and Battery age through time. As DBNs are compact structures of dynamic systems, they can be considered as tools that are able to deal with aging and maintenance activities.

DBNs are applied in many fields such as robotics, data mining, speech recognition, digital forensics, protein sequencing and bioinformatics. DBNs are powerful tools for predicting the future reliability of systems. Especially in complex systems, maintenance planning and prediction are easier with DBNs. Early predictions make it possible to be prepared on time (e.g. by ordering spare parts earlier, so that the parts are ready when needed).

The easiest way to deal with dynamic Bayesian networks (DBNs) is to develop a stationary Bayesian network in each time slice, define the relations and dependencies inside the Bayesian network (BN), and then connect the dynamic nodes with themselves in the next time slices. Where is the ith node at time t, can be , represent the parents of .
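Predictive reasoning with a single aging node can be sketched as follows. The transition values are those of Table 3.4; the initial belief (the component is known to be working at the first time slice), the function names and the horizon are assumptions made for illustration, and the code is a plain forward propagation rather than a full DBN inference engine.

```python
# Transition probabilities from Table 3.4: TRANSITION[state at t][state at t + 1].
TRANSITION = {
    "work": {"work": 0.7, "fail": 0.3},
    "fail": {"work": 0.0, "fail": 1.0},
}


def step(belief):
    """Propagate a belief over {work, fail} one time slice ahead."""
    next_belief = {"work": 0.0, "fail": 0.0}
    for current_state, p_current in belief.items():
        for next_state, p_transition in TRANSITION[current_state].items():
            next_belief[next_state] += p_current * p_transition
    return next_belief


def predict(initial_belief, horizon):
    """Return the beliefs for the initial slice and the next `horizon` slices."""
    beliefs = [initial_belief]
    for _ in range(horizon):
        beliefs.append(step(beliefs[-1]))
    return beliefs


# Assumed starting point: the component is certainly working in the first slice.
for t, belief in enumerate(predict({"work": 1.0, "fail": 0.0}, horizon=3), start=1):
    print(t, round(belief["work"], 3))
```

With no repair, the working probability decays geometrically (1, 0.7, 0.49, 0.343), which is the kind of degradation curve a maintenance plan would react to.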
Figure 3.12: Dynamic Bayesian Networks (DBNs)

3.2.3.3 Inference in Dynamic Bayesian networks

Inference is a computational method for evaluating a query on a probability model expressed by a BN (D'Ambrosio 1999). Inference can be made to evaluate the state of any variable given evidence about the states of other variables (Needham 2006). Inference uses Bayes' theorem and is simplified by conditional independence. Inference can be either exact or approximate. Exact inference is used when the analytical form of the problem is available and the computation is feasible. It can be used with directed acyclic graphs. Approximate inference is used when the analytical form of the problem is not available or when the time needed for the exact solution is very long. In approximate inference, intractability is the main concern and there are three ways to handle it:
- Sampling, by using Monte Carlo methods.
- Variational methods, the simplest of which is the mean field approximation.
- Belief propagation (BP), which applies the message passing algorithm to the original graph.

4. PROBLEM DEFINITION AND MODELING

A thermal power plant is a complex system with many interrelated components. Setting maintenance plans for thermal power plants therefore requires dedicated solutions. Methodologies to build proactive maintenance schedules using DBNs are proposed. The methodologies aim to minimize the main component repairs in the planning horizon. We assume that the system has a discrete time horizon and that the variables have a discrete sample space. We also assume that no cycling affects the system. A thermal power plant is a facility that produces electricity from thermal energy. It is a facility critical to people's lives.

4.1 THERMAL POWER PLANTS

A thermal power plant is a facility that is responsible for electricity generation. It converts the thermal energy gained from burning fossil fuel in the boiler into pressurized steam that moves to the turbine.
In the turbine the energy is converted into mechanical energy. Once the blades of the turbine rotate, the moving part of the generator (rotor) starts to rotate too. Once the rotor rotates, electrical charges are generated on the stator (the static part of the generator). These charges then move to the transformers, where the high voltage current is produced. Figure 4.1 represents a schematic diagram of a thermal power plant.

Figure 4.1: Schematic representation of thermal power plant
Source: Wikipedia

4.1.1 Rankine Cycle

In thermal power plants, many components and systems are interrelated in order to complete the process. The process is mainly based on the Rankine cycle. Figure 4.2 shows a simple Rankine cycle.

Figure 4.2: Simple Rankine cycle (boiler, condenser, turbine, generator, pump)
Source: (Flynn 2003)

The working principle of the Rankine cycle is as follows:
1- Cold water enters the pump.
2- The pump pressurizes the water.
3- The water flows to the boiler, where heat is transferred from the ignition of fuel into the pressurized water to produce pressurized steam.
4- The pressurized steam flows into the turbine. The turbine structure contains blades that are designed to increase the collision between the pressurized steam and the turbine blades. When the collision happens, it forces the turbine to rotate around its pivot, which is connected to the generator rotor.
5- When the turbine's pivot rotates, the rotor of the generator rotates too. The rotor is built so that it can generate a magnetic field. The stator (the static part of the generator) is made of copper windings, so when the rotor rotates, electrical charges are produced in the stator windings.
6- The electrical charges then flow to the transformer, which is responsible for producing the high voltage.
7- Going back to the turbine: after the collision, the pressure of the steam drops.
The steam flows to the condenser, which is mainly made up of long pipes that are used to cool the steam and condense it to water again.

Figure 4.3 represents the process flow in thermal power plants using the Rankine cycle.

Figure 4.3: Process flow in thermal power plant using Rankine cycle

4.1.2 Components in Rankine cycle

In the Rankine cycle, there are five basic components related to the efficiency of the cycle and to the electricity generation. These are shown in Figure 4.4 in the order below:
- Turbine: a rotating machine covered with blades, rotating around its pivot.
- Boiler: the combustion chamber, where the fuel is burned and chemical energy is transformed into heat.
- Condenser: a device built from long pipes that increase the heat transfer area, which cools the moving fluid.
- Pump: fluid machinery; a device used to increase the pressure of the flowing fluid.
- Generator: made up of a rotor (magnet) and a stator (windings). This combination generates electricity using magnetic fields.

Figure 4.4: Components of Rankine Cycle

4.1.3 Systems of Thermal Power Plants

In the Rankine cycle, different systems interact simultaneously to accomplish the job. These systems are:
- Pumping system: consists of the pumps used in the cycle, which increase the pressure of the moving fluid.
- Boiler system: the system used to produce heat energy from burning the fuel.
- Piping system: all the pipes used to connect the systems with each other.
- Turbine system: the system related to the turbine rotation, extracting heat and pressure energy and converting them to mechanical energy.
- Generating system: the system responsible for generating electrical current by converting mechanical energy into electrical energy.
- Transforming system: the system responsible for transforming the electrical current generated in the generator into high voltage current using electromagnetic induction.
- Condensing system: the system responsible for condensing the low pressure steam to water by extracting heat energy from it.
- Controlling system: the system responsible for controlling the other systems; it consists of all sensors and gauges.

4.2 MODELING WITH DYNAMIC BAYESIAN NETWORKS

A thermal power plant is a very complex system. Analyzing it is not an easy job, and keeping its reliability as high as possible is the main aim of the personnel there, because it is a critical facility that many other facilities depend on, e.g. hospitals, airports and industry. We propose to use dynamic Bayesian networks to model and analyze such a system. Both diagnosis and prognosis of the system at each period of time are possible with DBNs. The data provided by the controlling system is used as evidence in the network, and then the states (reliability) of the other systems can be predicted using DBNs.

Diagnosis can be performed within the static system at each period with the help of a Bayesian network. Once the conditional probabilities of the interrelated systems are defined with the DBN representation, evidence is collected via the data from the controlling system. Then, using the evidence gathered so far, the states (reliability) of the interrelated systems can be calculated.

Prognosis can also be performed using dynamic Bayesian networks. After identifying the transition probabilities that define the aging (degradation) of the systems, and using the evidence gathered so far, the states (reliability) of the systems in future periods can be predicted.

In this work, we model the thermal power plant using DBNs; the static network of the problem, represented with a BN, is given in Figure 4.5. While building the model, some assumptions are made. These assumptions are:
1- We assume no cycling affects the system.
2- Maintenance activities are performed by either repairing or replacing components or gauges.
All kinds of gauge calibration and repair activities are assumed to be perfect repair activities. That is why these are called replacements, since they bring the component or gauge to perfect working condition.
3- We assume a discrete time horizon, and a discrete sample space for each node in the model.
4- We assume the observations are collected at the beginning of each period.
5- We assume each component either works or fails in a period. Once a component or gauge is replaced in a period, its working probability is set to 1.

We classify the nodes used in the BN model of the thermal power plant according to their duties. These are replaceable nodes, process nodes, observable nodes and non-aging nodes. Detailed explanations of these nodes are given subsequently.

Figure 4.5: Bayesian Network for thermal power plant

4.2.1 Replaceable Nodes

The main replaceable nodes are:
- Pump: representing the pumping system.
- Condenser: representing the condensing system.
- Generator: representing the generating system.
- Boiler: representing the boiler system.
- Transformer: representing the transforming system.

Gauge nodes are the nodes where the measuring devices are located. Gauge nodes are used to give readings to the observation nodes. They are directly connected to the hidden process we are interested in. In the BN representation, the following gauge types are used:
- Temperature gauges: two gauges are used. One is used after the condensing process and the other one is used after the boiling process.
- Pressure gauges: three pressure gauges are used. One is used after pumping the water in order to generate pressurized water, the second one is used after the pressurized steam is produced, and the last one is used after the turbine rotation to check the efficiency of the process.
- Tachometers: two tachometers are used. One is used to check the pump rotation, and the other one is used to check the turbine rotation.
- Voltmeter: only one voltmeter is used, to check the voltage of the produced current.

4.2.2 Process Nodes

Process nodes are the nodes where two parent nodes act together to produce a child. The child here is the process node. The following processes are defined as the result of the interaction between the parent components:
- Cold water: the result of the condenser.
- Plate rotation: the result of the interaction between pump and electricity.
- Pressurized water: the result of the interaction between cold water and plate rotation.
- Ignition: the result of the interaction between boiler and fuel.
- Heat: the result of ignition.
- Pressurized steam: the result of the interaction between pressurized water and heat.
- Turbine rotation: the product of the interaction between pressurized steam and turbine.
- Pressure drop: the drop in steam pressure that results from turbine rotation.
- Electricity: the result of the interaction between turbine rotation and generator.
- High voltage: the result of the interaction between electricity and transformer.

4.2.3 Observable Nodes

Observable nodes are the nodes used to collect the readings from the gauge nodes. The following observable nodes are used to collect data from the gauges and sensors of the controlling system:
- Temp 1 is the temperature read on TempG1 (temperature gauge 1), measuring the temperature of the cold water process.
- Temp 2 is the temperature read on TempG2 (temperature gauge 2), after heat is produced from the ignition.
- press1 is the pressure read on PRESSURE GAUGE 1 (pressure gauge 1), after pressurized water is produced by the interaction between plate rotation and cold water.
- press2 is the pressure read on PRESSURE GAUGE 2 (pressure gauge 2), after pressurized steam is produced by the interaction between pressurized water and heat.
- press3 is the pressure read on PRESSURE GAUGE 3 (pressure gauge 3), after the turbine rotation, to check the pressure drop.
- RPMM1 is the reading of tachometer RPM1 after the interaction between pump and electricity that produces plate rotation.
- RPMM2 is the reading of tachometer RPM2 after the interaction between turbine and pressurized steam that produces turbine rotation.
- Voltage is the reading on the voltmeter, which measures the high voltage produced by the interaction between electricity and transformer.

4.2.4 Non-aging (non-degrading) nodes

The non-aging components are the components whose reliability doesn't change through time (time independent). In the model there are three non-aging components. These components are:
- Fuel: the main input for the boiler. It contains the chemical energy required for the ignition.
- Pump electricity: the electricity needed to make the pump work.
- Turbine: the turbine is the heart of the facility, and turbines generally have a very long lifetime. Another reason to consider it a non-aging component is that when it stops, the whole system stops.

Table 4.1 categorizes the nodes into groups and specifies their duties and the related nodes in each group.

Table 4.1: Types of nodes in the Bayesian Network of thermal power plant model
Group of nodes | Nodes definitions | Related nodes | Duty of the node
Observable nodes | All sensor and gauge readings. | Temp1; Temp2; Press1; Press2; Press3; RPMM1; RPMM2; Voltage | Give information about the process states.
Easy to replace and control nodes | All gauges and sensors. | Temperature Gauge 1; Temperature Gauge 2; PRESSURE GAUGE 1; PRESSURE GAUGE 2; PRESSURE GAUGE 3; RPM meter 1; RPM meter 2; Volt meter | The devices used to measure the process states.
Main replaceable nodes | All the aging components used to drive the plant. | Pump; Condenser; Generator; Boiler; Transformer | Produce the main inputs of the processes.
Process nodes | All the children of the main component interactions, or the children of other processes. | Pressurized water; Pressurized steam | Nodes where processes happen.
Process nodes (continued) | | High voltage; Cold water; Plate rotation; Heat; Turbine rotation; Pressure drop; Electricity; Ignition |
Non-aging components | All the time independent components (their reliabilities don't change through time). | Fuel; Pump electricity; Turbine | Critical to other process nodes (needed to perform other processes).

4.3 ESTABLISHING THE MODEL

The BN model is built in Matlab using BNT (Murphy 2002), and verified using the GENIE package. The model consists of 34 nodes. Five of them are main replaceable components, and eight of them are gauges. There are eight observable nodes in the model, each of which is connected to one gauge. There are three non-degrading components in the model. The remaining nodes are divided into two groups: the first is the critical process group, and the second is the regular process group.

4.3.1 Main replaceable components

The main replaceable components are the components that have a high repair cost and are critical to the process of the plant. In the model we have five main replaceable components: pump, condenser, boiler, generator and transformer. In this thesis, the turbine is considered as the soul of all processes and it is assumed that it is not replaceable. The initial probabilities of these components are defined depending on the efficiency of each, and also by collecting data related to each one. The initial probabilities of these components are shown in Table 4.2. The states of each component are defined as working (W) or not working (NW).

Table 4.2: Initial probability of main components
Component | Working Probability | Not Working Probability
Pump | 0.980 | 0.020
Condenser | 0.880 | 0.120
Generator | 0.950 | 0.050
Boiler | 0.988 | 0.012
Transformer | 0.950 | 0.050
Turbine | 1.000 | 0.000

The initial probabilities are defined according to the components' efficiencies.

Pumps have high reliabilities. They are simply an electrical motor with an impeller.
Failure is mainly caused by the electrical motor of the pump burning out, or by cavitation after working for a long time.

The condenser in this thesis consists of the condenser itself and the pipes. The main task of the condenser is to supply cold water, but friction in the pipes, which can generate heat, makes the task of the condenser less reliable. Failure modes are either burning of the condenser motor or leakage in the pipes.

Generators are highly reliable components. Their failure modes take place either when the bearing of the rotor fails or when the windings of the stator burn.

Boilers are combustion chambers. They are highly reliable components. Their failure mode can be a failure in the ignition system or leakage in the body.

Transformers are highly reliable components. Their failure mode is generally burning of the windings of the poles.

4.3.2 Gauge Components

The gauge components are devices that measure the efficiency of the processes in the system. Temperature gauges, pressure gauges, tachometers and voltmeters are the gauge components defined in the model. The main characteristic of these components is that they need to be calibrated from time to time in order to give accurate readings. There are four types of gauges used in this model, as shown in Figure 4.6. They are:
- Temperature gauges: there are two of them, used to read temperature.
- Pressure gauges: there are three of them, used to read pressure.
- Tachometers: there are two of them, used to read revolutions per minute.
- Voltmeter: there is one, used to read the voltage.

Figure 4.6: Types of gauges used in the model (temperature gauges, pressure gauges (PGs), tachometers (RPMs), voltmeter)

These components are used to control and check the efficiency and reliability of each process. Failure modes of these gauges depend on the type of gauge used. For example, there are many temperature reading devices on the market, such as infrared temperature reading devices, anemometer measuring devices, and mercury temperature reading devices.
The mercury device is the most widely known temperature reading device. In thermal power plants, however, the temperature of a hot surface has to be read, so the most logical choice is an infrared measurement device. This device can fail either because of calibration error or because of failure of the thermal sensors inside it.

There are also many types of pressure gauges. The best known is the gauge with a spring inside. It uses the formula Pr = F/A, where Pr is the pressure exerted on the spring, F is the force acting on the spring, and A is the area on which the force is exerted. As the pressure increases the spring is compressed, and the indicator reads the pressure accordingly. The failure modes of this type are failure of the spring and calibration error of the indicator.

Tachometers are devices attached to rotating parts. There are many types, such as laser pointing tachometers or those directly connected to the rotating part. When the rotating part starts to move, the tachometer measures the distance traveled over time. Rotating parts normally have circular cross sections, so by using the circumference of the circle the speed of rotation can be calculated in revolutions per second (RPS) or revolutions per minute (RPM). The failure modes of this type are failure of the internal circuit of the device and calibration error of speed or diameter.

The last type of gauge used is the voltmeter, which reads the voltage of the current. One pole of the device is calibrated to earth voltage (0 V), which is used as the reference value. The second pole is attached to the current, and the difference between the two poles is the reading. The failure modes of these gauges are failure of the internal circuit and calibration error.
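The two gauge relations described above (Pr = F/A for the spring gauge, and rotation speed from distance traveled and circumference for the tachometer) can be illustrated with a minimal sketch. The function names and all numeric values are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import math


def spring_gauge_pressure(force_n: float, area_m2: float) -> float:
    """Pressure Pr = F / A, in pascals for newtons and square metres."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return force_n / area_m2


def rpm_from_distance(distance_m: float, diameter_m: float, elapsed_s: float) -> float:
    """Revolutions per minute from surface distance traveled in elapsed_s seconds."""
    circumference = math.pi * diameter_m      # distance covered by one revolution
    revolutions = distance_m / circumference
    return revolutions / elapsed_s * 60.0


# Hypothetical readings (illustrative values only).
pressure_pa = spring_gauge_pressure(force_n=200.0, area_m2=0.0004)

# A 0.5 m diameter shaft whose surface travels 100 circumferences in 2 seconds.
true_rpm = rpm_from_distance(math.pi * 0.5 * 100, diameter_m=0.5, elapsed_s=2.0)

# The same motion read with a diameter mis-calibrated by +10 percent.
biased_rpm = rpm_from_distance(math.pi * 0.5 * 100, diameter_m=0.55, elapsed_s=2.0)

print(pressure_pa, true_rpm, biased_rpm)
```

The last call shows why a diameter calibration error is listed as a tachometer failure mode: the reading scales inversely with the assumed diameter.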
The states of these components in the model are assumed to be working (W) or not working (NW). The initial probabilities of the gauges, given according to their efficiencies, are shown in Table 4.3.

Table 4.3: Initial probabilities of gauges

| Component | Working probability | Not working probability |
|---|---|---|
| Temperature gauge 1 | 0.93 | 0.07 |
| Temperature gauge 2 | 0.93 | 0.07 |
| Pressure gauge 1 | 0.95 | 0.05 |
| Pressure gauge 2 | 0.95 | 0.05 |
| Pressure gauge 3 | 0.95 | 0.05 |
| RPM meter 1 | 0.90 | 0.10 |
| RPM meter 2 | 0.90 | 0.10 |
| Volt meter | 0.90 | 0.10 |

The duties of the gauges used in the model are as follows:
1- Temperature gauge 1 reads the temperature of the cold water.
2- Temperature gauge 2 reads the temperature of the steam coming from the boiler.
3- Pressure gauge 1 reads the pressure of the water after pumping.
4- Pressure gauge 2 reads the pressure of the steam.
5- Pressure gauge 3 reads the pressure drop after collision in the turbine.
6- RPM meter 1 is a tachometer that reads the speed of the pump plate rotation.
7- RPM meter 2 is a tachometer that reads the speed of the turbine rotation.
8- Volt meter reads the voltage delivered by the transformer.

4.3.3 Process Nodes

A process node is a child of two parent nodes: two components interact to perform the process (the child). The BN model has two types of processes. The hidden processes are those for which we have no direct observations. The observation processes are those whose states we can observe directly. The process nodes defined in the model are shown in Figure 4.7.

4.3.3.1 Hidden Processes

Hidden processes are the processes on which we do not have any direct observation. Nevertheless, there is a causal relation between a hidden process node and its parent nodes, which represents the interaction among them.
Their reliabilities can therefore be predicted. The hidden processes represent the main processes in the plant. There are ten hidden processes in the model. Each process has two states: "A" means that the process is available, whereas "NA" means that it is not available.

Figure 4.7: Types of process nodes. Observable processes: 1- temp1, 2- temp2, 3- press1, 4- press2, 5- press3, 6- RPMM1, 7- RPMM2, 8- voltage. Hidden processes: 1- Cold Water, 2- Plate Rotation, 3- Ignition, 4- Heat, 5- Pressurize Water, 6- Pressurized Steam, 7- Turbine Rotation, 8- Pressure Drop, 9- Electricity, 10- High Voltage.

The conditional probabilities of each hidden process given the states of its parents are given in Tables 4.4-4.13.

Table 4.4: Conditional probabilities of cold water

| Condenser | NW | W |
|---|---|---|
| Cold water NA | 0.99 | 0.20 |
| Cold water A | 0.01 | 0.80 |

For example, the probability that Cold water is in state "A" when the condenser is in state "W" is 0.8.

Table 4.5: Conditional probabilities of plate rotation

| Pump electricity | NA | NA | A | A |
|---|---|---|---|---|
| Pump | NW | W | NW | W |
| Plate rotation NA | 1.00 | 1.00 | 0.90 | 0.00 |
| Plate rotation A | 0.00 | 0.00 | 0.10 | 1.00 |

Table 4.6: Conditional probabilities of pressurized water

| Cold water | NA | NA | A | A |
|---|---|---|---|---|
| Plate rotation | NA | A | NA | A |
| Pressurized water NA | 1.00 | 0.80 | 0.80 | 0.20 |
| Pressurized water A | 0.00 | 0.20 | 0.20 | 0.80 |

Table 4.7: Conditional probabilities of ignition

| Fuel | NA | NA | A | A |
|---|---|---|---|---|
| Boiler | NW | W | NW | W |
| Ignition NA | 1.00 | 1.00 | 1.00 | 0.10 |
| Ignition A | 0.00 | 0.00 | 0.00 | 0.90 |

Table 4.8: Conditional probabilities of heat

| Ignition | NA | A |
|---|---|---|
| Heat NA | 1.00 | 0.00 |
| Heat A | 0.00 | 1.00 |

Table 4.9: Conditional probabilities of pressurized steam

| Pressurized water | NA | NA | A | A |
|---|---|---|---|---|
| Heat | NA | A | NA | A |
| Pressurized steam NA | 1.00 | 0.50 | 1.00 | 0.00 |
| Pressurized steam A | 0.00 | 0.50 | 0.00 | 1.00 |

Table 4.10: Conditional probabilities of turbine rotation

| Turbine | NA | NA | A | A |
|---|---|---|---|---|
| Pressurized steam | NW | W | NW | W |
| Turbine rotation NA | 1.00 | 0.99 | 0.10 | 0.00 |
| Turbine rotation A | 0.00 | 0.01 | 0.90 | 1.00 |

Table 4.11: Conditional probabilities of pressure drop

| Turbine rotation | NA | A |
|---|---|---|
| Pressure drop A | 0.60 | 0.00 |
| Pressure drop NA | 0.40 | 1.00 |

Table 4.12: Conditional probabilities of electricity

| Turbine rotation | NA | NA | A | A |
|---|---|---|---|---|
| Generator | NW | W | NW | W |
| Electricity NA | 1.00 | 0.80 | 0.80 | 0.00 |
| Electricity A | 0.00 | 0.20 | 0.20 | 1.00 |

Table 4.13: Conditional probabilities of high voltage

| Electricity | NA | NA | A | A |
|---|---|---|---|---|
| Transformer | NW | W | NW | W |
| High voltage NA | 0.80 | 0.80 | 0.80 | 0.20 |
| High voltage A | 0.20 | 0.20 | 0.20 | 0.80 |

For example, in Table 4.13, P(High Voltage = A | Electricity = NA and Transformer = W) = 0.2.

4.3.3.2 Observable processes

Observable processes are the nodes that result from reading the measurements of the hidden processes. They therefore give insight into the reliability and efficiency of the hidden process to which they are directly connected. The model has eight observable nodes. Each of them is the child of a measuring device on one side and of a hidden process on the other. The states of the observable nodes are defined as follows:

For temperature readings (Temp), there are three states (Hot, Medium, and Cold).
For pressure readings (Press), there are three states (High, Medium, and Low).
For tachometer readings (RPMM), there are two states (Fast, Slow).
For voltage readings, there are two states (High, Low).

The conditional probabilities of the observable nodes given the states of their parents are given in Tables 4.14-4.21.

Table 4.14: Conditional probabilities of temp1

| Temperature gauge 1 | NW | NW | W | W |
|---|---|---|---|---|
| Cold water | NA | A | NA | A |
| Hot | 0.05 | 0.00 | 0.05 | 0.00 |
| Medium | 0.95 | 0.10 | 0.95 | 0.05 |
| Cold | 0.00 | 0.90 | 0.00 | 0.95 |

Table 4.15: Conditional probabilities of temp2

| Heat | NA | NA | A | A |
|---|---|---|---|---|
| Temperature gauge 2 | NW | W | NW | W |
| Cold | 1.00 | 0.90 | 0.80 | 0.00 |
| Medium | 0.00 | 0.10 | 0.20 | 0.30 |
| Hot | 0.00 | 0.00 | 0.00 | 0.70 |

Table 4.16: Conditional probabilities of press1

| Pressurized water | NA | NA | A | A |
|---|---|---|---|---|
| Pressure gauge 1 | NW | W | NW | W |
| Low | 0.80 | 0.65 | 0.65 | 0.05 |
| Medium | 0.15 | 0.30 | 0.30 | 0.05 |
| High | 0.05 | 0.05 | 0.05 | 0.90 |

Table 4.17: Conditional probabilities of press2

| Pressurized water | NA | NA | A | A |
|---|---|---|---|---|
| Pressure gauge 2 | NW | W | NW | W |
| Low | 0.80 | 0.65 | 0.65 | 0.05 |
| Medium | 0.15 | 0.30 | 0.30 | 0.05 |
| High | 0.05 | 0.05 | 0.05 | 0.90 |

Table 4.18: Conditional probabilities of press3

| Pressure drop | NA | NA | A | A |
|---|---|---|---|---|
| Pressure gauge 3 | NW | W | NW | W |
| High | 0.05 | 0.05 | 0.05 | 0.05 |
| Medium | 0.15 | 0.30 | 0.30 | 0.05 |
| Low | 0.80 | 0.65 | 0.65 | 0.90 |

Table 4.19: Conditional probabilities of RPMM1

| RPM meter 1 | NW | NW | W | W |
|---|---|---|---|---|
| Plate rotation | NA | A | NA | A |
| Slow | 1.00 | 0.95 | 1.00 | 0.05 |
| Fast | 0.00 | 0.05 | 0.00 | 0.95 |

Table 4.20: Conditional probabilities of RPMM2

| RPM meter 2 | NW | NW | W | W |
|---|---|---|---|---|
| Turbine rotation | NA | A | NA | A |
| Slow | 1.00 | 0.80 | 0.90 | 0.00 |
| Fast | 0.00 | 0.20 | 0.10 | 1.00 |

Table 4.21: Conditional probabilities of voltage

| High voltage | NA | NA | A | A |
|---|---|---|---|---|
| Volt meter | NW | W | NW | W |
| Low | 1.00 | 0.80 | 0.80 | 0.00 |
| High | 0.00 | 0.20 | 0.20 | 1.00 |

4.3.4 Transition probabilities

After the components and their conditional probabilities are defined, the model is built using BNT (Murphy 2002). BNT is a toolbox based on Bayes theory that works in the MATLAB environment. The model is first coded in its static form, as shown in Figure 4.5, and verified using the GENIE package. The dynamic model is then built using BNT and verified with GENIE again. The transition probabilities of the aging components are determined from the mean time to failure (MTTF) under the assumption of an exponential distribution, as shown in Table 4.22.
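As a minimal sketch of the exponential aging assumption just described, the one-period transition probabilities can be computed from the MTTF as follows. The function name is hypothetical; the MTTF values are those listed for the main components in Table 4.22, and the relations used (failure rate 1/MTTF, one-period survival exp(-rate)) are the standard exponential-lifetime formulas.

```python
import math


def transition_probabilities(mttf: float) -> tuple[float, float, float]:
    """Return (failure rate, P(stay working), P(age/fail)) for one time period."""
    rate = 1.0 / mttf                 # exponential failure rate, lambda = 1 / MTTF
    stay_working = math.exp(-rate)    # reliability after one period
    return rate, stay_working, 1.0 - stay_working


# MTTF values (in periods) of the main replaceable components, from Table 4.22.
MTTF = {"pump": 15, "condenser": 18, "generator": 25, "boiler": 22, "transformer": 14}

for name, mttf in MTTF.items():
    rate, working, aging = transition_probabilities(mttf)
    print(f"{name:12s} lambda={rate:.6f} working={working:.4f} aging={aging:.4f}")
```

Rounded to four decimals, these values agree with the WORKING and AGING columns of Table 4.22.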
The equations used to determine the reliability of the aging components are as follows:

$$\lambda = \frac{1}{\mathrm{MTTF}} \qquad (4.1)$$

Reliability after 1 time period is given by:

$$\text{Reliability after 1 period} = e^{-\lambda} \qquad (4.2)$$

$$\text{Reliability decrement due to aging} = 1 - e^{-\lambda} \qquad (4.3)$$

Table 4.22: Transition probabilities of aging components

| Component | MTTF | λ | Working | Aging |
|---|---|---|---|---|
| Pump | 15 | 0.066667 | 0.9355 | 0.0645 |
| Condenser | 18 | 0.055556 | 0.946 | 0.054 |
| Generator | 25 | 0.04 | 0.9608 | 0.0392 |
| Boiler | 22 | 0.045455 | 0.9556 | 0.0444 |
| Transformer | 14 | 0.071429 | 0.9311 | 0.0689 |
| TEMPG1 | 10 | 0.1 | 0.9048 | 0.0952 |
| TEMPG2 | 10 | 0.1 | 0.9048 | 0.0952 |
| Pressure gauge 1 | 8 | 0.125 | 0.8825 | 0.1175 |
| Pressure gauge 2 | 8 | 0.125 | 0.8825 | 0.1175 |
| Pressure gauge 3 | 10 | 0.1 | 0.9048 | 0.0952 |
| RPM1 | 12 | 0.083333 | 0.92 | 0.08 |
| RPM2 | 12 | 0.083333 | 0.92 | 0.08 |
| Voltmeter | 5 | 0.2 | 0.8187 | 0.1813 |

In the model it is assumed that only the main components and the gauges age, with the degradation levels given in Table 4.22. MTTF values are collected from different resources such as manufacturer data. After the temporal relations of the aging components are constructed, the dynamic Bayesian network of the model is built. The DBN of the model built with GENIE is given in Figure 4.8, where the nodes with circled arcs are the aging components.

4.3.5 Independent variables

All gauge nodes and their corresponding hidden process nodes are independent from each other. Take for instance High voltage and Volt meter. Given evidence on High voltage, the working probability of Volt meter is not affected. Similarly, given evidence on Volt meter, the availability probability of High voltage is not affected. To show this, an inference test is applied to the subsection of the model consisting of the nodes High voltage, Volt meter and Voltage. The test is applied with different evidences and the change in the nodes is observed. The results are shown in Table 4.23. Given that High voltage is available, the working probability of Volt meter is 0.9, and given that High voltage is not available, it is still 0.9, as in the no evidence case. Similarly, the availability of High voltage given evidence on Volt meter is 0.8, again the same as in the no evidence case. The Voltage column follows from Table 4.21; for instance, with no evidence, P(V = High) = 0.2·0.9·0.2 + 0.8·0.1·0.2 + 0.8·0.9·1.0 = 0.772. Once the voltage reading itself is given as evidence, both probabilities change, as the last two rows show.

Table 4.23: Evaluating the working probability of variables under various evidences

| Evidence | P(HV = "A") | P(VM = "W") | P(V = "High") |
|---|---|---|---|
| - | 0.8 | 0.9 | 0.772 |
| HV = "A" | - | 0.9 | 0.92 |
| HV = "NA" | - | 0.9 | 0.18 |
| VM = "W" | 0.8 | - | 0.84 |
| VM = "NW" | 0.8 | - | 0.16 |
| V = "High" | 0.953 | 0.979 | - |
| V = "Low" | 0.281 | 0.632 | - |

4.3.6 Domain of variables

The domain of a variable in the model depends on the type of the node. Components are either working (W) or not working (NW). Hidden processes are either available (A) or not available (NA). The domains slow/fast, high/medium/low and cold/medium/hot belong to the observable processes. The domains of all variables are shown in Table 4.24.

Table 4.24: Domain of variables

| Variable | Domain |
|---|---|
| Pump | W, NW |
| Condenser | W, NW |
| Generator | W, NW |
| Boiler | W, NW |
| Transformer | W, NW |
| Temperature gauge 1 | W, NW |
| Temperature gauge 2 | W, NW |
| Pressure gauge 1 | W, NW |
| Pressure gauge 2 | W, NW |
| Pressure gauge 3 | W, NW |
| RPM1 | W, NW |
| RPM2 | W, NW |
| Volt meter | W, NW |
| Cold water | A, NA |
| Plate rotation | A, NA |
| Pressurize water | A, NA |
| Ignition | A, NA |
| Heat | A, NA |
| Pressurized steam | A, NA |
| Turbine rotation | A, NA |
| Pressure drop | A, NA |
| Electricity | A, NA |
| High voltage | A, NA |
| Temp 1 | Hot, Medium, Cold |
| Temp 2 | Cold, Medium, Hot |
| Press 1 | Low, Medium, High |
| Press 2 | Low, Medium, High |
| Press 3 | High, Medium, Low |
| RPMM1 | Slow, Fast |
| RPMM2 | Slow, Fast |
| Voltage | Low, High |
| Fuel | A, NA |
| Pump electricity | A, NA |
| Turbine | W, NW |

Figure 4.8: Dynamic Bayesian Network for thermal power plant

5. METHODOLOGY

Two algorithms are proposed in this thesis. The first is the "Single Critical Process" algorithm (SCP) and the second is the "Multiple Critical Processes" algorithm (MCP). The two algorithms differ in how degradation is detected and in how the replaceable components are selected. The algorithms are based on collecting a limited number of observations at each period.
They then check whether or not to repair, based on the reliabilities of predefined critical processes. If the reliability of a critical process falls under a predefined threshold, the algorithms decide on the repair action needed. The outline of the algorithms is given in Figure 5.1. The first step is to simulate the observations by generating a random number and comparing it to the cumulative probability of the observable nodes. If the simulated value is undesirable, the related gauge is checked/repaired. After the gauge check/repair, the observations are simulated again. Using these observations, the reliability of the critical process is inferred. If the reliability falls below the threshold, repairs are done, the observations are simulated again, and the reliability of the critical process is inferred again. Otherwise, if the reliability is not below the threshold, the algorithm continues with the next period.

Figure 5.1: Outline of the algorithms

5.1 ONE CRITICAL PROCESS ALGORITHM

The single critical process algorithm (SCP) uses one process as the critical process. Main repair decisions are taken according to this process by selecting the component with the minimum working reliability. The critical process here is the high voltage, which is the last process in the model. The decisions to be taken are those related to main component repairs.

5.1.1 Notation of SCP

The notation used in building the SCP model is given below.

Sets and Indices
I: observable node set, I = {temp1, temp2, press1, press2, press3, RPMM1, RPMM2, voltage}.
J: gauge set, J = {Temperature Gauge 1, Temperature Gauge 2, Pressure Gauge 1, Pressure Gauge 2, Pressure Gauge 3, RPM1, RPM2, Volt Meter}.
K: replaceable main component set, K = {pump, condenser, boiler, generator, transformer}.
i: observable node, $i \in I$.
j: gauge component, $j \in J$.
k: replaceable main component, $k \in K$.
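The observation-simulation step of the outline above (draw a random number and compare it with the cumulative probability of the observable node's states) can be sketched as follows. This is a minimal illustration, not the thesis code: the function name, the seed and the set `DESIRABLE` are assumptions, and the voltage distribution uses the no-evidence value P(V = High) = 0.772 reported in Table 4.23.

```python
import random

# Readings accepted without a gauge check (the "fast/high/hot" rule of the algorithms).
DESIRABLE = {"Fast", "High", "Hot"}


def simulate_observation(state_probs, rng):
    """Inverse-CDF sampling: return the first state whose cumulative
    probability exceeds a uniform random number u in [0, 1)."""
    u = rng.random()
    cumulative = 0.0
    chosen = None
    for state, prob in state_probs.items():
        chosen = state
        cumulative += prob
        if u < cumulative:
            break
    # If rounding keeps the cumulative sum slightly below 1, the last state is returned.
    return chosen


rng = random.Random(2013)                        # arbitrary fixed seed for repeatability
voltage_probs = {"Low": 0.228, "High": 0.772}    # no-evidence distribution of Voltage
reading = simulate_observation(voltage_probs, rng)
needs_gauge_check = reading not in DESIRABLE     # an undesirable value triggers a gauge check
print(reading, needs_gauge_check)
```

In the algorithms the state probabilities would come from inference on the DBN given the current evidence rather than from a fixed dictionary.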
Variables
$o_{it}$: state of observable node i in time t.
$g_{jt}$: state of gauge component j in time t.
$c_t$: state of the critical process in time t.
$x_{kt}$: state of main component k in time t.
$e$: evidence list.
$EM_{kt}$: efficiency measure of component k in period t.

Parameters
Lc: threshold reliability for the critical process node.
T: planning horizon.

There is a one-to-one relation between observable nodes and gauges, which is given in Table 5.1.

Table 5.1: Observable nodes and their related gauges

| Observable node i | Gauge j |
|---|---|
| Temp1 | Temperature gauge 1 |
| Temp2 | Temperature gauge 2 |
| Press 1 | Pressure gauge 1 |
| Press 2 | Pressure gauge 2 |
| Press 3 | Pressure gauge 3 |
| RPMM1 | RPM1 |
| RPMM2 | RPM2 |
| Voltage | Volt meter |

5.1.2 Algorithm of SCP

The SCP algorithm works in the following steps:
1- Initialize t = 1.
2- Simulate observations $o_{it}$ for each observable node $i \in I$.
   A. If the simulated value is fast/high/hot, then continue with (3).
   B. Else, replace the related gauge j and update the evidence.
   C. Simulate the observable node i using the updated evidence.
3- Update the evidence using the simulated observation.
4- Infer the critical process reliability $P(c_t = A \mid e)$.
5- If $P(c_t = A \mid e) \ge L_c$, then continue with (7).
6- Else, do the following:
   a- Initialize the remaining main component list.
   b- Calculate $EM_{kt}$.
   c- Select the component to replace with min $EM_{kt}$.
   d- Update the evidence.
   e- Set the evidence of the observable nodes free.
   f- Simulate observations for each observable node i.
   g- Update the evidence using the simulated observations.
   h- Update the remaining main component list.
   i- Infer the critical process reliability $P(c_t = A \mid e)$.
   j- If $P(c_t = A \mid e) < L_c$, then continue with (6.b).
7- Increase t, $t = t + 1$.
8- If $t > T$, then stop.
9- Else, continue with (2).

Let $EM_{kt}$ be the efficiency measure of component k in period t when a repair is planned. It is the reliability of component k in period t. FEM is the fault effect myopic approach and FEL is the fault effect look-ahead approach (Özgür-Ünlüakın and Bilgiç 121). Namely, it is calculated as:

$$EM_{kt}^{SCP} = P(x_{kt} = W \mid e)$$

$$EM_{kt}^{FEM} = P(x_{kt} = W \mid c_t = NA,\ e)$$

$$EM_{kt}^{FEL} = P(x_{kt} = W \mid c_t = NA, \ldots, c_T = NA,\ e)$$

The difference between these approaches is the selection of the replaceable component. In SCP, the critical process reliability is evaluated and, if it falls below the threshold, the reliabilities of the replaceable components are evaluated and the one with the minimum reliability is selected. In FEM, the probability of the critical process is evaluated and, if it falls below the threshold, the critical process is entered into the evidence as not available ("NA") in that period. Its effect on the replaceable components is then evaluated, and the component with the minimum reliability is selected. In FEL, the reliability of the critical process is evaluated and, if it falls below the threshold, the critical process is entered into the evidence as not available ("NA") from that period up to the end of the planning horizon. Its effect on the replaceable components is then evaluated, and the one with the minimum reliability is selected.

5.2 MULTIPLE CRITICAL PROCESSES ALGORITHM

The multiple critical processes (MCP) logic divides the system into several critical processes whose outputs are the main outputs of the system. Each critical process depends on a predefined number of main replaceable components. At each time period the reliabilities of the critical processes are evaluated. The algorithm takes the critical process with the minimum reliability and checks whether it falls below the threshold. If it does, the repair decision is made according to the minimum working reliability of the replaceable components.

In this model three critical processes are defined: pressurized water, pressurized steam and high voltage. Pressurized water is the descendant process of the pump and the condenser. If the working reliability of pressurized water is not in the acceptable range, the problem can come from either the pump or the condenser. The pressurized steam process is the descendant of the boiler, the pump and the condenser.
But since the pump and the condenser are already covered by pressurized water, any problem in the working reliability of pressurized steam is directly assigned to the boiler. Finally, high voltage is the descendant of all main replaceable components (pump, condenser, boiler, generator and transformer). Since the pump and the condenser are covered by pressurized water and the boiler is covered by pressurized steam, only the generator and the transformer are assigned to this process. The groups of MCP are shown in Figure 5.2.

Figure 5.2: Groups of MCP critical nodes (Pressurized water group, Pressurized steam group, High voltage group)

5.2.1 Notation of MCP

The notation and parameters used in the MCP model are given below.

Sets and Indices
I: observable node set, I = {temp1, temp2, press1, press2, press3, RPMM1, RPMM2, voltage}.
J: gauge set, J = {Temperature Gauge 1, Temperature Gauge 2, Pressure Gauge 1, Pressure Gauge 2, Pressure Gauge 3, RPM1, RPM2, Volt Meter}.
$K_h$: replaceable main components related to critical process h, drawn from {pump, condenser, boiler, generator, transformer}.
H: critical process set, H = {Pressurized water, Pressurized steam, High voltage}.
$K_{\text{Pressurized water}}$ = {pump, condenser}.
$K_{\text{Pressurized steam}}$ = {boiler}.
$K_{\text{High voltage}}$ = {generator, transformer}.
i: observable node, $i \in I$.
j: gauge component, $j \in J$.
k: replaceable main component, $k \in K_h$.
h: critical process, $h \in H$.

Variables
$o_{it}$: state of observable node i in time t.
$g_{jt}$: state of gauge component j in time t.
$c_{ht}$: state of critical process h in time t.
$x_{kt}$: state of main component k in time t.
$e$: evidence list.

Parameters
Lc: threshold reliability for the critical process node.
Lr: threshold reliability for the replaceable component.
T: planning horizon.

5.2.2 Algorithm of MCP

The MCP algorithm works in the following steps:
1- Initialize t = 1.
2- Simulate observations $o_{it}$ for each observable node i.
   a. If the simulated value is fast/high/hot, then continue with (3).
   b. Else, replace the related gauge j and update the evidence.
   c. Simulate the observable node i using the updated evidence.
3- Update the evidence using the simulated observation.
4- Initialize the remaining critical process list.
5- Infer the critical process reliability $P(c_{ht} = A \mid e)$ for each critical process h.
6- Select the process with the minimum reliability, $\min_h P(c_{ht} = A \mid e)$.
7- If $\min_h P(c_{ht} = A \mid e) \ge L_c$, then continue with (11).
8- Else, do the following:
   a- Initialize the remaining main component list.
   b- Calculate the reliability of the related main components, $P(x_{kt} = W \mid e)$.
   c- Select the component with the minimum reliability, $\min_k P(x_{kt} = W \mid e)$.
   d- Update the evidence.
   e- Set the evidence of the observable nodes free.
   f- Simulate observations for each observable node i.
   g- Update the evidence using the simulated observations.
   h- Update the remaining main component list.
   i- Infer the critical process reliability $P(c_{ht} = A \mid e)$.
   j- If $P(c_{ht} = A \mid e) < L_c$, then continue with (8.b).
9- Else, update the remaining critical process list.
10- Continue with (5).
11- Increase t, $t = t + 1$.
12- If $t > T$, then stop.
13- Else, continue with (2).

6. RESULTS AND EVALUATION

The proposed approaches are executed in the Matlab environment using the BNT toolbox developed by Murphy. Results are collected for each approach. Each of SCP, FEM and FEL is executed for 20 replications, and each replication covers a planning horizon of 50 periods. Each approach is evaluated for two different thresholds.

6.1 DESIGN OF EXPERIMENTS

The problem at hand is to minimize the total number of main replaceable component repairs over a predefined planning horizon in thermal power plants, in order to develop a proactive maintenance schedule. The planning horizon is determined such that each main replaceable component is replaced at least once. It is therefore chosen to be 50 years, which is a reasonable length for long run maintenance planning of thermal power plants. Each period is taken as 1 year, which is enough to see the immediate effect of main component repairs and gauge checks/repairs. The threshold reliability, which is set as a constraint to be satisfied by the critical processes, is initially taken to be 0.5.
We would like to compare the methods with each other statistically. For each method we run 20 replications, each covering a planning horizon of 50 years. In the comparison we check the total number of main replaceable component repairs in each replication and the total number of gauge checks/repairs in each replication. We then make a sensitivity analysis for each method by evaluating it with a threshold of 0.75.

Comprehensive evaluation data for each method with a threshold of 0.5 and 20 replications are given in Tables 6.1-6.4. In these tables the total number of main replaceable component repairs and the total number of gauge checks/repairs are given together with their mean and standard deviation (σ).

In the tables below, the seventh row shows the total number of repairs of main replaceable components in each replication. The sixteenth row shows the total number of repairs of gauges in each replication. The eighteenth row shows the total repairs done in each replication (main replaceable components and gauges). The nineteenth and twentieth rows show the time elapsed in each replication in seconds and minutes respectively. The 22nd column shows the mean of the 20 replications for each row. The 23rd column shows the standard deviation of each row.

The total number of repairs of main replaceable components and of gauges differs considerably between replications because the observations are simulated according to the random number generation procedure. That is why the mean and standard deviation of the 20 replications give us a general idea about the performances of the methods. According to the mean total number of repairs of main replaceable components given in Tables 6.1-6.3, it may be concluded that SCP and FEM do fewer repairs than FEL. For the mean total number of repairs of gauges, it may be concluded that SCP does fewer total gauge checks/repairs than the other two.
But a statistical analysis is needed in order to say whether these differences are significant or not. That is why we perform a one-way ANOVA test for main replaceable components and for gauges in the next section.

The MCP model does many more repairs than the other methods. MCP keeps the reliability of all critical processes over the threshold, so repairs take place as soon as any critical process falls below the threshold.

Table 6.1: SCP evaluation data at 0.5 threshold
Table 6.2: FEM evaluation data at 0.5 threshold
Table 6.3: FEL evaluation data at 0.5 threshold
Table 6.4: MCP evaluation data at 0.5 threshold

Comprehensive evaluation data for each method with a threshold of 0.75 are given in Tables 6.5-6.8. In these tables the total number of main replaceable component repairs and the total number of gauge checks/repairs are given together with their mean and standard deviation (σ).

Table 6.5: SCP evaluation data at 0.75 threshold
Table 6.6: FEM evaluation data at 0.75 threshold
Table 6.7: FEL evaluation data at 0.75 threshold
Table 6.8: MCP evaluation data at 0.75 threshold

6.2 ANALYSIS OF EXPERIMENTS

After the data evaluation of the approaches, they are compared using ANOVA. The test is applied to SCP, FEM and FEL, since all of these approaches depend on only one critical process to decide on repairs. The tests are done at thresholds of 0.5 and 0.75 for the total number of main replaceable component repairs in each replication. The tests are also done on the total number of gauge checks/repairs in each replication at thresholds of 0.5 and 0.75. A summary of the data evaluated for each method is given in Table 6.9. The ANOVA test results are given in Tables 6.10-6.13.
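As a minimal sketch of how a one-way ANOVA F statistic is obtained, the code below computes it both from sums of squares and from raw groups of replication totals. The function names and the small data set are hypothetical; the first call uses the sums of squares and degrees of freedom reported in Table 6.10 for the main replaceable components at a threshold of 0.5. The P-value is not computed here, since that requires the F distribution from a statistics package.

```python
def f_statistic(ss_between, df_between, ss_within, df_within):
    """F = MS_between / MS_within, with MS = SS / DF."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within


def one_way_anova(groups):
    """F statistic of a one-way ANOVA from raw observations, one list per method."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return f_statistic(ss_between, df_between, ss_within, df_within)


# Values reported in Table 6.10 (METHOD and ERROR rows).
f_table_6_10 = f_statistic(ss_between=4.03, df_between=2, ss_within=473.70, df_within=57)
print(round(f_table_6_10, 2))

# Hypothetical repair totals for three methods, three replications each.
f_toy = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_toy)
```

An F value well below 1, as in the first call, means the variation between methods is smaller than the variation between replications of the same method.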
Since the P-values in all one factor ANOVA tables are so high, we can say that the performances of the methods, measured by the total number of repairs of main replaceable components and of gauges, are not significantly different from each other at the selected threshold values of 0.5 and 0.75.

The multiple critical processes algorithm differs in structure from the single critical process algorithm. In each period it ensures that three critical processes function over the threshold, while the single critical process algorithm, which is performed by the SCP, FEM and FEL approaches, checks only one critical process. So it is not reasonable to compare the MCP method with the SCP, FEM and FEL methods. That is why we treat the MCP method separately. A summary of the MCP data evaluation for 10 replications at thresholds 0.5 and 0.75 is given in Table 6.14. According to Table 6.14, the mean number of repairs of main replaceable components in the MCP method is 11.8, which is clearly greater than in the single process algorithms. When the threshold is increased to 0.75, this number increases further to 26, which is again clearly higher than the corresponding results of the single critical process algorithms.

Table 6.9: Summary of data evaluated by SCP, FEM and FEL at threshold of 0.5 and 0.75

Table 6.10: ANOVA test results for main replaceable components at threshold of 0.5

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Method | 2 | 4.03 | 2.02 | 0.24 | 0.785 |
| Error | 57 | 473.70 | 8.31 | | |
| Total | 59 | 477.73 | | | |

Table 6.11: ANOVA test results for main replaceable components at threshold of 0.75

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Method | 2 | 0.6 | 0.3 | 0.03 | 0.968 |
| Error | 27 | 248.2 | 9.19 | | |
| Total | 29 | 248.8 | | | |

Table 6.12: ANOVA test results for gauges at threshold of 0.5

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Method | 2 | 139 | 69 | 0.55 | 0.581 |
| Error | 57 | 7214 | 127 | | |
| Total | 59 | 7353 | | | |

Table 6.13: ANOVA test results for gauges at threshold of 0.75

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Method | 2 | 23 | 11 | 0.11 | 0.898 |
| Error | 57 | 5948 | 104 | | |
| Total | 59 | 5971 | | | |

Table 6.14: Summary of MCP data evaluation at threshold 0.5 and 0.75

6.3 PREDICTED MAINTENANCE PLAN

Tables 6.15-6.17 show the predicted average repairs for each main replaceable component and each gauge. The average is collected from 20 repetitions for each method. Table 6.15 shows the predicted average repairs for SCP, FEM and FEL at a threshold of 0.5. Table 6.16 shows the predicted average repairs for SCP, FEM and FEL at a threshold of 0.75. Table 6.17 shows the predicted repairs for MCP at thresholds of 0.5 and 0.75 for 10 repetitions.

Table 6.15: Predicted maintenance schedule using SCP, FEM, FEL at threshold of 0.5

| Component | SCP | FEM | FEL |
|---|---|---|---|
| Pump | 4.05 | 3.45 | 4.15 |
| Condenser | 0.05 | 0.10 | 0.20 |
| Generator | 0.30 | 0.45 | 0.50 |
| Boiler | 0.55 | 0.75 | 0.40 |
| Transformer | 1.60 | 1.80 | 1.85 |
| Temperature gauge 1 | 8.75 | 8.60 | 9.70 |
| Temperature gauge 2 | 27.70 | 28.40 | 28.15 |
| Pressure gauge 1 | 28.25 | 29.45 | 28.50 |
| Pressure gauge 2 | 31.25 | 31.35 | 31.80 |
| Pressure gauge 3 | 8.30 | 8.05 | 8.25 |
| RPM1 | 38.75 | 39.25 | 39.70 |
| RPM2 | 6.90 | 7.30 | 7.20 |
| Volt meter | 18.45 | 19.15 | 18.30 |

Table 6.16: Predicted maintenance schedule using SCP, FEM, FEL at threshold of 0.75

| Component | SCP | FEM | FEL |
|---|---|---|---|
| Pump | 4.9 | 4.5 | 4.3 |
| Condenser | 0.1 | 0.1 | 0.1 |
| Generator | 0.3 | 0.2 | 0.7 |
| Boiler | 0.7 | 0.8 | 0.3 |
| Transformer | 1.8 | 2.2 | 2.4 |
| Temperature gauge 1 | 8.7 | 8.7 | 8.8 |
| Temperature gauge 2 | 27.2 | 27.7 | 26.0 |
| Pressure gauge 1 | 29.7 | 28.6 | 28.5 |
| Pressure gauge 2 | 29.6 | 31.4 | 33.2 |
| Pressure gauge 3 | 8.6 | 8.1 | 8.7 |
| RPM1 | 39.4 | 39.1 | 39.6 |
| RPM2 | 6.8 | 6.9 | 6.9 |
| Volt meter | 19.4 | 19.0 | 19.0 |

Table 6.17: Predicted maintenance schedule using MCP at thresholds of 0.5 and 0.75

| Component | 0.5 threshold | 0.75 threshold |
|---|---|---|
| Pump | 4.10 | 8.00 |
| Condenser | 0.50 | 1.20 |
| Generator | 0.40 | 1.40 |
| Boiler | 5.20 | 12.90 |
| Transformer | 1.60 | 2.50 |
| Temperature gauge 1 | 8.90 | 11.00 |
| Temperature gauge 2 | 22.10 | 24.20 |
| Pressure gauge 1 | 28.00 | 28.90 |
| Pressure gauge 2 | 31.20 | 31.00 |
| Pressure gauge 3 | 7.90 | 8.00 |
| RPM1 | 42.10 | 37.10 |
| RPM2 | 7.00 | 9.30 |
| Volt meter | 19.50 | 20.50 |

7. CONCLUSION

Thermal power plants are very complex systems involving many interrelated components and subsystems. Developing maintenance plans for such systems is also a sophisticated procedure. In the literature, preventive and reactive maintenance methods are generally proposed for the maintenance scheduling of thermal power plants. There are also some studies using the RCM approach, which are applied only to subsystems or components of thermal power plants such as the turbine, the boiler, etc. This study differs from the literature in that proactive maintenance of the whole thermal power plant system is modeled with all of the necessary subsystems and components.

In this study we propose methodologies (SCP and MCP) to develop a proactive maintenance schedule for thermal power plants, which aim to minimize the total number of repairs. The methodologies are modeled using DBN and coded in Matlab using the BNT toolbox. The reliabilities of the critical processes are inferred. Repair decisions are then made by comparing the inferred critical process reliability with a predefined threshold value. We compare the performances of these methods with other component selection approaches (FEM and FEL) available in the literature. The results show that SCP and FEM do fewer repairs than FEL, while MCP does many more repairs than the other methodologies. We then compare the results of SCP, FEM and FEL statistically using ANOVA tests. Since the P-values in the one factor ANOVA tables are so high, we conclude that the performances of the methods are not significantly different from each other. The ANOVA test is done on the total number of main component repairs and on the total number of gauge checks/repairs.

In this work, minimizing the total number of repairs is the main objective. However, repair costs can be added to the model in a future study in order to minimize the total repair cost.
In the MCP model, the total number of repairs is higher than in the other methods. As future work, a threshold reliability for the replaceable components could also be added to the MCP model, so that the reliability of the main components is checked after the critical processes are checked. In this way, the performance of the method could be improved. A few other component selection approaches have been proposed in the literature; these could also be included in the performance comparison.

REFERENCES

Books

Barlow, R. E. & Proschan, F., 1965. Mathematical Theory of Reliability. New York: John Wiley.
Smith, A. M., 1992. Reliability Centered Maintenance.
Dhillon, B. S., 2002. Engineering Maintenance: A Modern Approach. 1st ed. London, Washington D.C., New York: CRC Press.
Kececioglu, D., 2002. Reliability Engineering Handbook, Vol. 1.
Flynn, D., 2003. Thermal Power Plant Simulation and Control. London, United Kingdom: Institution of Engineering and Technology.
Korb, K. B. & Nicholson, A. E., 2004. Bayesian Artificial Intelligence. London: Chapman & Hall/CRC Press UK.
Koller, D. & Friedman, N., 2009. Probabilistic Graphical Models: Principles and Techniques.
Saadat, H., 2011. Power System Analysis. 3rd ed.
Ushakov, I., 2012. Probabilistic Reliability Models. s.l.: John Wiley & Sons Inc.

Periodicals

Pearl, J., 1985. Bayesian networks: a model of self-activated memory for evidential reasoning. 7th Conference of the Cognitive Science Society.
Chockie & Bjorkelo, K., 1992. Effective maintenance practices to manage system aging. Annual Reliability and Maintainability Symposium, Las Vegas, Nevada.
Gits, C. W., 1992. Design of Maintenance Concepts. International Journal of Production Economics, pp. 217-226.
Heckerman, D., Breese, J. S. & Rommelse, K., 1995. Troubleshooting under uncertainty.
Akash, B. A., Mamlook, R. & Mohsen, M. S., 1999. Multi-criteria selection of electric power plants using analytical hierarchy process. Electric Power Systems Research 52, pp. 29-35.
D'Ambrosio, B., 1999. Inference in Bayesian networks. AI Magazine, Volume 20.
Ballzus, C., Frimannson, H., Gunnarsson, G. I. & Hrolfsson, I., 2000. The Geothermal Power Plant at Nesjavellir, Iceland. Proceedings World Geothermal Congress 2000, Kyushu - Tohoku, Japan, May 28 - June 10, 2000, pp. 3109-3114.
Lapa, C. M., Pereira, C. M. & Mol, A. C. d. A., 2000. Maximization of a nuclear system availability through maintenance scheduling optimization using genetic algorithm. Nuclear Engineering and Design, Volume 194, pp. 219-231.
Kuenzer, A. et al., 2001. An Empirical Study of Dynamic Bayesian Networks for User Modeling.
Sergaki, A. & Kalaitzakis, K., 2002. A fuzzy knowledge based method for maintenance planning in power system. Reliability Engineering and System Safety 77, pp. 19-30.
Miller, S. et al., 2002. Development and Implementation of a Reliability Centered Maintenance Program.
Weber, P. & Jouffe, L., 2003. Reliability Modeling with Dynamic Bayesian Networks.
Weber, P., Munteanu, P. & Jouffe, L., 2004. Dynamic Bayesian Networks modeling the dependability of systems with degradations and exogenous constraints. 11th IFAC Symposium on Information Control Problems in Manufacturing (INCOM'04), Salvador, Brazil.
Koca, E. & Bilgic, T., 2004. Troubleshooting Approach with Dependent Actions. ECAI, 16th European Conference on Artificial Intelligence, Valencia, Spain, Volume 110, pp. 1043-1044.
Bai, C. C., 2005. Bayesian Networks based software reliability prediction with an operational profile. The Journal of Systems and Software, Vol. 77, pp. 103-112.
Krishnasamy, L., Khan, F. & Haddara, M., 2005. Development of a risk based maintenance (RBM) strategy for a power generation plant. Journal of Loss Prevention in the Process Industries 18, pp. 69-81.
Kryszczuk, K., Richiardi, J., Prodanov, P. & Drygajlo, A., 2005. Error Handling in Multimodal Biometric System Using Reliability Measures.
Milutinović, D. & Lučanin, V., 2005. Relation between Reliability and Availability of railway vehicle. FME Transactions.
Richiardi, J., Prodanov, P. & Drygajlo, A., 2005. A probabilistic measure of modality reliability in speaker verification.
Montani, S., Portinale, L. & Bobbio, A., 2005. Dynamic Bayesian Networks for Modeling Advanced Fault Tree Features in Dependability Analysis.
Weber, P. & Jouffe, L., 2006. Complex System Reliability Modeling with Dynamic Object Oriented Bayesian Networks (DOOBN). Reliability Engineering & System Safety 91/2, pp. 149-162.
Majeed, A. R. & Sadiq, N. M., 2006. Availability & Reliability Evaluation of Dokan Hydro Power Station. IEEE PES Transmission & Distribution Conference & Exposition: Latin America, Venezuela.
BenSalem, A., Muller, A. & Weber, P., 2006. Dynamic Bayesian Networks in System Reliability Analysis.
Boudali, H. & Dugan, J. B., 2006. A continuous-time Bayesian network reliability modeling and analysis framework. IEEE Transactions on Reliability, Vol. 55.
Schimon, R. et al., 2006, September 4th-5th. Simulation of Components of a Thermal Power Plant. The Modelica Association, pp. 119-125.
Özgür-Ünlüakın, D. & Bilgiç, T., 2006. Predictive maintenance using Dynamic Probabilistic Networks. Probabilistic Graphical Models, pp. 239-246.
Marquez, D., Neil, M. & Fenton, N., 2007. A new approach to reliability modeling.
Needham, C. J., Bradford, J. R., Bulpitt, A. J. & Westhead, D. R., 2007. Inference in Bayesian networks. Nature Biotechnology, Volume 2.
Akturk, M. S. & Gurel, S., 2007. Machining conditions-based preventive maintenance. International Journal of Production Research, Vol. 45, No. 8, pp. 1725-1743.
Eti, M. C., Ogaji, S. O. T. & Probert, S., 2007. Integrating reliability, availability, maintainability and supportability with risk analysis for improved operation of the AFAM thermal power plant. Applied Energy 84, pp. 202-221.
Özgür-Ünlüakın, D., 2008. Maintenance of Multi Component Dynamic System Under Partial Observations. Ph.D. Thesis, Boğaziçi University.
Neil, M., Hager, D. & Andersen, L. B., 2009. Modeling Operational Risk in Financial Institutions Using Hybrid Dynamic Networks.
Ibargüengoytia, P. H. & Flores, A. R. a. Z., 2009. Probabilistic Intelligent Systems for Thermal Power Plants.
Jesus, F., Carazas, G., Francisco, G. & Souza, M. D., March 2009. Availability Analysis of Gas Turbines Used in Power Plants. Int. J. of Thermodynamics, pp. 28-37.
Özgür-Ünlüakın, D. & Bilgiç, T., 2010. An aggregation and disaggregation procedure for the maintenance of dynamic system under partial observations. 5th European Workshop on Probabilistic Graphical Models.
Varuttamaseni, A., Lee, J. C. & Youngblood, R. W., 2011. Bayesian Network Representing System Dynamics in Risk Analysis of Nuclear Systems. International Topical Meeting on Probabilistic Safety Assessment and Analysis 2011, PSA 2011, pp. 547-558.
Casini, L., Illari, P. M., Russo, F. & Williamson, J., 2011. Models for Prediction, Explanation and Control: Recursive Bayesian Networks.
Obodeh, O. & Esabunor, T., August 2011. Reliability assessment of WRPC gas turbine power. Journal of Mechanical Engineering Research, Vol. 3(8), pp. 286-292.
Obodeh, O. & Isaac, F. O., 2011. Investigation of Performance Characteristics of Diesel Engine Fuelled with Diesel-Kerosene Blends. Journal of Emerging Trends in Engineering and Applied Sciences (JETEAS) 2 (2), pp. 318-322.
S. Gupta, P. C. T. K. S., 2011. Development of simulation model for performance evaluation. J. Ind. Eng. Int., 7 (12), pp. 1-9.
Singh, G. & Chauhan, D., 2011. Simulation and Modeling of Hydro Power Plant to Study Time Response during Different Gate States. International Journal of Advanced Engineering Sciences and Technologies, pp. 042-047.
Pearl, J., 2011. Bayesian networks. Department of Statistics Papers, Department of Statistics, UCLA, UC Los Angeles.
Rusin, A. & Wojaczek, A., 2012. Optimization of Power Machines Maintenance Intervals.
Özgür-Ünlüakın, D. & Bilgiç, T., 2012. Repair decisions for reliability centered maintenance. FLINS 2012, Istanbul.
Gupta, S. & Tewari, P. C., n.d. Simulation Model for Stochastic Analysis and Performance Evaluation of Condensate System of a Thermal Power Plant. Bangladesh Journal of Scientific and Industrial Research.

Other Sources

Oechslin, K., 1948. Thermal Power Plant.
Gomes, G. S. & Gudwin, R. R., n.d. An Intelligent System for the Predictive Maintenance of a Hydroelectric Power Plant.
Cotaina, N. et al., 2000. Study of existing Reliability Centered Maintenance (RCM) approaches used in different industries. Paris (France), Madrid (Spain).
ICJT, 2001. [Online] Available at: www.icjt.org/an/tech/jesvet/jesvet.htm [Accessed 25 11 2012].
Firat Development Agency, 2010. Report for Thermal Power Plant Investment in Derinçay Basin. Bingol: Firat Development Agency.
EÜAŞ (Electricity Generation Company), 2010. Annual Report. Turkey: EÜAŞ (Electricity Generation Company).
Anon., 2011. Thermal Power Plants. Centre for Science and Environment.
Ali, D. M. F., 24. Chapter 02: Nuclear Power Reactors – Components. [Online] Available at: http://intuitech.biz/nuclear-power-reactors-components/ [Accessed 25 11 2012].
Anon., 2012. Electrical Engineering Tutorials. [Online] Available at: http://www.powerelectricalblog.com/2007/04/nuclear-power-planttypes-advantages-and.html [Accessed November 2012].
Kabiruddin, M., 2009. Various Components of Power Plant.
GENIE, Graphical Network Interface. Available at: http://genie.sis.pitt.edu/
BNT, Bayes Net Toolbox. Available at: https://code.google.com/p/bnt/