Understanding Fault Tree Analysis

Introduction

A Fault Tree is a graphical representation of a collection of events and conditions that lead to an undesired event or system failure.

It is a powerful methodology that assesses the various causes and consequences that can lead to system failure. This blog aims to give readers an overview of Fault trees, explore their components, and highlight their significance in safety risk analyses.

Fault Tree Analysis

Before we start exploring Fault trees and their application, let’s take a short look at their history. Fault tree analysis originated in the aerospace and defence sector in the early 1960s and was later adopted widely in nuclear engineering. Under the leadership of H.A. Watson, Bell Laboratories was at the forefront of developing fault tree analysis to evaluate the safety of complex systems.

The motivation behind this innovation was to understand the causes of system failures and identify critical areas that required improvement. These techniques were later used by Boeing in the Minuteman missile programme (https://ieeexplore.ieee.org/document/4740894).

Components of a Fault Tree Analysis

Events: A fault tree is made up of events that represent either a basic or a higher-level occurrence within a system. Basic events signify the lowest-level causes of a failure, while higher-level events are a combination of basic and other intermediate events. Relationships between events are established through logical gates.
Logical Gates: Logical gates play a crucial role in fault tree analysis, as they define the relationships between events. The most common types of gates used in Fault trees are AND gates, OR gates, and NOT gates. An AND gate signifies that all of its input events must occur together for the gate's output event to occur. Conversely, an OR gate implies that at least one of the input events must occur. NOT gates represent negation, allowing engineers to model negative conditions.
Top Event: At the apex of a fault tree lies the top event, which represents the system failure being analysed. It is the outcome of multiple events and logic. Understanding the causes and probabilities associated with the top event is crucial for assessing system reliability.
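
The three components above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the class names, the `occurs` function and the pump and controller events are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class BasicEvent:
    """Lowest-level cause of a failure, e.g. a single component fault."""
    name: str


@dataclass
class Gate:
    """Intermediate or top event whose state is derived from its inputs."""
    name: str
    kind: str  # "AND", "OR" or "NOT"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


def occurs(node: Union[Gate, BasicEvent], state: Dict[str, bool]) -> bool:
    """Return True if the event at `node` occurs for the given basic-event states."""
    if isinstance(node, BasicEvent):
        return state[node.name]
    results = [occurs(child, state) for child in node.inputs]
    if node.kind == "AND":
        return all(results)   # every input must occur
    if node.kind == "OR":
        return any(results)   # at least one input must occur
    if node.kind == "NOT":
        if len(results) != 1:
            raise ValueError("a NOT gate takes exactly one input")
        return not results[0]
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical example: pressure is lost if both pumps fail or the controller fails.
both_pumps_fail = Gate("both_pumps_fail", "AND", [
    BasicEvent("primary_pump_fails"),
    BasicEvent("backup_pump_fails"),
])
top_event = Gate("loss_of_pressure", "OR", [
    both_pumps_fail,
    BasicEvent("controller_fails"),
])

one_pump_down = {
    "primary_pump_fails": True,
    "backup_pump_fails": False,
    "controller_fails": False,
}
print(occurs(top_event, one_pump_down))
```

With only the primary pump failed, the AND gate is not satisfied and the top event does not occur; setting `backup_pump_fails` or `controller_fails` to `True` as well triggers it.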

Figure 1 – Operator Symbols. Source: Ćatić et al. (2014), Fault tree analysis of hydraulic power-steering system. International Journal of Vehicle Design.

Approaches of Fault Tree Analysis

Fault trees can be created by inductive and deductive methods. We need to explore both approaches, examining their characteristics, applications, and advantages to gain a comprehensive understanding of their role in fault tree analysis.

Inductive Method in Fault Tree Analysis

The inductive method involves building a fault tree from available data and the in-service history of the components. It begins with the identification of basic events, which are the lowest-level causes of a system failure.

Data Collection: The first step in the inductive method is to collect relevant data regarding system failures. This data can come from various sources, such as historical records, incident reports, expert opinions, and simulations. The goal is to gather a comprehensive dataset that covers a wide range of failure scenarios and associated events.
Basic Event Identification: Once the data is collected, the engineer identifies the basic events by examining the data and extracting the specific failure causes. These events represent the specific conditions or component failures that contribute to the system failure. They are then arranged in a logical structure using gates, such as AND, OR, and NOT gates, to depict the relationships between events.
Probabilistic Assessment: To assign probabilities to the basic events, statistical analysis or expert judgment is often employed. The probabilities can be derived from historical data, expert opinions, or established industry standards. This step is crucial in quantifying the likelihood of each basic event occurring and helps in understanding the overall system reliability.
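
Once probabilities have been assigned to the basic events, they can be combined up the tree. The sketch below assumes that the basic events are statistically independent and that no basic event appears more than once in the tree; under those assumptions an AND gate multiplies its input probabilities and an OR gate gives one minus the product of the complements. The tree layout, event names and probability values are hypothetical and not taken from any standard or dataset.

```python
from math import prod

# A tree is either a basic-event name (str) or a tuple (gate_kind, [children]).
tree = ("OR", [
    ("AND", ["primary_pump_fails", "backup_pump_fails"]),
    "controller_fails",
])

# Hypothetical basic-event probabilities over some mission time.
probabilities = {
    "primary_pump_fails": 0.01,
    "backup_pump_fails": 0.02,
    "controller_fails": 0.001,
}


def event_probability(node, p):
    """Probability of `node`, assuming independent, non-repeated basic events."""
    if isinstance(node, str):
        return p[node]
    kind, children = node
    child_ps = [event_probability(child, p) for child in children]
    if kind == "AND":
        return prod(child_ps)                        # all inputs occur
    if kind == "OR":
        return 1.0 - prod(1.0 - q for q in child_ps)  # at least one occurs
    raise ValueError(f"unsupported gate kind: {kind}")


p_top = event_probability(tree, probabilities)
print(f"P(top event) = {p_top:.3e}")  # roughly 1.2e-3 for the values above
```

If basic events are shared between branches or are not independent, this simple multiplication no longer holds and a cut-set based or simulation based calculation is needed instead.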

Deductive Method in Fault Tree Analysis

The deductive method in fault tree analysis starts with the top event, which represents the system failure under investigation. It involves breaking down the top event into its contributing events and determining their relationships and probabilities. The deductive method is often used when the system under study is well-defined, and the failure modes are known.

Top Event Identification: The top event is typically derived from system specifications, design documents, or safety requirements. It is then broken down into its contributing events using logical gates, which represent the relationships between the events.
Event Decomposition: The engineer decomposes the top event by identifying the contributing events that lead to its occurrence. These events can be either basic events or intermediate events, depending on their level in the fault tree. The decomposition process continues until all contributing events are identified, forming a hierarchical structure of events.
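
One common use of a decomposed tree is to list the combinations of basic events that are sufficient to cause the top event. In standard FTA terminology (not used elsewhere in this article) these are called minimal cut sets. The sketch below starts at the top event and expands each gate in turn; it handles AND and OR gates only, and the tree and event names are hypothetical.

```python
from itertools import product

# A tree is either a basic-event name (str) or a tuple (gate_kind, [children]).
tree = ("OR", [
    ("AND", ["primary_pump_fails", "backup_pump_fails"]),
    ("AND", ["controller_fails", ("OR", ["controller_fails", "sensor_fails"])]),
])


def cut_sets(node):
    """Return every set of basic events that is sufficient to cause `node`."""
    if isinstance(node, str):
        return {frozenset([node])}
    kind, children = node
    child_sets = [cut_sets(child) for child in children]
    if kind == "OR":
        # Any cut set of any input causes the output.
        return set().union(*child_sets)
    if kind == "AND":
        # One cut set from every input is needed; merge each combination.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unsupported gate kind: {kind}")


def minimal(sets):
    """Drop any cut set that strictly contains another cut set."""
    return {s for s in sets if not any(other < s for other in sets)}


for cut_set in sorted(minimal(cut_sets(tree)), key=sorted):
    print(sorted(cut_set))
```

A cut set containing a single basic event points to a single-point failure, which is usually the first thing a reviewer looks for.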

Figure 2 – Fault Tree – Example

Understanding Functional Architecture

One crucial aspect of developing an accurate and effective fault tree is the establishment of a robust functional architecture. A functional architecture is the hierarchical representation of a system’s functions and their interdependencies. It provides a structured framework to comprehend the system’s behaviour and the relationships between various functional elements. A well-defined functional architecture forms the foundation for fault tree analysis, aiding in the identification and analysis of failure modes.

Figure 3 – Functional Architecture

Importance of Functional Architecture in Fault Tree Analysis

System Understanding: By defining and organising the system’s functions and their relationships, the functional architecture allows engineers to gain insights into the system’s behaviour, failure modes, and critical components.
Event Identification: Each function within the architecture represents a potential failure mode or a contributing event to system failure. By examining the interdependencies between functions, an engineer can identify critical events and their relationships, enabling the construction of Fault trees that encompass all significant failure paths.
Relationship Establishment: The hierarchical structure of the functional architecture provides guidance for the logic gates used to depict the relationships between events.
Failure Mode Propagation: By analysing the flow of functions and their dependencies, engineers can determine how a failure in one component or function propagates and affects other parts of the system. This knowledge allows for the accurate representation of failure paths in Fault trees, capturing the potential cascading effects and enabling a comprehensive reliability assessment.
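
Failure propagation through a functional architecture can be illustrated with a simple dependency graph. The sketch below takes a worst-case view: it assumes that a function fails whenever any function it depends on fails, so redundancy and mitigation are ignored. The function names and dependencies are hypothetical.

```python
from collections import deque

# Hypothetical architecture: each function maps to the functions that depend on it.
dependents = {
    "supply_power": ["sense_speed", "compute_command"],
    "sense_speed": ["compute_command"],
    "compute_command": ["actuate_brake"],
    "actuate_brake": [],
}


def affected_by(failed_function, dependents):
    """Return all functions reachable downstream of `failed_function`."""
    affected = set()
    queue = deque([failed_function])
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, []):
            if downstream not in affected:
                affected.add(downstream)
                queue.append(downstream)
    return affected


print(sorted(affected_by("sense_speed", dependents)))
```

Each function returned is a candidate intermediate event on the path from the failed function to the top event, which helps to check that the fault tree has not missed a cascading path.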

Challenges and Considerations

While functional architecture is crucial in fault tree analysis, it also presents challenges that need to be addressed:

Complexity: Developing a comprehensive functional architecture can be complex, particularly for large and intricate systems. Analysing the system’s functions, their interactions, and failure modes requires careful consideration and expertise.
Accuracy of the Architecture: The accuracy of the functional architecture directly impacts the quality of the fault tree analysis. Errors or inaccuracies in the architecture can lead to incorrect event identification, improper relationship establishment, and ultimately, flawed fault tree construction. Therefore, meticulous attention to detail and validation of the functional architecture are vital.
Integration with Design Process: Functional architecture should be integrated into the system design process to ensure that it reflects the intended system behaviour and operation. Collaboration between design and functional safety engineers is crucial to accurately capture the functional relationships and potential failure modes.

Synergy Between FMEA and Fault trees: Enhancing Risk Analysis

In the realm of engineering and risk analysis, two powerful techniques stand out: Failure Modes and Effects Analysis (FMEA) and Fault Tree Analysis (FTA). While each method has its distinct purpose and approach, their relationship is symbiotic, as they complement and strengthen each other in assessing and mitigating risks. The failure of a function often forms a base event of a fault tree, without the type of failure being examined any further there. The type of failure and its possible causes are examined in the FMEA instead.

Understanding Failure Modes and Effects Analysis (FMEA): The Relationship between FMEA and Fault trees

Event Identification: FMEA provides valuable input to the event identification process in Fault trees. FMEA captures the potential failure modes at the component level, while Fault trees allow for the systematic identification and modelling of the events that contribute to system failures.
Decision-Making and Mitigation Strategies: The integrated use of FMEA and Fault trees aids decision-making and the development of effective mitigation strategies. FMEA provides insights into potential failure modes and their causes, while Fault trees reveal the relationships between events and the systemic implications of failures. Together, they enable engineers and decision-makers to identify critical components, failure paths, and the most effective strategies for risk reduction, system improvement, and reliability enhancement.
Validation of Results: Cross-validation of results between FMEA and Fault Tree Analysis is essential to identify any discrepancies or inconsistencies. The base events of a fault tree should be derived from an FMEA. Without understanding the base events, fault tree construction can be a lost cause.
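
The cross-validation described above can be partly automated when both analyses use a common set of identifiers. The sketch below assumes that FMEA failure modes and fault tree base events share the same naming scheme; the identifiers and the `traceability_gaps` helper are hypothetical.

```python
# Hypothetical identifiers exported from each analysis.
fmea_failure_modes = {
    "primary_pump_fails",
    "backup_pump_fails",
    "controller_fails",
    "sensor_drifts",
}
fault_tree_base_events = {
    "primary_pump_fails",
    "backup_pump_fails",
    "controller_fails",
    "valve_stuck_closed",
}


def traceability_gaps(fmea, fault_tree):
    """Report items that appear in one analysis but not in the other."""
    return {
        # Base events with no supporting FMEA entry.
        "missing_from_fmea": sorted(fault_tree - fmea),
        # FMEA failure modes that no fault tree refers to.
        "unused_in_fault_tree": sorted(fmea - fault_tree),
    }


gaps = traceability_gaps(fmea_failure_modes, fault_tree_base_events)
for kind, items in gaps.items():
    print(kind, items)
```

An entry under `missing_from_fmea` means a base event has not been analysed for failure type and cause. An entry under `unused_in_fault_tree` is not necessarily an error, but it should be reviewed to confirm that the failure mode cannot contribute to any top event.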

Summary

Fault tree analysis is one of the most reliable methods for identifying the causes of failure events in safety-critical systems.
A technically sound functional architecture is essential to develop robust Fault trees.
Ensuring traceability through a common set of functions and their failures between FMEA and Fault tree is vital for correctness, completeness, and consistency between both analyses. By aligning functions upfront and fostering expert collaboration, the analysis provides a comprehensive understanding of failure modes, causes, and consequences.

In our next blog, we will examine FMEAs and how to construct one to support the identification of single-point failures along with dependent failure analysis.

For information on how to apply Fault trees or to discuss safety analysis and risk management at your organisation, get in touch with us at enquiries@3sk.co.uk
