Failure Mode and Effects Analysis (FMEA)
Failure Mode and Effects Analysis (FMEA) is a methodology that has played a crucial role in ensuring the safety and reliability of complex systems. In this blog post, we will delve into the history and development of FMEA, its integration into functional safety practices, and the benefits and limitations of this approach.
The roots of Failure Mode and Effects Analysis can be traced back to the mid-20th century when it was initially used in the aerospace industry. In the 1940s, the United States military began implementing FMEA as part of its quality control procedures for military equipment. The main objective was to identify potential failures in various systems, components, and processes and address them before they could cause catastrophic consequences.
Over the years, FMEA gained recognition for its effectiveness in reducing risks and improving safety across various sectors. By the 1960s, industries such as automotive, electronics, and healthcare began adopting FMEA methodologies to enhance product design and manufacturing processes.
Failure Mode And Effects Analysis Methodology
- Systematic Approach: FMEA is a structured and systematic approach for identifying and analysing potential failures, their effects, and the associated risks. It involves multidisciplinary teams that evaluate components, subsystems, and entire systems to identify potential failure modes.
- Severity, Occurrence, and Detection: FMEA assesses failures based on three main factors: severity, occurrence, and detection. Severity refers to the potential impact of a failure on safety, operations, or the environment. Occurrence indicates the probability of a failure happening, while detection represents the likelihood of detecting the failure before it leads to harm.
- Risk Prioritisation: Once the failure modes are identified and evaluated, they are prioritised based on a risk priority number (RPN). The RPN is calculated by multiplying the severity, occurrence, and detection ratings, providing a numerical value that helps prioritise risk mitigation efforts. In the AIAG & VDA FMEA Handbook (1st Edition, 2019), which supersedes the AIAG FMEA manual, 4th Edition, the RPN method of evaluation has been replaced by an action priority (AP) table.
- Mitigation Strategies: FMEA enables identifying and implementing appropriate mitigation strategies to reduce or eliminate identified risks. These strategies can include design changes, process improvements, redundancy, enhanced testing, and other measures to increase system reliability and safety.
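The RPN calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed implementation: the 1 to 10 rating scale is a common convention that is assumed here, and the `FailureMode` class, the `prioritise` helper, the failure mode names, and the ratings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One FMEA row. The 1..10 scale is an assumed convention."""
    name: str
    severity: int    # 1 = negligible impact, 10 = most severe impact
    occurrence: int  # 1 = remote likelihood, 10 = very frequent
    detection: int   # 1 = almost certain to be detected, 10 = very unlikely to be detected

    def __post_init__(self):
        # Reject ratings outside the assumed scale.
        for field_name in ("severity", "occurrence", "detection"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be in 1..10, got {value}")

    @property
    def rpn(self) -> int:
        # RPN = severity x occurrence x detection
        return self.severity * self.occurrence * self.detection


def prioritise(modes):
    """Return the failure modes sorted by RPN, highest first."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes and ratings, for illustration only.
modes = [
    FailureMode("Connector corrosion", severity=5, occurrence=4, detection=2),
    FailureMode("Sensor signal stuck at last value", severity=8, occurrence=3, detection=6),
    FailureMode("Software watchdog timeout", severity=9, occurrence=2, detection=3),
]

ranked = prioritise(modes)
for mode in ranked:
    print(f"{mode.rpn:4d}  {mode.name}")
```

Quite different rating combinations can produce the same RPN (for example, 9 × 2 × 3 and 3 × 2 × 9 both give 54), so the number alone does not show whether severity is driving the risk.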
Integration With Functional Safety
Functional safety refers to the capability of a system or equipment under control (EuC) to operate correctly in response to its inputs, even in the presence of potential faults or failures. FMEA plays a vital role in achieving functional safety objectives by identifying potential failure modes and their effects on the system’s intended functionality. FMEA is crucial for safety from the two perspectives below.
- Safety Standards: Various safety standards, such as ISO 26262 for automotive functional safety and IEC 61508 for general industrial applications, require the application of FMEA as part of the safety assessment process. Neither of these standards gives explicit guidelines on how to apply FMEA; it is only referenced as a technique to apply.
- Safety Lifecycle: The foundational principles of FMEA are integrated into a system’s safety lifecycle, which includes activities such as hazard analysis, safety requirements specification, verification, and validation. FMEA helps identify and address potential hazards and risks throughout the lifecycle, enabling the development of robust safety measures.
Benefits and Limitations of FMEA
Benefits
- Early Risk Identification: FMEA facilitates early identification of potential failures, allowing for proactive risk mitigation and prevention.
- Improved Design: By identifying failure modes and their causes, FMEA promotes design improvements to enhance system reliability and safety.
- Cost Reduction: FMEA helps avoid expensive recalls or accidents by addressing potential failures during the design and development stages.
- Enhanced Safety Culture: FMEA fosters a safety-conscious culture within organisations, encouraging a systematic approach to risk management.
Limitations
- Subjectivity: FMEA involves subjective judgments and ratings, which can introduce variability in the analysis results.
- Incomplete Analysis: If not conducted thoroughly, FMEA may miss potential failure modes or their effects, compromising the overall effectiveness. This is often attributed to poor-quality inputs to the FMEA.
- Time-Consuming: FMEA can be time-consuming, particularly for complex systems, and it requires dedicated resources and expertise.
- Single-Point Failures: FMEA considers one failure mode at a time, so it only allows for the analysis of risks associated with single-point failures. Combinations of failures need a complementary technique such as Fault Tree Analysis (FTA).
FMEA techniques play a vital role in ISO 26262 Part 4 for the development of Technical Safety Concepts (TSC) and the identification of systematic failures as well as random hardware failures.
FMEA can be carried out as a standalone activity, as a designated component of the safety development work breakdown structure, or derived from an existing system-level FMEA. When the safety analysis is conducted as an independent activity, a detailed approach involving severity, occurrence, and risk prioritisation may not be needed. The primary focus of this analysis is to identify failures that can potentially violate safety goals.
Throughout the development of TSC in accordance with ISO 26262 Part 4, the analysis expands to encompass the System Architecture. This includes the examination of system elements, their associated functions, and the interfaces between hardware and software components. The primary objective of this analysis is to specifically address single-point failures. Existing design mechanisms that have been devised for failure detection, hardware protection, or degradation strategies can be repurposed and assigned the appropriate Automotive Safety Integrity Level (ASIL) or Safety Integrity Level (SIL) as Safety Mechanisms.
A collection of safety mechanisms derived from FMEA-type analysis can undergo further evaluation using Fault Tree Cut Set Analysis. The aim of this analysis is to confirm that no single-point failure can lead to a violation of safety goals. This approach serves as a verification method to assess the completeness and consistency between the FMEA and Fault Tree Analysis (FTA). It ensures that potential failure modes identified in the FMEA are appropriately addressed and validated in the FTA, resulting in a comprehensive safety evaluation. (Refer: FTA blog post.)
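The cut set check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a substitute for an FTA tool: the fault tree is represented as nested `(gate, children)` tuples with strings as basic events, only AND and OR gates are supported, and the function names and event names are hypothetical.

```python
from itertools import product


def cut_sets(node):
    """Return all cut sets of a fault tree node as a set of frozensets."""
    if isinstance(node, str):  # a basic event is its own cut set
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child is enough to trigger an OR gate.
        return set().union(*child_sets)
    if gate == "AND":
        # An AND gate needs one cut set from every child at the same time.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate type: {gate!r}")


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    all_sets = cut_sets(node)
    return {cs for cs in all_sets if not any(other < cs for other in all_sets)}


def single_point_failures(node):
    """Basic events that trigger the top event on their own (cut sets of size 1)."""
    return sorted(next(iter(cs)) for cs in minimal_cut_sets(node) if len(cs) == 1)


# Hypothetical tree: one branch is protected by monitoring, the other is not.
top_event = ("OR", [
    ("AND", ["Primary sensor fault", "Monitoring fault"]),
    "Unmonitored actuator fault",
])

# The same tree after a hypothetical safety mechanism covers the actuator.
top_event_mitigated = ("OR", [
    ("AND", ["Primary sensor fault", "Monitoring fault"]),
    ("AND", ["Unmonitored actuator fault", "Actuator diagnostic fault"]),
])

print(single_point_failures(top_event))
print(single_point_failures(top_event_mitigated))
```

An empty result for the mitigated tree is the outcome the analysis aims to confirm: every remaining minimal cut set needs at least two independent faults to occur together.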
The relationship between the two types of analysis can be demonstrated through an example (refer to Figure 1). A single-point failure, “Cell voltage acquisition and balancing fault”, is identified using FMEA. This failure then serves as a basic event in the FTA for the safety goals, specifically the one addressing the occurrence of a Thermal Event.
Figure 1 – Relationship between FMEA and FTA
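The consistency check between the two analyses can also be sketched in code. The example below is a hedged illustration only: apart from “Cell voltage acquisition and balancing fault”, every event name is hypothetical, the tree is not a reproduction of Figure 1, and the nested `(gate, children)` tuple representation is an assumption made for this sketch.

```python
def basic_events(node):
    """Collect the basic events (leaves) of a fault tree given as nested tuples."""
    if isinstance(node, str):
        return {node}
    _gate, children = node
    return set().union(*(basic_events(child) for child in children))


# Failure modes flagged as safety-relevant in the FMEA.
fmea_failure_modes = {
    "Cell voltage acquisition and balancing fault",  # from the example in the text
    "Cell temperature sensing fault",                # hypothetical
}

# Hypothetical fault tree for the top event "Thermal Event".
thermal_event_tree = ("OR", [
    "Cell voltage acquisition and balancing fault",
    ("AND", ["Charger overvoltage", "Contactor fails to open"]),  # hypothetical branch
])

fta_events = basic_events(thermal_event_tree)

# FMEA failure modes that have not yet been carried into the FTA.
missing_from_fta = sorted(fmea_failure_modes - fta_events)
# FTA basic events with no matching FMEA entry.
missing_from_fmea = sorted(fta_events - fmea_failure_modes)

print("Not yet in FTA:", missing_from_fta)
print("Not yet in FMEA:", missing_from_fmea)
```

Either list being non-empty points to a gap to review: a failure mode that the FTA has not picked up, or a basic event that has no corresponding FMEA entry.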
Conclusion
Failure Mode and Effects Analysis (FMEA) has evolved over time from its origins in the aerospace industry to become a widely adopted tool for ensuring functional safety in various sectors. With its structured approach, FMEA helps identify potential failures, assess their severity and occurrence, and prioritise risk mitigation efforts. By integrating FMEA into the safety lifecycle, industries can enhance system reliability, reduce risks, and foster a safety-conscious culture. Despite its limitations, FMEA remains an invaluable tool in the pursuit of safety and reliability in designing and developing complex systems.
For further information on how to apply FMEA or to discuss safety analysis and risk management at your organisation, get in touch with us at enquiries@3sk.co.uk