A Safety Analysis Method for Control Software in Coordination with FMEA and FTA
Abstract
1. Introduction
2. Related Works
3. Outline of the Proposed Method
3.1. Outline of the Development Method for a Safer CSW
3.2. Outline of FMEA
3.3. Outline of FTA
3.4. Safety Analysis Support Environment
4. Application and Evaluation
4.1. Outline of the Target System
4.2. Design Information on the CSW
4.3. Safety Analysis Results Related to CSW
4.4. Benefits of the Proposed Method
4.5. List of Limitations
4.5.1. The Issue of the CSW’s Size
4.5.2. The Issue Related to the Conflicts between Countermeasures and the Addition of New Risk
4.5.3. The Issue of Attacks
4.5.4. The Issue of Using an Object-Oriented Programming Language
4.5.5. The Issue of Other Safety Analysis Methods
5. Conclusions and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References

| Group | Common Failure Mode | Failure Example | Countermeasure Policy | Standard Countermeasures |
|---|---|---|---|---|
| Startup | The startup conditions for functions are not prepared | Related operations cannot be conducted, an improper system status exists | Review the startup conditions | Add the confirmation procedure for the startup conditions to the Standard Operation Procedure (SOP), set the conditions as to whether or not to start |
| | | | Conduct multiple checks at startup | |
| | | | Conduct the startup check | Display the startup status |
| Termination | The termination conditions for functions are not prepared | Related operations cannot be conducted, an improper system status exists | Review the termination conditions for functions | Add the confirmation procedure for termination conditions to the SOP, set the conditions as to whether or not to terminate, multiplex the termination confirmation procedure |
| | | | Conduct multiple checks upon termination | |
| | | | Conduct the termination check | Display the termination status |
| | | | Transition to the safe status as the top priority | Add the emergency stop function |
| Input/Output | Instructions on SOP misread | Improper results are calculated, an improper system status exists | Conduct multiple checks on SOP | Conduct double checks on SOP |
| | | | Improve the visibility of SOP indications | Integrate the SOP format |
| | Indications on Human Machine Interface (HMI) misread | Improper results are calculated, an improper system status exists | Conduct multiple checks on HMI | Conduct double checks on HMI |
| | | | Improve the visibility of HMI | Integrate the HMI format |
| | | | Check the content of HMI | Add the reconfirmation function |
| | Mistake in checking products | Improper results are calculated | Conduct multiple checks on products | Conduct double checks on products |
| | Past data are lost | Data related to quality are lost | Notify when data are lost | Add a warning function for past data loss |
| | Latest data are lost | | Notify if there is a data loss risk | Add a warning function for the latest data loss |
| | An inputting error | Improper results are calculated, an improper system status exists | Conduct multiple checks on input data | Conduct double checks on setting data |
| Calibration | Long time intervals for function calibration | A wrong measurement is done, improper results are calculated | Conduct periodic reviews | Shorten time intervals for function calibration |
| Qualification | Wrong operation authority | Proper operations cannot be done, improper results are calculated | Confirm the qualification before operation | Confirm operation authority before operation |
| | | | Do not set improper authority | Review authority periodically |
| Backup | Insufficient backup | Data disappear, data related to quality are lost | Conduct proper backup operations | Organize the backup procedure in the SOP |
| | | | Shorten backup intervals | Shorten backup time intervals |
| Unexpected CPU Load | Unexpected data update occurs | Data cannot be updated | Realize faster processing | Realize faster update processing |
| | | | Develop faster devices | Install faster memory devices |
| | The upper limit of calculation precision is confirmed | Improper results are calculated | Increase significant digits | Utilize double-precision variables |
| | The lower limit of calculation precision is confirmed | Improper results are calculated | Increase significant digits | Utilize double-precision variables |
| | Divided by zero | Operation is suspended | Give a warning of division by zero | Add a warning function for a small divisor |
| | Unexpected amount of data is accepted | Abnormal program shutdown | Refuse data | Add a restriction function for available data |
| | | | Do not input data | Add the number of available data to the SOP |
| | Unexpected interruption tasks occur | | Restrict interruption tasks | Restrict interruption tasks |
| | | | Prohibit interruption tasks | Add the restriction function for interruption tasks to the SOP |
| | Unexpected CPU load occurs | Program does not respond or responds slowly | Unexpected execution requests are not sent | Add the function of displaying CPU usage |
| | | | Refuse unexpected execution requests | Add the restriction function for accepting execution requests under CPU overload |
| Malicious operations or attacks | No identification for important data | Data are removed | Take measures so that data are not removed | Introduce Data Loss Prevention (DLP) tools |
| | No access control for data | | Take measures so that data are not accessed | Add access control for data according to each user |
| | Data could be rewritten | Data are falsified | Take measures so that data are not falsified | Add e-signature, add time stamp |
| | Vast amounts of data sent | Related operations cannot be conducted | Data acceptance is blocked | Disconnect from the external network |
| | Vast amounts of requests sent | | Data are selected | Install firewalls |
| | Illegally accessed from the outside | System is invaded | Disconnect | Disconnect from the external network |
| | | | Discover illegal access | Introduce Intrusion Detection System (IDS), introduce Intrusion Prevention System (IPS) |
| | Data with virus attached are received | System malfunctions, improper results are calculated | Remove computer virus | Introduce antivirus software |
| | | | Take measures so that virus does not invade | Introduce virus protection software |
| | | | | Conduct virus check on connected USB memory devices |


| Function | Common Failure Mode | Impact to System | Accept/Reject | Severity | Incidence | Risk Class | Detection Rate | Priority | Countermeasures |
|---|---|---|---|---|---|---|---|---|---|
| Function A | Startup condition X is not prepared. | The machine does not work. | Accept | Middle | Low | 3 | High | Low | Add Standard Operation Procedure (SOP) for checking startup conditions. |
| | | | | Middle | Low | 3 | High | Low | |
| | Termination condition Y is not prepared. | The machine uses electric power continuously. | Accept | Middle | Low | 3 | High | Low | Add SOP for checking termination conditions. |
| | | | | Middle | Low | 3 | High | Low | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

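
The ratings in the sheet above are consistent with a two-step lookup: severity and incidence give a risk class, then risk class and detection rate give a priority. The sketch below assumes a GAMP 5-style pair of matrices. The matrix values and the names `RISK_CLASS`, `PRIORITY`, and `assess` are illustrative assumptions and are not taken from the paper.

```python
# Two-step qualitative risk rating (assumed GAMP 5-style matrices).
# Step 1: (severity, incidence) -> risk class, where 1 is the highest risk.
RISK_CLASS = {
    ("High", "Low"): 2, ("High", "Middle"): 1, ("High", "High"): 1,
    ("Middle", "Low"): 3, ("Middle", "Middle"): 2, ("Middle", "High"): 1,
    ("Low", "Low"): 3, ("Low", "Middle"): 3, ("Low", "High"): 2,
}

# Step 2: (risk class, detection rate) -> priority.
# A high detection rate lowers the priority of a given risk class.
PRIORITY = {
    (1, "High"): "Middle", (1, "Middle"): "High", (1, "Low"): "High",
    (2, "High"): "Low", (2, "Middle"): "Middle", (2, "Low"): "High",
    (3, "High"): "Low", (3, "Middle"): "Low", (3, "Low"): "Middle",
}


def assess(severity, incidence, detection):
    """Return (risk_class, priority) for one FMEA sheet row."""
    risk_class = RISK_CLASS[(severity, incidence)]
    return risk_class, PRIORITY[(risk_class, detection)]


# Ratings of the "Function A" rows: Middle severity, Low incidence, High detection.
print(assess("Middle", "Low", "High"))  # (3, 'Low')
```

Under these assumed matrices, the Function A rows (Middle, Low, High) come out as risk class 3 with Low priority, which agrees with the sheet.
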

| Process ID | Function | Common Failure Modes | Causes | Impact to System | Severity | Probability | Detection Rate | Risk | Countermeasures |
|---|---|---|---|---|---|---|---|---|---|
| 3.1 | increment stack pointer for command count (conducted by BUS61553B) | malfunction of hardware | malfunction of BUS61553B | cannot receive commands | High | Low | Low | High | Not Applicable (NA) (use high reliability parts) |
| 3.2 | store necessary info to descriptor stack (conducted by BUS61553B) | malfunction of hardware | malfunction of BUS61553B | cannot receive commands | High | Low | Low | High | NA (use high reliability parts) |
| 3.3 | store data blocks (conducted by BUS61553B) | malfunction of hardware | malfunction of BUS61553B | cannot receive commands | High | Low | Low | High | NA (use high reliability parts) |
| 3.4 | confirm received command status | activation conditions failure | no sync data | cannot receive commands | High | Low | Low | High | multiplexing timer interruption, multiplexing descriptor pointer, multiplexing descriptor pointer (Spec. of BUS61553B) |
| | | completion conditions failure | cannot access descriptor stack | | | | | | |
| | | inadequate input data | error of descriptor stack | | | | | | |
| | | inadequate output data | command reception error on | | | | | | |
| | | inadequate algorithm | NA | NA | - | - | - | - | - |
| | | program destruction | malfunction of memory | cannot receive commands | High | Low | Low | High | multiplexing error bit for command reception |
| | | back up error | NA | NA | - | - | - | - | - |
| | | security error | NA | NA | - | - | - | - | |
| | | operation procedure miss | NA | NA | - | - | - | - | |
| | | malfunction of hardware | malfunction of Electronic Control Unit (ECU) | NA | - | - | - | - | |
| 3.5 | extract data blocks | activation conditions failure | cannot finish process 3.4 | cannot receive commands | High | Low | Low | High | countermeasure in process 3.4 |
| | | completion conditions failure | cannot access data blocks | cannot receive commands | High | Low | Low | High | multiplexing data block pointer (Spec. of BUS61553B) |
| | | inadequate input data | data block error | cannot conduct action corresponding to received command | Middle | Low | High | Low | increase retry number of sending command (Spec. of 1553B protocol) |
| | | inadequate output data | command reception error | request to send command | Middle | Low | High | Low | |
| | | inadequate algorithm | NA | NA | - | - | - | - | - |
| | | program destruction | malfunction of memory | cannot extract data blocks | High | Low | High | Low | add check bit to data block |
| | | back up error | NA | NA | - | - | - | - | - |
| | | security error | NA | NA | - | - | - | - | - |
| | | operation procedure miss | NA | NA | - | - | - | - | - |
| | | malfunction of hardware | malfunction of ECU | NA | - | - | - | - | - |
| 3.6 | conduct action according to received command | activation conditions failure | cannot finish process 3.5 | cannot respond to command | High | Low | Low | High | countermeasure in process 3.5 |
| | | completion conditions failure | NA | NA | - | - | - | - | - |
| | | inadequate input data | cannot conduct adequate command response | cannot respond to command | High | Low | Low | High | add command send request function |
| | | inadequate output data | cannot conduct adequate command response | cannot respond to command | High | Low | Low | High | add command send request function |
| | | inadequate algorithm | NA | NA | - | - | - | - | - |
| | | program destruction | NA | NA | - | - | - | - | - |
| | | back up error | NA | NA | - | - | - | - | - |
| | | security error | NA | NA | - | - | - | - | - |
| | | operation procedure miss | NA | NA | - | - | - | - | - |
| | | malfunction of hardware | malfunction of ECU | cannot respond to command | High | Low | High | Low | use redundant system |
| 3.7 | conduct command response (conducted by BUS61553B) | malfunction of hardware | malfunction of BUS61553B | cannot respond to command | - | - | - | - | - |

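
The results above apply one fixed checklist of common failure modes to each software process in turn (processes 3.4 to 3.6). A minimal sketch of generating such a blank worksheet follows. The `FmeaRow` class, its field names and defaults, and the `blank_worksheet` helper are hypothetical. The analyst would still fill in causes, impacts, ratings, and countermeasures by hand.

```python
from dataclasses import dataclass

# Checklist of common failure modes, copied from the rows of processes 3.4 to 3.6.
COMMON_FAILURE_MODES = [
    "activation conditions failure",
    "completion conditions failure",
    "inadequate input data",
    "inadequate output data",
    "inadequate algorithm",
    "program destruction",
    "back up error",
    "security error",
    "operation procedure miss",
    "malfunction of hardware",
]


@dataclass
class FmeaRow:
    """One worksheet row; fields default to 'not applicable' until reviewed."""
    process_id: str
    function: str
    failure_mode: str
    cause: str = "NA"
    impact: str = "NA"
    countermeasure: str = "-"


def blank_worksheet(processes):
    """Create one row per (process, common failure mode) pair, in checklist order."""
    return [
        FmeaRow(process_id, function, mode)
        for process_id, function in processes
        for mode in COMMON_FAILURE_MODES
    ]


rows = blank_worksheet([
    ("3.4", "confirm received command status"),
    ("3.5", "extract data blocks"),
])
print(len(rows))  # 20
```

Starting every row as "NA" mirrors the table, where a failure mode that does not apply to a process is still listed so that the review of the checklist is visibly complete.
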
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Takahashi, M.; Anang, Y.; Watanabe, Y. A Safety Analysis Method for Control Software in Coordination with FMEA and FTA. Information 2021, 12, 79. https://doi.org/10.3390/info12020079