TY - JOUR
T1 - Dependability in Embedded Systems
T2 - A Survey of Fault Tolerance Methods and Software-Based Mitigation Techniques
AU - Solouki, Mohammadreza Amel
AU - Angizi, Shaahin
AU - Violante, Massimo
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2024
Y1 - 2024
N2 - Fault tolerance is a critical aspect of modern computing systems, ensuring correct functionality in the presence of faults. This paper presents a comprehensive survey of fault tolerance methods and mitigation techniques in embedded systems, with a focus on both software and hardware faults. Emphasis is placed on real-time embedded systems, considering their resource constraints and the increasing interconnectivity of computing systems in commercial and industrial applications. The survey covers various fault tolerance methods, including hardware, software, and hybrid redundancy. Particular attention is given to software faults, acknowledging their significance as a leading cause of system failures, while also addressing hardware faults and their mitigation. Moreover, the paper explores the challenges posed by soft errors in modern computing systems. The survey concludes by emphasizing the need for continued research and development in fault tolerance methods, specifically in the context of real-time embedded systems, and highlights the potential for extending fault tolerance approaches to diverse computing environments.
KW - analytical redundancy
KW - dependability
KW - embedded systems
KW - fault tolerance
KW - reliability
UR - http://www.scopus.com/inward/record.url?scp=85211234425&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85211234425&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2024.3509633
DO - 10.1109/ACCESS.2024.3509633
M3 - Article
AN - SCOPUS:85211234425
SN - 2169-3536
JO - IEEE Access
JF - IEEE Access
ER -