Reliability and Safety Modelling in Reliable Systems Supported with Cold Standby Spares by a Markov Model

Document Type: Research Paper


Department of Electronic and Computer Engineering, Shahid Beheshti University, Tehran, Iran


Fault tolerance is an important concern in industries such as transportation, the military, and the chemical and nuclear sectors. Reliability and safety are two vital attributes of fault-tolerant systems, and redundancy is a common way to improve them. This article studies the impact of including cold standby spares in the system architecture. To do so, it uses a Markov model of the system, because Markov models can capture the priorities, functional dependencies, timing relations, reconfiguration behavior, and event sequences present in the system's dynamics. Previous studies of reliable and fault-tolerant systems have not presented general formulas for evaluating the reliability and safety of systems supported with cold spares. This article presents closed-form formulas for these attributes, with which the effect of the number of spares, the failure rate of the modules, the coverage factor, and the switch quality on system performance can be studied.
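To make the kind of model described above concrete, the following is a minimal sketch of a textbook cold standby Markov chain, not the formulas derived in the article. It assumes a non-repairable system with one active module and n identical cold spares, a constant failure rate `lam` for the active module, spares that cannot fail while dormant, and a single parameter `c` that lumps together the coverage factor and switch quality as the probability that a failure is detected and the next spare is switched in successfully. All function and parameter names are hypothetical. Under these assumptions the reliability has the well-known closed form R(t) = exp(-lam*t) * sum over k = 0..n of (c*lam*t)^k / k!, and the sketch cross-checks it against a numerical solution of the chain's state equations.

```python
import math


def cold_standby_reliability(t, lam, n_spares, c=1.0):
    """Closed-form reliability under the assumptions stated above.

    R(t) = exp(-lam*t) * sum_{k=0}^{n} (c*lam*t)^k / k!
    """
    x = c * lam * t
    series = sum(x ** k / math.factorial(k) for k in range(n_spares + 1))
    return math.exp(-lam * t) * series


def markov_reliability(t, lam, n_spares, c=1.0, steps=2000):
    """Numerically integrate the Markov chain with a fixed-step RK4 scheme.

    State k (k = 0..n) means "k spares have been consumed, system still up".
    Each state is left at rate lam; with probability c the next spare takes
    over (state k -> k+1), otherwise the system fails. The failed state is
    absorbing, so reliability is the total probability of the up states.
    """
    def deriv(p):
        d = [-lam * p[0]]
        for k in range(1, len(p)):
            d.append(c * lam * p[k - 1] - lam * p[k])
        return d

    p = [1.0] + [0.0] * n_spares  # start with the primary module active
    h = t / steps
    for _ in range(steps):
        k1 = deriv(p)
        k2 = deriv([pi + 0.5 * h * a for pi, a in zip(p, k1)])
        k3 = deriv([pi + 0.5 * h * a for pi, a in zip(p, k2)])
        k4 = deriv([pi + h * a for pi, a in zip(p, k3)])
        p = [
            pi + h / 6.0 * (a + 2.0 * b + 2.0 * g + d)
            for pi, a, b, g, d in zip(p, k1, k2, k3, k4)
        ]
    return sum(p)


if __name__ == "__main__":
    lam, t = 1e-3, 1000.0  # illustrative values only, not from the article
    for n in range(4):
        closed = cold_standby_reliability(t, lam, n, c=0.95)
        numeric = markov_reliability(t, lam, n, c=0.95)
        print(n, round(closed, 6), round(numeric, 6))
```

Running the script prints, for each number of spares, the closed-form value next to the numerically integrated one, which should agree closely. Varying `n_spares`, `lam`, and `c` gives a rough feel for the parameter study the abstract describes, although the article's own model may separate coverage and switch quality and also treats safety, which this sketch does not.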


