Abstract
Queueing models provide a useful tool for predicting the performance of many service systems including computer systems, telecommunication systems, computer/communication networks and flexible manufacturing systems. Traditional queueing models predict system performance under the assumption that all service facilities provide failure-free service. It must, however, be acknowledged that service facilities do experience failures and that they get repaired. In recent years, it has been increasingly recognized that this separation of performance and reliability/availability models is no longer adequate.
An exact steady-state queueing analysis of such systems is considered by several authors and is carried out by means of generating functions, supplementary variables, imbedded Markov process and renewal theory, or probabilistic techniques [1,2,7,8]. Another approach is approximate, in which it is assumed that the time to reach the steady-state is much smaller than the times to failures/repairs. Therefore, it is reasonable to associate a performance measure (reward) with each state of the underlying Markov (or semi-Markov) model describing the failure/repair behavior of the system. Each of these performance measures is obtained from the steady-state queueing analysis of the system in the corresponding state [3,5].
Earlier we have developed models to derive the distribution of job completion time in a failure-prone environment [3,4]. In these models, we need to consider a possible loss of work due to the occurrence of a failure, i.e., the interrupted job may be resumed or restarted upon service resumption. Note that the job completion time analysis includes the delays due to failures and repairs. The purpose of this paper [9] is to extend our earlier analysis so as to account for the queueing delays. In effect, we consider an exact queueing analysis of fault-tolerant systems in order to obtain the steady-state distribution and the mean of the number of jobs in the system. In particular, we study a system in which jobs arrive in a Poisson fashion and are serviced according to FCFS discipline. The service requirements of the incoming jobs form a sequence of independent and identically distributed random variables. The failure/repair behaviour of the system is modelled by an irreducible continuous-time Markov chain, which is independent of the number of jobs in the system. Let the state-space be {1,2, …,n}. When the computer system is in state i it delivers service at rate ri ≥ 0. Furthermore, depending on the type of the state, the work done on the job is preserved or lost upon entering that state. The actual time required to complete a job depends in a complex way upon the service requirement of the job and the evolution of the state of the system. Note that even though the service requirements of jobs are independent and identically distributed, the actual times required to complete these jobs are neither independent nor identically distributed, and hence the model cannot be reduced to a standard M/G/1 queue [8]. As loss of work due to failures and interruptions is quite a common phenomenon in fault-tolerant computer systems, the model proposed here is of obvious interest.
Using our earlier results on the distribution of job completion time we set up a queueing model and show that it has the block M/G/1 structure. Queueing models with such a structure have been studied by Neuts, Lucantoni and others [6]. We demonstrate the usefulness of our approach by performing the numerical analysis for a system with two processors subject to failures and repairs.
- 1 Baceelli, F. and Trivedi, K. S., "Analysis of M/G/2-Standby Redundant System," Performance '88, A. K. Agrawala and S. k. Tripathi (eds.), North Holland, 1983, pp. 457-476. Google Scholar
Digital Library
- 2 Gaver D. P., "A Waiting Line with Interrupted Service, including Priorities," .7. Royal Statistical Society, Series B-24, 1982, pp. 73-90.Google Scholar
- 3 Kulkarni, V. G., Nicola, V. F., Trivedi, K. S. and Smith, R. M., "A Unified Model for the Analysis of Job Completion Time and Performability Measures in Fault-Tolerant Systems," Tech. Report CS-1985-13, Dept. of Computer Science, Duke University, N. C., 1985. Google Scholar
Digital Library
- 4 Kulkarni, V. G., Nicola, V. F. and Trivedi, K. S., "The Completion Time of a Job on Multi-Mode Systems," in review, jr. of Applied Probability.Google Scholar
- 5 Meyer, J., "Closed-Form Solution of Performability," {FEE Trans. on Computer#, 3/ol. C-31, No. 7, 1982, pp. 648-657.Google Scholar
- 6 Neuts, M. F., 'CMatrix Analytic Methods in Queueing Theory," Eeropean Y. of Operationa Re, earth, Vol. 15, 1984, pp. 2-12.Google Scholar
- 7 Neut,, M.F. and Lucantoni, D.M., "A Markovian Queue with N Servers Subject to Breakdowns and Repairs," Manafement Science, Vol. 25, No. 9, 1979, pp. 849-861.Google Scholar
Digital Library
- 8 Nicola, V. F., "A Sing}e Server Queue with Mixed Types of Interruptions," to appear in Aeta Informatiea, 1986. Google Scholar
Digital Library
- 9 Nico}a, V. F., Kulkarni, V. G. and Trivedi, K. S., "Queueing Analysis of Fault-Tolerant Systems," accepted in Performance '86, to appear in IEEE Trans. on Software Engineering, 1986.Google Scholar
Index Terms
Queueing analysis of fault-tolerant computer systems (extended abstract)
Recommendations
Queueing analysis of fault-tolerant computer systems (extended abstract)
SIGMETRICS '86/PERFORMANCE '86: Proceedings of the 1986 ACM SIGMETRICS joint international conference on Computer performance modelling, measurement and evaluationQueueing models provide a useful tool for predicting the performance of many service systems including computer systems, telecommunication systems, computer/communication networks and flexible manufacturing systems. Traditional queueing models predict ...
Queueing Analysis of Fault-Tolerant Computer Systems
In this paper we consider the queueing analysis of a fault-tolerant computer system. The failure/repair behavior of the server is modeled by an irreducible continuous-time Markov chain. Jobs arrive in a Poisson fashion to the system and are serviced ...






Comments