×

The doomsday distributed termination detection protocol. (English) Zbl 1266.68057

Summary: Distributed termination detection (DTD) algorithms are important since they detect globally stable states in distributed computations. Here we introduce a new DTD mechanism, the Doomsday protocol together with its proof of correctness. Doomsday is generic since it forms the basis for a number of new and existing DTD algorithms for which the correctness proof may be reused. The paper describes the Doomsday protocol, provides its formal proof, derives one new DTD algorithm and shows how other hitherto unrelated algorithms, Dijkstra-Scholten, Task Balancing and Credit Recovery, can be derived from the protocol. The paper concludes by examining various properties of the protocol in the context of existing DTD algorithms.

MSC:

68M14 Distributed systems
68M10 Network design and communication in computer systems

Software:

GC
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Birman K.P., Schiper A., Stephenson P. (1991) Lightweigt causal and atomic group multicast. ACM Trans. Comput. Syst. 9(3): 272–314 · doi:10.1145/128738.128742
[2] Black D.L. (1986) On the existence of delay-insensitive fair arbiters: Trace theory and its limitations. Distrib. Comput. 1(4): 205–225 · Zbl 0643.94041 · doi:10.1007/BF01660033
[3] Blackburn, S.M., Moss, J.E.B., Hudson, R.L., Morrison, R., Munro, D.S., Zigman, J.N.: Starting with termination: A methodology for building distributed garbage collection algorithms. In: 24th Australasian Computer Science Conference (ACSC 2001), 29 January–1 February 2001, Gold Coast, Queensland, Australia, pp. 20–28. IEEE Computer Society (2001)
[4] Chandy K.M., Lamport L. (1985) Distributed snapshots: Determining global states of distributed systems. ACM Trans. Comput. Syst. 3(1): 63–75 · doi:10.1145/214451.214456
[5] Chandy K.M., Misra J., Haas L.M. (1983) Distributed deadlock detection. ACM Trans. Comput. Syst. 1(2): 144–156 · doi:10.1145/357360.357365
[6] Dijkstra E.W., Scholten C.S. (1980) Termination detection for diffusing computations. Inf. Process. Lett. 11(1): 1–4 · Zbl 0439.68039 · doi:10.1016/0020-0190(80)90021-6
[7] Fischer M.J., Lynch N.A., Paterson M. (1985) Impossibility of distributed consensus with one faulty process. J. ACM 32(2): 374–382 · Zbl 0629.68027 · doi:10.1145/3149.214121
[8] Hudson, R.L., Morrison, R., Moss, J.E.B., Munro, D.S.: Garbage collecting the world: One car at a time. In: ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA ’97), Atlanta, Georgia, pp.162–175 (1997)
[9] Hudson, R.L., Morrison, R., Moss, J.E.B., Munro, D.S.: Where have all the pointers gone? In: 21st Australasian Computer Science Conference, Perth, Western Australia, pp. 107–119 (1998)
[10] Jones, R.: Garbage Collection: Algorithms for Automatic Dynamic Memory Management. John Wiley and Sons (1996). With a chapter on Distributed Garbage Collection by Rafael Lins. Reprinted 1997 (twice), 1999, 2000.
[11] Lai T.H., Wu L.F. (1995) An (n-1)-resilient algorithm for distributed termination detection. IEEE Trans. Parallel Distrib. Syst. 6(1):63–78 · doi:10.1109/71.363410
[12] Lamport, L.: ”sometime” is sometimes ”not never” - on the temporal logic of programs. In: 7th ACM Symposium on the Principles of Programming Languages, (PoPL7), Las Vegas, Nevada, pp. 174–185 (1980)
[13] Livesey M.J. (1993) A note on consistency in asynchronous multicaches. Distrib. Comput. 7(2): 111–114 · Zbl 1282.68078 · doi:10.1007/BF02280840
[14] Matocha J., Camp T. (1998) A taxonomy of distributed termination detection algorithms. J. Syst. Softw. 43(3): 207–221 · Zbl 05433384 · doi:10.1016/S0164-1212(98)10034-1
[15] Mattern F. (1989) Global quiescence detection based on credit distribution and recovery. Inf. Process. Lett. 30(4):195–200 · doi:10.1016/0020-0190(89)90212-3
[16] Norcross S., Morrison R., Munro D.S., Detmold H., Falkner K.E. (2005) Implementing a family of distributed garbage collectors. J. Res. Pract. Inf. Technol. 37(1): 107–126
[17] Peterson L.L., Buchholz N.C., Schlichting R.D. (1989) Preserving and using context information in interprocess communication. ACM Trans. Comput. Syst. 7(3): 217–246 · doi:10.1145/65000.65001
[18] Plainfossé, D., Shapiro, M.: A survey of distributed garbage collection techniques. In: Baker H.G. (ed.) IWMM, Lecture Notes in Computer Science, vol. 986, pp. 211–249. Springer, Berlin Heidelberg New York (1995)
[19] Tel, G.: Introduction to distributed algorithms. Cambridge University Press, New York, NY, USA (1994) · Zbl 0826.68056
[20] Tel G., Mattern F. (1993) The derivation of distributed termination detection algorithms from garbage collection schemes. ACM Trans. Program. Lang. Syst. 15(1): 1–35 · doi:10.1145/151646.151647
[21] Wu, L.F., Lai, T.H., Tseng, Y.C.: Consensus and termination detection in the presence of faulty processes. In: 1st International Conference on Parallel and Distributed Systems, Taiwan, pp. 267–274 (1992)
[22] Ye, X., Keane, J.A.: Detecting distributed termination in the presence of node failure. In: ACSC ’95: Proceedings of the 1995 Asian Computing Science Conference on Algorithms, Concurrency and Knowledge, pp. 195–209. Springer, Berlin Heidelberg New York (1995)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.