Supplementary Reading List

September 8, 2005

1. Other general distributed algorithms textbooks

Hagit Attiya and Jennifer Welch. Distributed Computing: Fundamentals, Simulations, and Advanced Topics. John Wiley and Sons, Inc., 2004. Second Edition.

2. The number of rounds required for consensus

Marcos Kawazoe Aguilera and Sam Toueg. A simple bivalency proof that t-resilient consensus requires t+1 rounds. Information Processing Letters, 71(3-4):155-158, August 1999. .pdf

Idit Keidar and Sergio Rajsbaum. On the cost of fault-tolerant consensus when there are no faults -a tutorial. Technical Report MIT-LCS-TR-821, May 2001. Preliminary version in SIGACT News, 32(2):45-63, Distributed Computing column, June 2001 (published in May 15th). .ps

3. Minimum spanning tree protocol

R. G. Gallager, P. A. Humblet, and P. M. Spira. A distributed algorithm for minimum-weight spanning trees. ACM Trans. Programming Language Syst., vol. 5, pp. 66--77, 1983. .pdf

4. Vector Timestamps

Friedemann Mattern. Virtual time and global states of distributed systems. In Michel Cosnard et al., editors, Parallel and Distributed Algorithms: Proceedings of the International Workshop on Parallel and Distributed Algorithms (Chateau de Bonas, Gers, France, October, 1988), pages 215-226. North Holland, 1989. (Reprinted in: Z. Yang, T.A. Marsland (Eds.), ``Global States and Time in Distributed Systems'', IEEE, 1994, pp. 123-133.) .pdf

Colin Fidge. Logical time in distributed computing systems. IEEE Computer, 24(8):28-33, August 1991. .pdf

5. Impossibility of consensus

Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of Distributed Consensus with One Faulty Process. Journal of the ACM, 32(2):374-382, April 1985. .pdf

Soma Chaudhuri. More Choices Allow More Faults: Set Consensus Problems in Totally Asynchronous Systems. Information and Computation 105 (1), pp. 132-158, July 1993. .pdf

6. Wait-free computability and the wait-free consensus hierarchy

Maurice Herlihy. Wait-free synchronization. ACM Transactions on Programming Languages and Systems, 13(1):124-149, January 1991. .pdf

Prasad Jayanti. Robust wait-free hierarchies. Journal of the ACM, 44(4): 592-614, 1997. .pdf

Prasad Jayanti. Wait-free computing. In Jean-Michel Helary, Michel Raynal (Eds.): Distributed Algorithms, 9th International Workshop, WDAG '95, Le Mont-Saint-Michel, France, September 13-15, 1995, (Proceedings), volume 972 of Lecture Notes in Computer Science, Springer 1995. Online version not available.

Wai-Kau Lo and Vassos Hadzilacos. All of us are smarter than any of us: nondeterministic wait-free hierarchies are not robust. SIAM Journal on Computing, 30(3):689-728, 2001. .ps

7. Wait-free vs. f-fault-tolerant data objects

T.D. Chandra, V. Hadzilacos, P. Jayanti and S. Toueg. Wait-freedom vs. t-resiliency and the robustness of the h_m^r hierarchy. Proceedings of the 13th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, pages 334-343, Los Angeles, CA, August 1994. .pdf

T. D. Chandra, V. Hadzilacos, P. Jayanti, and S. Toueg. Generalized irreducibility of consensus and the equivalence of t-resilient and wait-free implementations of consensus. To appear in SIAM Journal on Computing. .ps

Elizabeth Borowsky, Eli Gafni, Nancy Lynch, and Sergio Rajsbaum. The BG distributed simulation algorithm. Distributed Computing, 14:127-146, 2001. .pdf

W.K. Lo and V. Hadzilacos. On the power of shared object types to implement one-resilient consensus. Distributed Computing, 13(4):219-238, 2000. .pdf

Paul Attie, Rachid Guerraoui, Petr Kouznetsov, Nancy Lynch, and Sergio Rajsbaum. Impossibility of boosting distributed service resilience. Technical Report MIT-LCS-TR-982, MIT CSAIL, Cambridge, MA, February 2005. .ps

8. Reliable broadcast

Vassos Hadzilacos and Sam Toueg. Fault-tolerant broadcasts and related problems. In Sape Mullender, editor, Distributed Systems, chapter 5, pages 97-145. ACM Press and Addison-Wesley, 1993. Second Edition.

V. Hadzilacos and S. Toueg. A modular approach to the specification and implementation of fault-tolerant broadcasts. Technical Report TR94-1425, Department of Computer Science, Cornell University, Ithaca NY, May 1994. .ps

9. Failure detectors and consensus

Tushar Deepak Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225-267, March 1996. .ps

Tushar Deepak Chandra, Vassos Hadzilacos, and Sam Toueg. The weakest failure detector for solving consensus. Journal of the ACM, 43(4):685-722, July 1996. .pdf

Eli Gafni. A Round-by-Round Failure Detector - Unifying Synchrony and Asynchrony. Proceedings of the 17th Annual ACM Symposium on Principles of Distributed Computing, Puerto Vallarta, Mexico, pages 143-152, June 1998. .pdf

L. Lamport. The part-time parliament. ACM Transactions on Computer Systems, 16(2):133--169, May 1998. Also Research Report 49, Digital Equipment Corporation Systems Research Center, Palo Alto, CA, September 1989. .ps

10. Self-stabilization

Edsger W. Dijkstra. Self-stabilizing systems in spite of distributed control. Communications of the ACM, 17(11):643-644, November 1974. .pdf

Shlomi Dolev. Self-Stabilization. MIT Press, 2000.

11. Clock synchronization

Hagit Attiya and Jennifer Welch. Distributed Computing: Fundamentals, Simulations, and Advanced Topics. John Wiley and Sons, Inc., 2004. Second Edition.

Rui Fan and Nancy Lynch. Gradient Clock Synchronization. Proceedings of the Twenty-third Annual ACM PODC. St. Johns, Newfoundland, Canada, July 2004. .ps

12. Dynamic distributed algorithms

Nancy Lynch and Alex Shvartsman. Rambo: A reconfigurable atomic memory service for dynamic networks. Technical Report MIT-LCS-TR-856, MIT Laboratory for Computer Science, Cambridge, MA, 2002. .ps

Seth Gilbert, Nancy Lynch, and Alex Shvartsman. RAMBO II: Rapidly reconfigurable atomic memory for dynamic networks. Proceedings of the International Conference on Dependable Systems and Networks (DSN), San Francisco, CA, pages 259-268, June 22nd - 25th, 2003. Also, Technical Report MIT-LCS-TR-890, MIT CSAIL, Cambridge, MA, 2004. .ps

Shlomi Dolev, Seth Gilbert, Nancy Lynch, Alex Shvartsman, and Jennifer Welch. GeoQuorums: Implementing atomic memory in ad hoc networks. Technical Report MIT-LCS-TR-900a, MIT CSAIL, Cambridge, MA, 2004. .ps

Shlomi Dolev, Seth Gilbert, Nancy A. Lynch, Elad Schiller, Alex A. Shvartsman, and Jennifer L. Welch. Virtual Mobile Nodes for Mobile Ad hoc Networks. Proceedings of the 18th International Conference on Distributed Computing (DISC), October, 2004. .ps

Shlomi Dolev, Seth Gilbert, Limor Lahiani, Nancy Lynch, and Tina Nolte. Timed Virtual Stationary Automata for Mobile Networks. Technical Report MIT-LCS-TR-979a, MIT CSAIL, Cambridge, MA, August 2005. .ps

James Aspnes, Dana Angluin, Zoe Diamadi, Michael J. Fischer, and Rene Peralta. Computation in networks of passively mobile finite-state sensors. Proceedings of the Twenty-Third ACM Symposium on Principles of Distributed Computing, July 2004, pp. 290-299. Accepted to Distributed Computing (PODC 2004 special issue), September 2004. Last revised May 2005. .pdf