Links to some of the papers from the supplementary course bibliography


1. Non-fault-tolerant distributed algorithms
Vector timestamps Strong data consistency conditions Weaker data consistency conditions

2. Fault-tolerant algorithms for a fixed set of processes
Failure detectors Atomic bcast and consensus Commit protocols
Consensus and related problems Clock synchronization Byzantine Quorum Systems
Replicated data management Communication Group communication
Self-stabilization

3. Algorithms for a changing set of processes
Algorithms for a changing set of processes

4. Algorithms for mobile networks
Mobile networks




Vector timestamps

C. J. Fidge. Logical time in ddistributed computing systems. IEEE Computer, 24(8):28-33, August 1991. No link available. Please look for the journal at the LCS reading room, or at the MIT libraries catalog .

Friedemann Mattern. Virtual time and global states of distributed systems. In Michel Cosnard et~al., editors, Parallel and Distributed Algorithms (Proceedings), pages 215--226. North Holland, 1989. (Reprinted in Z. Yang, T.A. Marsland (eds.), ``Global States and Time in Distributed Systems'', IEEE, 1994, pages 12-133.

Strong data consistency

H. Attiya and J. L. Welch. Sequential consistency versus linearizability.
ACM Transactions on Computer Systems, 12(1):91--122, May 1994.

Weaker data consistency

Alan Fekete, David Gupta, Victor Luchangco, Nancy Lynch, and Alex Shvartsman. Eventually-serializable data services.
Theoretical Computer Science, 220(1):113-156, June 1999.

Rivka Ladin, Barbara Liskov, Liuba Shrira, and Sanjay Ghemawat. Providing high availability using lazy replication.
ACM Transactions on Computer Science, 10(4):360--391, 1992.

Failure detectors

Tushar Deepak Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems.
Journal of the ACM, 43(2):225--267, March 1996.

Tushar~Deepak Chandra, Vassos Hadzilacos, and Sam Toueg. The weakest failure detector for solving consensus.
Journal of the ACM}, 43(4):685--722, July 1996.

Wei Chen, Sam Toueg, and Marcos~Kawazoe Aguilera. On the quality of service of failure detectors.
International Conference on Dependable Systems and Networks (DSN), 2000.

Atomic bcast and consensus

Vassos Hadzilacos and Sam Toueg. A modular approach to fault-tolerant broadcasts and related problems.
Technical Report TR94-1425. Department of Computer Science, Cornell University, Ithaca NY. May 1994.

Vassos Hadzilacos and Sam Toueg. Fault-tolerant broadcasts and related problems. In Sape Mullender, editor, Distributed Systems, chapter~5, pages 97--145. ACM Press and Addison-Wesley, 1993. Second Edition. No link available. Look for it at the LCS reading room, or at the MIT libraries catalog .

Commit protocols

D. Skeen and M. Stonebraker. A formal model of crash recovery in a distributed system. IEEE Transactions on Software Engineering, SE-9 NO.3, May 1983.No link available. Look for it at the LCS reading room, or at the MIT libraries catalog .

I. Keidar and D. Dolev. Efficient message ordering in dynamic networks.
15th ACM Symposium on Principles of Distributed Computing (PODC), pages 68--76, May 1996.

R. Guerraoui. Revisiting the relationship between non-blocking atomic commitment and consensus.
9th International Workshop on Distributed Algorithms (WDAG), volume 972 of Lecture Notes in Computer Science, pages 87--100. Springer Verlag, September 1995.

Consensus and related problems

Idit Keidar and Sergio Rajsbaum. On the cost of fault-tolerant consensus when there are no faults -a tutorial.
Technical Report MIT-LCS-TR-821, May 2001.

L. Lamport. The part-time parliament.
ACM Transactions on Computer Systems, 16(2):133--169, May 1998.

Butler W. Lampson. The ABCD's of Paxos In Twentieth ACM Symposium on Principles of Distributed Computing (PODC'01), Newport, RI, August 2001.

Clock synchronization

No Links available for the these references. Please look for them at the LCS reading room, or at the MIT libraries catalog .

Hagit Attiya and Jennifer Welch. Distributed Computing: Fundamentals, Simulations and Advanced Topics McGraw-Hill, 1998.

Leslie Lamport and P. M. Melliar-Smith. Byzantine clock synchronization. In Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing, pages 68--74, Vancouver, B.C., Canada, 1984.

Joseph Y. Halpern, Barbara Simons, H. Raymond Strong, and Danny Dolev. Fault-tolerant clock synchronization. In Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing, pages 89--102, Vancouver, B.C., Canada, 1984.

Michael J. Fischer, Nancy A. Lynch, and Michael Merritt. Easy impossibility proofs for distributed consensus problems. Distributed Computing, 1(1):26--39, January 1986.

Byzantine Quorum Systems

Dahlia Malkhi and Michael Reiter. Byzantine quorum systems.
Journal of Distributed Computing, 11(4):203--213, 1998.

Dahlia Malkhi, Michael Reiter, and Rebecca Wright. Probabilistic quorum systems.
Proceeding of the 16th Annual ACM Symposium on the Principles of Distributed Computing (PODC 97), pages 267--273, Santa Barbara, CA, August 1997.

Lorenzo Alvisi, Dahlia Malkhi, Evelyn Pierce, Michael Reiter, and Rebecca Wright. Dynamic Byzantine Quorum Systems
International Conference on Dependable Systems and Networks (DSN, FTCS-30 and DCCA-8), New York, 2000.

Replicated data management

No Links available for the next two references. Please look for them at the LCS reading room, or at the MIT libraries catalog .

Leslie Lamport. Using time instead of timeout for fault-tolerant distributed systems. ACM Transactions on Programming Languages and Systems, 6(2):254--280, April 1984.

F. B. Schneider. Replication management using the state-machine approach. In Sape Mullender, editor, Distributed Systems}, chapter~7, pages 169--197. ACM Press and Addison-Wesley, 1993. Second Edition.

Miguel Castro and Barbara Liskov. Practical Byzantine fault tolerance.
Proceedings of the Third Symposium on Operating Systems Design and Implementation, pages 173--186, New Orleans, Lousianna, USA, February 1999. USENIX Association.

G. Chokler, D. Malkhi, and M. Reiter. Backoff protocols for distributed mutual exclusion and ordering.
21 IEEE International Conference on Distributed Computing Systems (ICDCS, April 2001.

Communication

Mahesh Jayaram and George Varghese. Crash failures can drive protocols to arbitrary states.
Proceedings of the Fifteenth Annual ACM Symposium on Principles of Distributed Computing, pages 247--256, Philadelphia, PA, May 1996.

Kenneth Birman, Mark Hayden, Oznur Ozkasap, Zhen Xiao, Mihai Budiu, and Yaron Minsky. Bimodal multicast.
ACM Transactions on Computer Systems, 17(2), May 1999.

Ziv Bar-Joseph, Idit Keidar, Tal Anker, and Nancy Lynch. QoS Preserving Totally Ordered Multicast.
5th International Conference On Principles Of Distributed Systems (OPODIS), pages 143--162, Paris, France, December 2000.

Ziv Bar-Joseph, Idit Keidar, and Nancy Lynch. A fault-tolerant dynamic atomic broadcast algorithm with qos guarantees.
MIT Technical Technical Report, December 2000. Work in progress.

Group Communication

No Links available for the next two references. Please look for them at the LCS reading room, or at the MIT libraries catalog .

Kenneth P. Birman. Building Secure and Reliable Network Applications. Manning Publications, Greenwich, CT, 1996.

K. Birman and T. Joseph. Exploiting virtual synchrony in distributed systems. In 11th Symposium on Operating Systems Principles (SOSP), pages 123--138. ACM, November 1987.

D. Dolev and D. Malki. Transis approach to high availability cluster communications.
Communications of the ACM, 39(4):64--70, 1996.

Y. Amir, L. E. Moser, P. M. Melliar-Smith, D. A. Agarwal, and P. Ciarfella. The Totem single-ring ordering and membership protocol.
ACM Transactions on Computer Systems, 13(4), November 1995.

Flaviu Cristian and Frank Schmuck. Agreeing on processor group membership in asynchronous distributed systems.
Technical Report CSE95-428, University of California-San Diego, La Jolla, CA 92093-0114, 1995.

Y. Amir, G. V. Chockler, D. Dolev, and R. Vitenberg. Efficient state transfer in partitionable environments.
2nd European Research Seminar on Advances in Distributed Systems (ERSADS'97), pages 183--192. BROADCAST (ESPRIT WG 22455), Operating Systems Laboratory, Swiss Federal Institute of Technology, Lausanne, March 1997.

Alan Fekete, Nancy Lynch, and Alex Shvartsman. Specifying and using a partitionable group communication service.
ACM Transactions on Computer Systems, 19(2):171--216, May 2001.

Idit Keidar and Roger Khazan. client-server approach to virtually synchronous group multicast: Specifications and algorithms.
IEEE 20th International Conference on Distributed Computing Systems (ICDCS), pages 344--355, Taipei, Taiwan, April 2000.

R. Vitenberg, I. Keidar, G. V. Chockler, and D. Dolev. Group Communication System Specifications: A Comprehensive Study.
Technical report, Institute of Computer Science, The Hebrew University of Jerusalem, Jerusalem, Israel, 1999.

Self-stabilization

No links available for these two. Please look for them at the LCS reading room, or at the MIT libraries catalog .

Edger W. Dijkstra. Self-stabilizing systems in spite of distributed control. Communication of the ACM}, 17(11):643--644, November 1974.

Shlomi Dolev. Self-Stabilization. MIT Press, 2000.

Algorithms for a changing set of processes

M. Merritt and G. Taubenfeld. Computing with infinitely many processes: under assumptions on concurrency and participation.Proceedings of the 14th International Symposium on Distributed Computing (DISC 2000)}, Toledo, Spain, October 2000. Version May 25, 2001.

E. Yeger Lotem, I. Keidar, and Dolev D. Dynamic voting for consistent primary components.
Proceedings of the Sixteenth Annual ACM Symposium on Principles of Distributed Computing, pages 63--71, Santa Barbara, CA, August 1997.

R. De Prisco, A. Fekete, N. Lynch, and A. Shvartsman. A dynamic view-oriented group communication service.
17th ACM Symposium on Principles of Distributed Computing (PODC), pages 227--236, June 1998.

R. De Prisco, A. Fekete, N. Lynch, and A. Shvartsman. A dynamic primary configuration group communication service. In 13th International Symposium on DIStributed Computing (DISC)}, Lecture Notes in Computer Science, Bratislava, Slovak Republic, 1999. Springer-Verlag-Heidelberg.

Roberto DePrisco. On Building Blocks for Distributed Systems. PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, December 1999.

Roberto DePrisco and Nancy Lynch. An algorithm for replicated atomic objects.Preprint.

Frank Dabek, Emma Brunskill, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, and Hari Balakrishnan. Balakrishnan. Building peer-to-peer systems with chord, a distributed lookup service.
Proceedings of the 8th Workshop on Hot Topics in Operating Systems (HotOS-VIII), Schloss Elmau, Germany, May 2001.

Ion Stoica, Robert Morris, David Karger, M.~Frans Kaashoek, and Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications.
Proceedings of the ACM SIGCOMM '01 Conference, San Diego, California, August 2001.

Algorithms for mobile networks

Jinyang Li, John Jannotti, Douglas S. J. De Couto, David R. Karger, and Robert Morris. A scalable location service for geographic ad hoc routing.
ACM Mobicom 2000, Boston, MA, 2000.

N. Malpani, N. Vaidya, and J. L. Welch. Distributed token circulation in mobile ad hoc networks. Proc. 9th International Conference on Network Protocols (ICNP), November 2001. To appear. No link available yet.

Jennifer Walter, Jennifer L. Welch, and Nitin Vaidya. A mutual exclusion algorithm for ad hoc mobile networks.
accepted to Wireless Networks.

Navneet Malpani, Jennifer L. Welch, and Nitin Vaidya. Leader election algorithms for mobile ad hoc networks.
Fourth International Workshop on Discrete Algorithms and Methods forMobile Computing and Communications (DIAL-M), Boston, MA, August 2000.

S. Biaz and J. L. Welch. Closed Form Bounds for Clock Synchronization Under Simple Uncertainty Assumptions. To appear in Information Processing Letters.