Abstract
This work applies the theory of knowledge in distributed systems to the design of efficient fault-tolerant protocols. We define a large class of problems requiring coordinated, simultaneous action in synchronous systems, and give a method of transforming specifications of such problems into protocols that are optimal in all runs: these protocols are guaranteed to perform the simultaneous actions as soon as any other protocol could possibly perform them, given the input to the system and faulty processor behavior. This transformation is performed in two steps. In the first step we extract, directly from the problem specification, a high-level protocol programmed using explicit tests for common knowledge. In the second step we carefully analyze when facts become common knowledge, thereby providing a method of efficiently implementing these protocols in many variants of the omissions failure model. In the generalized omissions model, however, our analysis shows that testing for common knowledge is NP-hard. Given the close correspondence between common knowledge and simultaneous actions, we are able to show that no optimal protocol for any such problem can be computationally efficient in this model. The analysis in this paper exposes many subtle differences between the failure models, including the precise point at which this gap in complexity occurs.
Original language | English |
---|---|
Pages (from-to) | 121-169 |
Number of pages | 49 |
Journal | Algorithmica |
Volume | 3 |
Issue number | 1-4 |
DOIs | |
State | Published - Nov 1988 |
Externally published | Yes |
Keywords
- Byzantine agreement
- Common knowledge
- Distributed firing squad
- Omissions failure model
- Simultaneous action
ASJC Scopus subject areas
- General Computer Science
- Computer Science Applications
- Applied Mathematics