Consensus
- A procedure to reach in a common agreement in a distributed or decentralized multi- agent platform
- Important for message passing system
Why Consensus
- Reliability and fault tolerance in a distributed system
- Example
→ State machine replication
→ Clock synchronization
Why Consensus can be difficult in certain scenarios
- Consider a message passing system, and a general behaves maliciously
Distributed Consensus
If there is no failure, it is easy and trivial to reach in a consensus
- Broadcast the personal choice to all
- Apply a choice function, say the maximum of all the values
- There can be various types of faults in a distributed system.
- Crash fault : A node suddenly crashes or becomes unavailable in the middle of a communication
- Network or Partitioned Faults : A network fault occurs (say the link failure ) and the network gets partitioned.
- Byzantine Fault : A node starts behaving maliciously
Distributed Consensus - Properties
- Termination : Every correct individual decides some value at the end of the consensus protocol
- Validity : If all the individuals proposes the same value, then all correct individuals decide on that value
- Integrity : Every correct individual decides at most one value , and the decide value must be proposed by some individuals
- Agreement : Every correct individual must agree on the same value
Synchronous vs Asynchronous Systems
- Synchronous message passing system : The message must be received within a predefined time interval
→ Strong guarantee on message transmission delay
- Asynchronous message passing system : There is no upper bound on the message transmission delay or the message reception time
→ No timing constraint, message can be delayed for arbitrary period of times
Asynchronous Consensus
FLP85 (Impossibility Result) : In s purely asynchronous distributed system, the consensus problem is impossible ( with a deterministic solution) to solve if in the presence of a single crash failure.
- Results by Fischer, Lynch and Patterson (most influential paper awarded in ACM PODC 2001)
- Randomized algorithm may exist
Various consensus algorithms has been explored by the distributed system community
- Paxos
- Raft
- Byzantine fault tolerance (BFT)
We'll look into these consensus algorithms, but later !!
Correctness of a Distributed Consensus Protocol
- Safety : Correct individuals must not agree on an incorrect value
→ Nothing bad happend
- Livelines (or Liveness) : Every correct value must be accepted eventually
→ Something good eventually happens
Consensus in an open system
The tradition distributed protocols are based on
- Message passing (when individuals are connected over the internet)
- Shared memory (when a common memory place is available to read and write the shared variables the everyone can access)
Message passing requires a closed environment - every need to know the identity of others
Shared memory is not suitable for internet grade computing
- Where do you put the shared memory ?
Bitcoin is an open environment
- Anyone can join in the Bitcoin network anytime
- How do you ensure consensus in such an open system ? - A key challenge.
No comments:
Post a Comment