10-02-2012, 04:13 PM
Clock Synchronization
Global State
Global state of a distributed system:
  Local state of each process
  Messages sent but not yet received (state of the queues)
Many applications need to know the state of the system:
  Failure recovery, distributed deadlock detection
Problem: how can you figure out the state of a distributed system?
  Each process is independent
  No global clock or synchronization
Distributed snapshot: a consistent global state
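The notes do not spell out what "consistent" means. Under the usual definition, a global state is consistent if no message is recorded as received unless it is also recorded as sent; messages recorded as sent but not received are the ones still in the queues. A minimal Python sketch of that check follows. The names LocalState, is_consistent and in_flight, and the message id "m1", are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LocalState:
    """Hypothetical record of one process at the moment it was checkpointed."""
    sent: set = field(default_factory=set)      # ids of messages sent so far
    received: set = field(default_factory=set)  # ids of messages received so far


def _all_sent(global_state):
    return set().union(*(s.sent for s in global_state.values()))


def _all_received(global_state):
    return set().union(*(s.received for s in global_state.values()))


def is_consistent(global_state):
    """True if every message recorded as received is also recorded as sent."""
    return _all_received(global_state) <= _all_sent(global_state)


def in_flight(global_state):
    """Messages sent but not yet received: the state of the queues."""
    return _all_sent(global_state) - _all_received(global_state)


# P has sent m1, Q has not received it yet: consistent, m1 is in flight.
good = {"P": LocalState(sent={"m1"}), "Q": LocalState()}

# Q claims to have received m1, but P's checkpoint predates sending it.
bad = {"P": LocalState(), "Q": LocalState(received={"m1"})}
```

Here is_consistent(good) holds and in_flight(good) contains only m1, while is_consistent(bad) fails because the receive was recorded without its send.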
Distributed Snapshot Algorithm
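Assume each process communicates with other processes over unidirectional, point-to-point channels (e.g., TCP connections)
Any process can initiate the algorithm
  Checkpoint local state
  Send a marker on every outgoing channel
  Start saving messages that arrive on every incoming channel
On receiving a marker
  First marker: checkpoint local state, send a marker on every outgoing channel, record the channel the marker arrived on as empty, and start saving messages that arrive on all other incoming channels
  Subsequent marker on a channel: stop saving messages for that channel; the messages saved so far are that channel's state
  A process is done once it has received a marker on every incoming channel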
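The marker rules above can be sketched as a small single-threaded Python simulation. This is an illustration under stated assumptions, not a definitive implementation: channels are modelled as FIFO queues (which TCP connections provide), the local state is a single integer balance, and the names Process, connect, deliver and MARKER are all hypothetical.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel used as the marker message


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance    # local state (assumed: one integer)
        self.out = {}             # destination name -> outgoing FIFO channel
        self.incoming = []        # names of processes with a channel into this one
        self.checkpoint = None    # recorded local state, None until checkpointed
        self.recording = {}       # incoming channel -> messages saved so far
        self.channel_state = {}   # incoming channel -> final recorded state

    def _checkpoint_and_send_markers(self):
        self.checkpoint = self.balance
        for channel in self.out.values():
            channel.append(MARKER)

    def start_snapshot(self):
        # Initiator: checkpoint, send markers, save on every incoming channel.
        self._checkpoint_and_send_markers()
        self.recording = {src: [] for src in self.incoming}

    def send(self, dst, amount):
        self.balance -= amount
        self.out[dst].append(amount)

    def receive(self, src, msg):
        if msg == MARKER:
            if self.checkpoint is None:
                # First marker: checkpoint, forward markers, and treat the
                # channel the marker arrived on as empty.
                self._checkpoint_and_send_markers()
                self.channel_state[src] = []
                self.recording = {s: [] for s in self.incoming if s != src}
            else:
                # Subsequent marker: stop saving messages for this channel.
                self.channel_state[src] = self.recording.pop(src)
        else:
            self.balance += msg
            if src in self.recording:
                self.recording[src].append(msg)


def connect(a, b):
    """Create a unidirectional FIFO channel from a to b."""
    a.out[b.name] = deque()
    b.incoming.append(a.name)


def deliver(a, b):
    """Deliver the oldest message on the channel from a to b."""
    b.receive(a.name, a.out[b.name].popleft())


p, q = Process("P", 100), Process("Q", 50)
connect(p, q)
connect(q, p)

q.send("P", 10)        # Q's transfer is in flight when the snapshot starts
p.start_snapshot()     # P checkpoints 100 and sends a marker to Q
deliver(q, p)          # P receives 10 and saves it as part of channel Q->P
deliver(p, q)          # Q sees its first marker, checkpoints 40, replies with a marker
deliver(q, p)          # P sees the marker from Q and stops saving on that channel

snapshot_total = p.checkpoint + q.checkpoint + sum(p.channel_state["Q"])
```

In this run the snapshot records P = 100, Q = 40 and one in-flight message of 10 on the channel from Q to P, so the recorded total matches the 150 the system started with even though no process ever observed that state at a single instant.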