02-11-2012, 12:25 PM
Introduction to Distributed Systems and Networking
Centralized vs Distributed Systems
• Centralized System: System in which major functions
are performed by a single physical computer
–Originally, everything ran on a single computer
–Later: client/server model
• Distributed System: physically separate computers
working together on some task
–Early model: multiple servers working together
• Probably in the same room or building
• Often called a “cluster”
–Later models: peer-to-peer/wide-spread collaboration
Distributed Systems
Definition:
Loosely coupled processors interconnected by network
• A distributed system is a piece of software that ensures:
– Independent computers appear to users as a single
coherent system
• Lamport: “A distributed system is one in which the
failure of a computer you didn’t even know existed can
render your own computer unusable”
Why use distributed systems?
• These are now a requirement:
– Economics dictate that we buy small computers
– Cheap way to provide reliability
– We all need to communicate
– It is much easier to share resources
– Allows a whole set of distributed applications
– Many future problems will require machines to
communicate
Distributed Systems: Issues
• The promise of distributed systems:
–Higher availability: one machine goes down, use another
–Better durability: store data in multiple locations
–More security: each piece easier to make secure
• Reality has been disappointing
–Worse availability: depend on every machine being up
• Lamport: “A distributed system is one in which the failure of a computer you didn’t
even know existed can render your own computer unusable”
–Worse reliability: can lose data if any machine crashes
–Worse security: anyone in world can break into system
• Coordination is more difficult
–Must coordinate multiple copies of shared state information
(using only a network)
–What would be easy in a centralized system becomes a lot
more difficult
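The gap between the promise and the reality listed above can be put into numbers. The sketch below assumes every machine is up independently with the same probability p, which is an idealization not taken from the slides; the function names are made up for illustration.

```python
def availability_any(p: float, n: int) -> float:
    """Promise: the system works if at least one of n machines is up."""
    return 1 - (1 - p) ** n


def availability_all(p: float, n: int) -> float:
    """Reality: the system works only if all n machines are up."""
    return p ** n


if __name__ == "__main__":
    p = 0.99  # assumed per-machine availability, purely illustrative
    for n in (1, 3, 10):
        print(n, availability_any(p, n), availability_all(p, n))
```

With the same machines, a design where any replica can serve a request becomes more available as n grows, while a design that needs every machine becomes less available.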
Distributed Systems Goals
• Connecting resources and users
• Transparency: the ability of the system to mask its
complexity behind a simple interface
– Location: Can’t tell where resources are located
– Migration: Resources may move without the user knowing
– Replication: Can’t tell how many copies of resource exist
– Concurrency: Can’t tell how many users there are
– Parallelism: System may speed up large jobs by splitting
them into smaller pieces
– Fault Tolerance: System may hide various things that go
wrong in the system
• Openness: portability, interoperability
• Scalability: in size, geography, and administration
• Transparency and collaboration require some way for
different processors to communicate with one
another
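The parallelism goal listed above (splitting a large job into smaller pieces) might look roughly like this. It is only a sketch: threads inside one process stand in for separate machines, and `split` and `parallel_sum` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Cut items into at most `parts` contiguous chunks."""
    size = max(1, (len(items) + parts - 1) // parts)
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(items, workers=4):
    """Sum each chunk on its own worker, then combine the partial sums."""
    chunks = split(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, chunks))


if __name__ == "__main__":
    print(parallel_sum(list(range(1001))))
```

The caller only sees `parallel_sum`; how many workers took part is hidden, which is the point of the transparency goal.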
Internet: Packets
• The network transports bytes grouped into
packets
• Packets are “self-contained”; routers handle
them one by one
• The end hosts worry about errors and pacing
– Destination sends ACKs; source checks for losses
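The ACK and retransmit idea above can be simulated in a few lines. This is a minimal stop-and-wait sketch, not how any particular protocol implements it: the network is a random number generator that drops transmissions, a timeout is modeled simply as "no ACK came back", and all names are hypothetical.

```python
import random


def delivered(rng, loss_rate):
    """Return True if the simulated network delivers this transmission."""
    return rng.random() >= loss_rate


def send_reliably(packets, loss_rate=0.3, seed=0, max_tries=50):
    rng = random.Random(seed)
    received = []      # what the destination has accepted, in order
    expected = 0       # next sequence number the destination wants
    transmissions = 0  # data packets the source put on the wire
    for seq, payload in enumerate(packets):
        for _ in range(max_tries):
            transmissions += 1
            if not delivered(rng, loss_rate):
                continue  # data packet lost: source times out, resends
            # Destination: accept new data, drop duplicates, always ACK.
            if seq == expected:
                received.append(payload)
                expected += 1
            if delivered(rng, loss_rate):
                break     # ACK arrived: source moves to the next packet
            # ACK lost: source times out and resends the same packet
        else:
            raise RuntimeError("gave up on packet %d" % seq)
    return received, transmissions


if __name__ == "__main__":
    print(send_reliably(["a", "b", "c", "d"]))
```

The sequence number is what lets the destination discard a duplicate when only the ACK was lost.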
Internet: Points to remember
• Separation of tasks
– send bits on a link: transmitter/receiver [clock,
modulation,…]
– send packet on each hop [framing, error detection,…]
– send packet end to end [addressing, routing]
– pace transmissions [detect congestion]
– retransmit erroneous or missing packets [acks, timeout]
– find destination address from name [DNS]
• Scalability
– routers don’t know full path
– names and addresses are hierarchical
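The two scalability points above (routers don't know the full path; addresses are hierarchical) can be illustrated with a next-hop lookup. The sketch uses longest-prefix matching, which the slides do not name, and a made-up forwarding table; the hop names are hypothetical.

```python
import ipaddress

# Hypothetical forwarding table: (address prefix, next hop).
# The router stores only the next hop, never the full path.
TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream"),   # default route
    (ipaddress.ip_network("10.0.0.0/8"), "campus"),
    (ipaddress.ip_network("10.1.0.0/16"), "dept"),
]


def next_hop(dest: str, table=TABLE) -> str:
    """Pick the next hop whose prefix matches dest most specifically."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, hop) for net, hop in table if addr in net]
    return max(matches, key=lambda m: m[0])[1]


if __name__ == "__main__":
    for dest in ("10.1.2.3", "10.9.9.9", "8.8.8.8"):
        print(dest, "->", next_hop(dest))
```

Because addresses are hierarchical, one table entry covers a whole block of hosts, so the table stays small as the network grows.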
Protocol
• Two communicating entities must agree on:
– Expected order and meaning of messages they
exchange
– The action to perform on sending/receiving a
message
• Example: asking someone the time
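A protocol for asking the time could be sketched as below. The message formats ("TIME?", "TIME hh:mm", "ERR ...") are invented for illustration and do not come from any real protocol; no network is involved, the two sides are plain functions.

```python
from datetime import datetime


def server_handle(message: str, now: datetime) -> str:
    """Action the server performs on receiving a message."""
    if message == "TIME?":
        return "TIME " + now.strftime("%H:%M")
    return "ERR unknown request"


def client_parse(reply: str) -> str:
    """Action the client performs on receiving the reply."""
    kind, _, rest = reply.partition(" ")
    if kind == "TIME":
        return rest
    raise ValueError("unexpected reply: " + reply)


if __name__ == "__main__":
    reply = server_handle("TIME?", datetime.now())
    print(client_parse(reply))
```

Both sides agree on the order of messages (request, then reply), on their meaning, and on what to do with anything unexpected.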
Summary
• Network: physical connection that allows two
computers to communicate
– Packet: unit of transfer, sequence of bits carried over
the network
• Protocol: Agreement between two parties as to
how information is to be transmitted
• Internet Protocol (IP)
– Used to route messages through routers across the globe
– 32-bit addresses (IPv4); 16-bit ports belong to the
transport layer (TCP/UDP), not to IP itself
• Reliable, Ordered, Arbitrary-sized Messaging:
– Built through protocol layering on top of unreliable,