26-12-2012, 02:21 PM
Peer-to-Peer Simulators
Abstract
In this technical report, we present our findings on network simulators, which can be
used to help test and design a P2P system. The Portal-based P2P system under
development uses portals to provide a user interface and P2P volunteers to provide
resources to the network. The aim is to create an infrastructure that can be used by the
scientific community, based on existing social networks, to help download and
maintain large datasets. Network simulators provide a virtual network environment,
which can give software developers some assurance when making design
decisions. This report outlines the architecture of the Portal-based P2P system and
defines a set of criteria for a suitable simulator, to aid the development of the P2P
system. A number of network simulators are reviewed and briefly tested for their
suitability against these criteria. The experience gained from reviewing network
simulators is presented in this report.
Introduction
In recent years, Peer-to-Peer (P2P) technologies have become increasingly popular. A
P2P system can be defined as a distributed network architecture, whereby participants
share a part of their own hardware resources, such as processing power, storage
capacity, or network bandwidth. The shared resources are necessary to provide the
service and content offered by the network, such as file-sharing. The service or
content provided by the P2P network is accessible by other peers directly, without
passing through intermediary entities [1].
P2P Systems
P2P systems are commonly used to distribute content as part of a Content Distribution
Network (CDN). One of the main reasons for the success of P2P networks is
providing users with the ability to distribute content by contributing bandwidth back
to the network. This concept is the social aspect of a P2P network, whereby users
freely give bandwidth to other users. Most, if not all, members of a P2P network
contribute resources to the network; essentially users are helping each other download
files. Popular examples of P2P networks and applications are Gnutella [1],
BitTorrent [3], Kazaa [4], eDonkey [5], Limewire [6] and Freenet [7].
P2P technologies are mainly used for file sharing and are becoming the normal
method for the distribution of Linux operating systems. The P2P method of file
distribution is superseding the older mechanisms, such as the File Transfer Protocol
(FTP). With a traditional FTP server, the available bandwidth is split among all
users downloading from the server. This can lead to a server’s
bandwidth being completely depleted, leaving users with a less than optimal transfer
rate. The centralised architecture of the traditional FTP server is known not to scale
well. However, P2P systems such as BitTorrent have been seen to scale to over
100,000 users for a single file/data set. P2P applications like BitTorrent, which
make efficient use of bandwidth, have forced Internet Service Providers (ISPs) to
reassess their price plans [8] because of the increased flow of traffic from broadband
users. Even though broadband users do not have much bandwidth compared to
enterprise Internet connections, they still manage to accumulate enough bandwidth to
transfer many Gbytes of data. A study monitored over 90 thousand BitTorrent users
downloading a file of approximately 2 Gbytes over a period of 8 months. During June
2004, BitTorrent accounted for 53 per cent of all P2P traffic on the Internet backbone [9].
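The scaling argument above can be illustrated with a rough calculation. The following Python sketch is not taken from the report: it assumes an idealised model in which a server's upload capacity is shared equally among downloaders, and in which every P2P peer contributes a fixed upload rate. The function names and the example figures are hypothetical, and protocol overhead, churn and download-link limits are ignored.

```python
def client_server_rate(server_upload_kbps: float, num_users: int) -> float:
    """Per-user download rate when one server's upload is shared equally."""
    if num_users <= 0:
        raise ValueError("num_users must be positive")
    return server_upload_kbps / num_users


def p2p_rate(seed_upload_kbps: float, peer_upload_kbps: float,
             num_users: int) -> float:
    """Idealised per-user rate when every downloader also uploads.

    Assumption: total upload capacity (the seed plus all peers) is shared
    equally among the downloaders.
    """
    if num_users <= 0:
        raise ValueError("num_users must be positive")
    total_upload = seed_upload_kbps + peer_upload_kbps * num_users
    return total_upload / num_users


if __name__ == "__main__":
    # Hypothetical figures: a 100 Mbit/s server and 256 kbit/s peer uplinks.
    for users in (10, 1000, 100000):
        print(users,
              client_server_rate(100000, users),
              p2p_rate(100000, 256, users))
```

In this simplified model the client-server rate falls towards zero as users are added, while the P2P rate never drops below the upload rate of a single peer.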
Portal-based P2P System
The aim of the Portal-based P2P system is to allow users to store large datasets across
many peers, instead of a dataset being hosted centrally. Each peer in the network
contributes resources back to the network. The resources contributed are network
bandwidth and file storage space. Portals within the P2P system provide users with a way
to manage the dataset hosted by the P2P network. Our Portal-based P2P system is
designed to manage and update large scientific datasets. Projects such as the Sloan
Digital Sky Survey (SDSS) [10] are set to produce over 15 TBytes of data by the end
of the project lifetime [11]. The datasets of such projects are growing, and
consequently the costs and logistics of transferring them are becoming
increasingly difficult.
Network Simulators
Network simulators are typically used to simulate network communications in
particular scenarios or situations, without configuring ‘real’ machines or networks.
Simulators can help with the development and testing of a network application. There
are two main types of network simulator: packet-based and flow-based. Packet-based
network simulators, such as NS-2 [13], attempt to simulate individual data packets. Other
simulators are flow-based and work at the application level, which means they
disregard parts of the TCP/network stack. Several flow-based simulators, such as GPS
[14], have a mechanism to introduce packet delay to provide realistic communication
characteristics, while others do not. Typically, packet-based simulators take longer to
complete a simulation than flow-based simulators. This is because of the calculations
made for each packet in the simulated network. Network simulators normally allow a
developer to produce a network topology and define delay, bandwidth and
connection/traffic characteristics for the nodes and links.
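The difference in simulation cost between the two approaches can be sketched for a single link. The Python example below is an illustration under stated assumptions, not code from NS-2, GPS or any other simulator: the function names, the 1500-byte packet size and the loss-free single-link model are all hypothetical choices. The flow-based function makes one calculation per transfer, whereas the packet-based function makes one calculation per packet.

```python
def flow_transfer_time(size_bytes: int, bandwidth_bps: float,
                       delay_s: float) -> float:
    """Flow-level estimate: one calculation for the whole transfer."""
    return delay_s + size_bytes * 8 / bandwidth_bps


def packet_transfer_time(size_bytes: int, bandwidth_bps: float,
                         delay_s: float, packet_bytes: int = 1500):
    """Packet-level estimate: advance the clock once per packet.

    Returns (arrival time of the last packet, number of packets processed).
    Assumes a single loss-free link and ignores header overhead.
    """
    clock = 0.0
    remaining = size_bytes
    packets = 0
    while remaining > 0:
        payload = min(packet_bytes, remaining)
        clock += payload * 8 / bandwidth_bps  # time to put the packet on the link
        remaining -= payload
        packets += 1
    return clock + delay_s, packets


if __name__ == "__main__":
    # Hypothetical link: 12 Mbit/s with 50 ms delay, carrying 1.5 MB.
    print(flow_transfer_time(1_500_000, 12_000_000, 0.05))
    print(packet_transfer_time(1_500_000, 12_000_000, 0.05))
```

On this simple link both functions give essentially the same transfer time, but the packet-based version needs one loop iteration per packet, which is where the extra simulation time comes from once queues, routing and loss are also modelled per packet.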
Review
Simulators can be classified into two categories, packet-based and application-level.
Packet-based simulators calculate delay, bandwidth and routing for each packet
generated or used by the simulation. Application-level simulators do not account for
each packet; instead, they calculate bandwidth and delay to/from network end-points.
Application-level simulators usually use the terms ‘flow-based’ or ‘message-based’ to
describe how they evaluate communications between nodes in the simulation.
PeerSim
PeerSim is a Peer-to-Peer simulator, which has been developed with scalability in
mind. PeerSim was developed for the BISON project [27] and is used by the DELIS
project [28]. PeerSim has cycle-based and event-based simulator engines. The
cycle-based engine is simplified by ignoring the transport layer in the protocol stack and
lacks support for concurrency. The lack of concurrency support means that the
simulator sequentially gives control to each node in turn. The cycle-based engine
allows the simulator to scale up to a larger number of nodes (one million nodes), but
some accuracy is lost in comparison with PeerSim’s event-based
engine. The accuracy of the cycle-based engine is acceptable for smaller simulations;
however, it is not known how much the error rate will increase with larger simulations.
PeerSim has some predefined protocols for P2P simulations, namely OverSat [29]
[30], SG-1 [31] and T-Man [32]. Two-tier hierarchical models should be possible as
the SG-1 protocol is based around a super-peer topology. PeerSim is written in Java,
therefore protocols and simulator components are developed in Java. There are no
visualisations of a simulation with PeerSim. PeerSim has some documentation and
mailing lists; however, the mailing lists are used for release announcements and
change logs rather than for support. Documentation is provided in the form of Javadocs,
tutorials and a protocol implementation guide. Publications using PeerSim and the
three P2P protocols, namely SG-1, OverSat and T-Man, are listed on PeerSim’s
website.
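The idea behind a cycle-based engine, as described above, can be sketched in a few lines. The following Python example is only an illustration and does not use PeerSim's Java API: the Node class, the run_cycles function and the pairwise averaging protocol are hypothetical choices made for this sketch. It shows the two simplifications mentioned in the text: control is given to each node sequentially, and messages are replaced by direct state updates with no transport layer.

```python
import random


class Node:
    """A simulated peer holding a single numeric value."""

    def __init__(self, value: float):
        self.value = value


def run_cycles(nodes, cycles: int, rng: random.Random) -> None:
    """Run a cycle-based simulation of pairwise averaging.

    In every cycle each node is given control once, in order. There is no
    concurrency and no transport layer: the exchange with the chosen peer
    is applied immediately as a direct state update.
    """
    for _ in range(cycles):
        for node in nodes:
            peer = rng.choice(nodes)  # assumption: any node may be contacted
            mean = (node.value + peer.value) / 2
            node.value = peer.value = mean


if __name__ == "__main__":
    rng = random.Random(42)
    nodes = [Node(float(i)) for i in range(10)]
    run_cycles(nodes, cycles=50, rng=rng)
    print([round(n.value, 3) for n in nodes])
```

Because each step is a plain function call on in-memory state, an engine of this kind can handle very large numbers of nodes, at the cost of not modelling message delay, loss or interleaving.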
Conclusion
Although network simulators can be used for P2P systems, their usefulness for our
Portal-based P2P system is not clear. All of the simulators, except AgentJ and
PlanetSim, require a developer to implement a P2P system specifically for a particular
simulator, thereby increasing the development time for application developers.
However, simulators such as AgentJ and PlanetSim can be used once a suitable stage
has been reached in the development cycle of the P2P system. Practical issues and
limitations arise, as AgentJ is a complex simulation environment and the
documentation states TCP is currently not supported. PlanetSim provides a very
similar environment, but does not provide any default means to gather results. None
of the simulators completely fulfils our requirements; however, AgentJ and PlanetSim
are the best candidates.