PhoenixSim 1.0 User Manual
Overview
This document describes PhoenixSim for anyone wishing to use or modify it.
Background
The concept of a NoC has brought about a change in the way microprocessors are being designed
today. Exploration of the architecture, implementation, and impact of NoCs in future generations
of processors is paramount to being able to scale performance while maintaining (or achieving)
power efficiency. However, actual implementations have generally lagged behind the multitude
of proposed designs. Leading instantiations of today’s NoCs include Tilera’s TILE and TILE-Gx
chips [21] (a 10×10 mesh) and the Cell Processor's Element Interconnect Bus [10] (a centrally-arbitrated, circuit-switched, 12-port redundant ring). These would be considered "baseline" implementations in today's NoC research community.
There are many reasons why chips coming off today's assembly lines don't have the complicated NoC architectures that have been the subject of so much research. The heart of the matter is that most microprocessors, general purpose or otherwise, don't have or need very many cores. Programming many-core architectures is a serious problem that has not yet been sufficiently addressed, so software generally can't make use of many cores. In addition, memory bandwidth is usually the performance bottleneck anyway.
But fear not, researcher of the network-on-chip: there is hope. The problems just mentioned
will be solved one day, and the fruit of our labor will be required.
Photonics in Networks-on-Chip
Using light as the medium for transmitting data has been around for some time now, gaining wide-
spread acceptance in the long-distance telecommunications industry for a few important reasons:
• Light travels a long way (kilometers) without having to be regenerated, whereas electronics
must drive a capacitive wire every inch of the way. Think of it like this: optics is like shooting
a gun, electronics is like plowing snow. This can be a huge advantage in terms of energy
consumption.
• Light travels at the speed of, well, light. Actually about (2/3)c in optical single-mode fiber [6],
but that’s still fast.
• Light signals of different wavelengths can all cram into one waveguide or fiber. We call this Wavelength Division Multiplexing (WDM), and it results in much higher bandwidth density (read: smaller cable bundles).
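The "speed of light" bullet above is easy to turn into a quick back-of-the-envelope calculation. The sketch below computes time of flight at (2/3)c, the single-mode-fiber figure cited above; the function name and units are our own, and on-chip waveguides have a somewhat different group index, so treat the numbers as illustrative only.

```cpp
#include <cassert>
#include <cmath>

// Speed of light in vacuum, meters per second.
constexpr double C_VACUUM = 3.0e8;

// Time of flight (seconds) for a signal traveling `length_m` meters
// where light propagates at roughly (2/3)c, as in single-mode fiber.
double propagationDelaySeconds(double length_m) {
    return length_m / ((2.0 / 3.0) * C_VACUUM);
}
```

For example, a 2 cm link at this speed takes 0.02 m / (2×10⁸ m/s) = 100 ps, regardless of how many wavelengths the link carries.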
What is PhoenixSim?
PhoenixSim (Photonic and Electronic Network Integration and eXecution Simulator) was originally designed to let us investigate silicon nanophotonic NoCs while taking into account the components' physical-layer characteristics. Because photonics often requires surrounding electronic components for control and processing, we also incorporated models of some typical electronic network components. We ended up with a simulation environment that is suited to investigating both electronic and photonic NoCs.
OMNeT++
PhoenixSim is built on OMNeT++ [22], an environment for creating event-driven simulators. Making PhoenixSim event-driven makes it relatively efficient for what it does. However, many of the events that occur in the simulator are on a clock-cycle granularity, which means it does not sacrifice too much accuracy. OMNeT supplies functional libraries and mechanisms for instantiating components, managing parameters, and executing batches of simulations. With the release of OMNeT 4.0, we also now have an Eclipse-based IDE.
When programming in OMNeT, you write C++ code that defines how components behave. All files are automatically compiled into a single executable, which can then be run as a simulation, either through a GUI or on the command line (for batch runs).
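To make "event-driven" concrete: the simulator kernel keeps a queue of timestamped events and leaps simulated time directly from one event to the next, rather than ticking every cycle. The sketch below is a generic event scheduler in that style; it is not OMNeT++'s actual API (which revolves around `cMessage` and per-module handlers), just a minimal illustration of the execution model.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// A timestamped event: when it fires, its action runs.
struct Event {
    double time;
    std::function<void()> action;
};

struct EventCompare {
    bool operator()(const Event& a, const Event& b) const {
        return a.time > b.time;  // earliest event comes out first
    }
};

class Scheduler {
public:
    void schedule(double time, std::function<void()> action) {
        queue_.push(Event{time, std::move(action)});
    }

    // Process all events in timestamp order; returns the final
    // simulated time. Note that time jumps between events instead
    // of advancing cycle by cycle.
    double run() {
        double now = 0.0;
        while (!queue_.empty()) {
            Event e = queue_.top();
            queue_.pop();
            now = e.time;
            e.action();
        }
        return now;
    }

private:
    std::priority_queue<Event, std::vector<Event>, EventCompare> queue_;
};
```

Because many PhoenixSim events land on clock-cycle boundaries anyway, this style keeps near-cycle accuracy while skipping idle cycles entirely.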
Processing Plane
The Processing Plane consists of multiple Processors, which represent cores or other processing elements, each having an instance of an Application class that dictates how communication events are generated. The structure of the Processing Plane varies slightly depending on if and how concentration is implemented, but generally looks like Figure 2.2. An instance of a Processor exists to model each independent processing core in the CMP. A Network InterFace (NIF) translates a Processor's communication request event into the network-specific protocol needed to complete it.
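The division of labor above can be sketched in code. The class names (Application, NIF) come from the manual; the member fields, the 64-byte packet payload, and the method signatures are illustrative assumptions, not PhoenixSim's actual interfaces.

```cpp
#include <cassert>
#include <cstdint>

// A communication request produced by an Application.
struct CommRequest {
    uint64_t destCore;  // destination core id
    uint64_t bytes;     // application-level message size
};

// The Application decides when and what to communicate.
class Application {
public:
    virtual ~Application() = default;
    virtual CommRequest nextRequest() = 0;
};

// The NIF translates a request into network-specific traffic.
class NIF {
public:
    // Number of packets the request becomes, assuming a
    // hypothetical fixed packet payload of 64 bytes.
    uint64_t packetize(const CommRequest& req) const {
        const uint64_t payload = 64;
        return (req.bytes + payload - 1) / payload;  // round up
    }
};
```

The point of the split is that an Application never needs to know the network protocol: it emits abstract requests, and the NIF alone handles packetization and protocol details.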
Sharing a single logical network node among multiple cores has been established as probably a good idea for CMPs with large core counts, since it reduces network resources [11]. This can be accomplished in different ways, shown in Figure 2.3.
The first is network-side concentration. This involves modifying the network switches and routers to accept multiple injection/ejection points, such as increasing the number of ports (radix) in an electronic router, as was done in [11]. Network-side concentration requires no changes in the Processing Plane, because each Processor still has one NIF, which it uses to interface to the increased-radix routers.
DRAM test apps
There are three applications to test Processor-to-DRAM communication: One2One, One2All, and All2One, which do exactly what their names imply. One2One can be used to test the zero-load latency of a memory access. One2All specifies that one Processor communicates with every DRAM module, and tests whether all memory modules are accessible. All2One specifies that all Processors communicate with a single memory module, testing the contention mechanism. All three applications use essentially the same parameters.
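The three patterns are easiest to see written out as source/destination pairs. The pair-list representation below is our own illustration of what each test application generates, not PhoenixSim's internal data structure.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// (processor id, DRAM module id) access pairs.
using Pair = std::pair<int, int>;

// One2One: a single access, to measure zero-load latency.
std::vector<Pair> one2one() {
    return {{0, 0}};
}

// One2All: one core touches every module (reachability test).
std::vector<Pair> one2all(int numModules) {
    std::vector<Pair> pairs;
    for (int m = 0; m < numModules; ++m)
        pairs.push_back({0, m});
    return pairs;
}

// All2One: every core targets one module (contention test).
std::vector<Pair> all2one(int numProcessors) {
    std::vector<Pair> pairs;
    for (int c = 0; c < numProcessors; ++c)
        pairs.push_back({c, 0});
    return pairs;
}
```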
Virtual Channels
We've mentioned our support for virtual channels. Currently, they are implemented using separate physical buffers. To keep the packets of a large application-level message from arriving at a destination out of order, we do not allow a packet to change virtual channels within a network. Once a virtual channel id is assigned to a packet, it stays that way until the packet reaches its destination. The one exception is when virtual channels are used as priority channels in circuit-switching control (not going to get into that). An electronic network expert might cite this as a severe implementation flaw, and he might be right, as electronic network performance could be enhanced using fancier flow control techniques. We'll be adding support for that kind of thing in the future.
Photonic Devices
Our library of photonic devices comprises all the photonic technologies required to generate, control, and receive an optical signal. This library of devices can be used to create any number of switch fabrics and network topologies. Our efforts in building a framework for describing these devices have required us to strike a balance between physical accuracy and system-level performance simulation. On one hand, we want to model the devices in a physically accurate way without requiring a full FDTD simulation; on the other hand, we want to simulate the devices operating in a network environment to produce meaningful system-level performance results. This has resulted in the development of a Basic Element abstraction for describing all photonic devices.
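One way to picture the Basic Element abstraction is that every device contributes a small set of physical-layer parameters, and a path's figures of merit are accumulated across the elements it traverses. The sketch below shows this for insertion loss only; the struct layout and the idea of summing per-element dB values are our assumptions about the abstraction, not PhoenixSim's documented parameter set.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Minimal Basic Element sketch: each device adds a fixed
// insertion loss (in dB) to any signal passing through it.
struct BasicElement {
    double insertionLossDb;
};

// Total loss along a path is the sum of per-element losses,
// since losses expressed in dB add along a cascade.
double pathLossDb(const std::vector<BasicElement>& path) {
    double total = 0.0;
    for (const auto& e : path)
        total += e.insertionLossDb;
    return total;
}
```

A network-level simulation can then check, for instance, whether the worst-case path loss stays within the receiver's power budget without ever running an FDTD solve.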
Ring-Based Broadband Switches
Ring devices can also be designed to include thermal or electro-optic control mechanisms for active control of the device. Such a device can be referred to as a ring switch, since it is not necessary to use wavelength selectivity to control a signal's path; instead, the signal can be directed by signaling the switch itself. Additionally, since wavelength selectivity is no longer used to control the path, the entire spectrum can be used for the data signal, allowing the use of high-bandwidth wavelength division multiplexed (WDM) signals [23, 13].
DRAM-LRL
The second model is one that we have developed at the Lightwave Research Laboratory, which we call DRAM-LRL; it models a hypothetical memory subsystem design in the context of circuit-switched networks. Full details can be found in [8]. DRAM-LRL was made to fit into PhoenixSim more efficiently by staying event-driven, as opposed to DRAMsim, which models every cycle. DRAM-LRL is further simplified by the fact that it is made to be used in circuit-switched networks. Figure 2.13 shows the components we model in DRAM-LRL: a Circuit-Accessed Memory Module (CAMM), which has a central OCM Transceiver that performs O-E-O conversion and controls the address and data local bus usage according to the stage of the transaction. Only one transaction need be sustained at a time, since circuit paths have exclusive access to the module.
Addressing
So, how are cores in the network addressed? This was something of a problem, since we wanted to support many different kinds of networks. Sure, you could give each one some arbitrary unique integer, but it makes things easier on the routing logic if the IDs make sense.
So we came up with a hierarchical addressing scheme. Furthermore, we've made it so you can define the structure of your addresses however you want. We use the concept of address domains, much like an IP address. In PhoenixSim, the left-most address domain is considered the "top", and the right-most the "bottom". Let's do a simple example.
Electronic Channels
Now that you have some routers, you'll need to hook them up. The ElectronicChannel module models many electronic wires in parallel, forming one logical channel. These wires are optimally repeatered using ORION, based on their length. For now, we don't have a good way of specifying the exact layout of all wires in the network, so the length of any wire is specified with two parameters: spaceLengths and routerLengths. Figure 4.1 illustrates these two parameters.