23-01-2013, 10:49 AM
Buffered Crossbar (CICQ) Switch Architecture
ABSTRACT
The crossbar is the most frequently used switching element topology. It offers simplicity and nonblocking
operation. However, when bufferless, it also requires a centralized scheduler, which must
simultaneously satisfy, in each cell time, all input and all output link constraints. The cost and
complexity of this scheduler increase considerably for short cell times and for large switch sizes;
additionally, these schedulers cannot practically offer WFQ-type QoS. Furthermore, bufferless crossbars
can only be efficiently used with fixed-size cells arriving from mutually-synchronized line cards; when
we need to switch variable-size packets, we must first segment them into fixed-size cells. To compensate
for the inefficiencies of scheduling and of packet segmentation, internal (crossbar) speedup is used;
commercial crossbars often use a speedup factor of 2 to 3. The net effect is to limit the maximum
external line rate to roughly one half to one third of the peak achievable crossbar line rate.
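
The speedup arithmetic above can be illustrated with a minimal sketch. It assumes the simple relation that the external line rate equals the crossbar line rate divided by the speedup factor; the function name and the 10 Gb/s figure are hypothetical, chosen only for illustration and not taken from the paper.

```python
def max_external_line_rate(crossbar_rate_gbps: float, speedup: float) -> float:
    """Return the external line rate that a crossbar link can sustain.

    Assumption: the crossbar must run `speedup` times faster than the
    external line, so the usable external rate is crossbar_rate / speedup.
    """
    if speedup < 1:
        raise ValueError("speedup must be at least 1")
    return crossbar_rate_gbps / speedup


if __name__ == "__main__":
    crossbar_rate = 10.0  # hypothetical peak crossbar line rate, in Gb/s
    for s in (1, 2, 3):
        rate = max_external_line_rate(crossbar_rate, s)
        print(f"speedup {s}: external line rate <= {rate:.2f} Gb/s")
```

With a speedup of 1 the external rate can match the crossbar rate; speedups of 2 and 3 cut it to one half and one third respectively.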
The operation of the crossbar can be dramatically improved by including small buffers at each
crosspoint; CMOS technology has recently reached the point where this is feasible for the buffer sizes
that are needed in order for backpressure flow control to operate efficiently between the crossbar and the
virtual output queues (VOQs) in the ingress line cards. This "buffered crossbar" or "combined input-crosspoint queueing
(CICQ)" architecture has significant advantages over the previous, traditional bufferless configuration:
i. The scheduling task is dramatically simplified; WFQ-type QoS is easily implementable; there are
no scheduler inefficiencies to be compensated for by speedup.
ii. The crossbar can operate directly on variable-size packets, hence there is no need for
segmentation and reassembly circuits; the need for mutually synchronized line cards (at the cell-time
level) is also eliminated.
iii. Internal speedup is not needed, because there is no packet segmentation and no scheduler
inefficiencies; hence, the external line rate can be as high as the crossbar line rate.
iv. The egress path of the switch needs no buffer memory --at least no large