23-02-2013, 01:00 PM
High-Throughput LDPC Decoders
High-Throughput LDPC Decoders.pdf (Size: 3.83 MB / Downloads: 46)
Abstract—
A high-throughput memory-efficient decoder architecture
for low-density parity-check (LDPC) codes is proposed
based on a novel turbo decoding algorithm. The architecture
benefits from various optimizations performed at three levels
of abstraction in system design—namely LDPC code design,
decoding algorithm, and decoder architecture. First, the interconnect
complexity problem of current decoder implementations is
mitigated by designing architecture-aware LDPC codes having
embedded structural regularity features that result in a regular
and scalable message-transport network with reduced control
overhead. Second, the memory overhead problem in current
day decoders is reduced by more than 75% by employing a new
turbo decoding algorithm for LDPC codes that removes the
multiple checkto-bit message update bottleneck of the current
algorithm. A new merged-schedule merge-passing algorithm is
also proposed that reduces the memory overhead of the current
algorithm for low to moderate-throughput decoders. Moreover,
a parallel soft-input–soft-output (SISO) message update
mechanism is proposed that implements the recursions of the
Balh–Cocke–Jelinek–Raviv (BCJR) algorithm in terms of simple
“max-quartet” operations that do not require lookup-tables and
incur negligible loss in performance compared to the ideal case.
Finally, an efficient programmable architecture coupled with a
scalable and dynamic transport network for storing and routing
messages is proposed, and a full-decoder architecture is presented.
Simulations demonstrate that the proposed architecture attains
a throughput of 1.92 Gb/s for a frame length of 2304 bits, and
achieves savings of 89.13% and 69.83% in power consumption
and silicon area over state-of-the-art, with a reduction of 60.5% in
interconnect length.
Index Terms—Low-density parity-check (LDPC) codes, Ramanujan
graphs, soft-input soft-output (SISO) decoder, turbo
decoding algorithm, VLSI decoder architectures.
INTRODUCTION
THE PHENOMENAL success of turbo codes [1] powered
by the concept of iterative decoding via message-passing
has rekindled the interest in low-density parity-check (LDPC)
codes which were first discovered by Gallager in 1961 [2].
Recent breakthroughs to within 0.0045 dB of AWGN-channel
capacity were achieved with the introduction of irregular
LDPC codes in [3], [4] putting LDPC codes on par with turbo
codes. However, efficient hardware implementation techniques
of turbo decoders have given turbo codes a clear advantage
Manuscript received September 10, 2002; revised February 3, 2003. This
work was supported by the National Science Foundation under Grant CCR
Digital Object Identifier
over LDPC codes allowing them to occupy mainstream applications
ranging from wireless applications to fiber-optics
communications. Hence, the quest for efficient LDPC decoder
implementation techniques has become a topic of increasing
interest, gradually promoting LDPC codes as serious competitors
to turbo codes on both fronts.
The design of LDPC decoder architectures differs from the
decoder design for other classes of codes, in particular turbo
codes, in that it is intimately related to the structure of the code
itself through its parity-check matrix. The iterative decoding
process of both codes consists of two main steps [5]: 1) computing
independent messages proportional to the a posteriori
probability distributions of the code bits and 2) communicating
the messages. The complexity incurred in both steps depends
on how messages are communicated with respect to the process
of computing the messages. In turbo codes, the communication
mechanism is defined by a simple pseudorandom message interleaver
that is external to the process of computing the messages,
while the computational complexity is proportional to the
code length and memory. On the other hand, the communication
mechanism for LDPC codes is defined by a pseudorandom
bipartite graph and is internal with respect to message computation
(i.e., an internal interleaver), while the computational
complexity is very low ( order of the logarithm of the code
length). In both cases, the performance depends on the properties
of the interleaver: the scrambling of channel burst errors for
turbo codes, and the extremal properties (such as girth and expansion
coefficient) of the bipartite graph defining the code for
LDPC codes.