05-10-2012, 03:12 PM
Overview of the H.264/AVC Video Coding Standard
Abstract
H.264/AVC is the newest video coding standard of the
ITU-T Video Coding Experts Group and the ISO/IEC Moving
Picture Experts Group. The main goals of the H.264/AVC standardization
effort have been enhanced compression performance
and provision of a “network-friendly” video representation
addressing “conversational” (video telephony) and “nonconversational”
(storage, broadcast, or streaming) applications.
H.264/AVC has achieved a significant improvement in rate-distortion
efficiency relative to existing standards. This article provides
an overview of the technical features of H.264/AVC, describes
profiles and applications for the standard, and outlines the history
of the standardization process.
INTRODUCTION
H.264/AVC is the newest international video coding standard
[1]. By the time of this publication, it is expected to
have been approved by ITU-T as Recommendation H.264 and
by ISO/IEC as International Standard 14496-10 (MPEG-4 Part
10) Advanced Video Coding (AVC).
The MPEG-2 video coding standard (also known as ITU-T
H.262) [2], which was developed about ten years ago primarily
as an extension of prior MPEG-1 video capability with support
of interlaced video coding, was an enabling technology for digital
television systems worldwide. It is widely used for the transmission
of standard definition (SD) and high definition (HD)
TV signals over satellite, cable, and terrestrial emission and the
storage of high-quality SD video signals onto DVDs.
However, an increasing number of services and growing
popularity of high definition TV are creating greater needs
for higher coding efficiency. Moreover, other transmission
media such as Cable Modem, xDSL, or UMTS offer much
lower data rates than broadcast channels, and enhanced coding
efficiency can enable the transmission of more video channels
or higher quality video representations within existing digital
transmission capacities.
NAL Units in Byte-Stream Format Use
Some systems (e.g., H.320 and MPEG-2/H.222.0 systems)
require delivery of the entire or partial NAL unit stream as an ordered
stream of bytes or bits within which the locations of NAL
unit boundaries need to be identifiable from patterns within the
coded data itself.
For use in such systems, the H.264/AVC specification defines
a byte stream format. In the byte stream format, each NAL unit
is prefixed by a specific pattern of three bytes called a start code
prefix. The boundaries of the NAL unit can then be identified by
searching the coded data for the unique start code prefix pattern.
The use of emulation prevention bytes guarantees that start code
prefixes are unique identifiers of the start of a new NAL unit.
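
The start code search described above can be sketched in a few lines of Python. This is a minimal illustration, not a conforming parser. The three-byte prefix value 0x000001 and the emulation prevention byte 0x03 are taken from the H.264/AVC specification rather than from this excerpt, and the function names split_nal_units and remove_emulation_prevention are hypothetical. The sketch also assumes that zero bytes immediately before a start code prefix are padding, which holds because a NAL unit may not end in a zero byte.

```python
from typing import List

START_CODE_PREFIX = b"\x00\x00\x01"  # three-byte start code prefix


def split_nal_units(stream: bytes) -> List[bytes]:
    """Split a byte stream into NAL units at each start code prefix."""
    units = []
    pos = stream.find(START_CODE_PREFIX)
    while pos != -1:
        begin = pos + len(START_CODE_PREFIX)
        nxt = stream.find(START_CODE_PREFIX, begin)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes just before the next prefix are treated as padding
        # (assumption: a NAL unit never ends in 0x00).
        units.append(stream[begin:end].rstrip(b"\x00"))
        pos = nxt
    return units


def remove_emulation_prevention(nal: bytes) -> bytes:
    """Drop each 0x03 byte that follows two consecutive zero bytes."""
    out = bytearray()
    zeros = 0
    for b in nal:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue  # emulation prevention byte, not payload
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


# Two NAL units; the first contains an emulation prevention byte.
stream = b"\x00\x00\x00\x01\x67\x42\x00\x00\x03\x01" + b"\x00\x00\x01\x68\xce"
units = split_nal_units(stream)
print([u.hex() for u in units])                     # ['674200000301', '68ce']
print(remove_emulation_prevention(units[0]).hex())  # '6742000001'
```

Because the encoder inserts 0x03 wherever the payload would otherwise contain the prefix pattern, a plain byte search is enough to find NAL unit boundaries, and the payload is restored only after the split.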
A small amount of additional data (one byte per video picture)
is also added to allow decoders that operate in systems that
provide streams of bits without alignment to byte boundaries to
recover the necessary alignment from the data in the stream.
Additional data can also be inserted in the byte stream format
that allows expansion of the amount of data to be sent and can
aid in achieving more rapid byte alignment recovery, if desired.
VCL
As in all prior ITU-T and ISO/IEC JTC1 video standards
since H.261 [3], the VCL design follows the so-called block-based
hybrid video coding approach (as depicted in Fig. 8), in
which each coded picture is represented in block-shaped units of
associated luma and chroma samples called macroblocks. The
basic source-coding algorithm is a hybrid of inter-picture prediction
to exploit temporal statistical dependencies and transform
coding of the prediction residual to exploit spatial statistical
dependencies. There is no single coding element in the
VCL that provides the majority of the significant improvement
in compression efficiency in relation to prior video coding standards.
It is rather a plurality of smaller improvements that add
up to the significant gain.
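
As a rough illustration of the hybrid idea (predict, then transform-code what the prediction missed), the sketch below forms a 4×4 prediction residual and applies a forward integer transform to it. The matrix CF is the 4×4 core transform of H.264/AVC as defined in the specification; it is not given in this excerpt. The scaling that the standard folds into quantization is omitted, and the helper names are hypothetical.

```python
# 4x4 forward core transform matrix (scaling/quantization omitted).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def residual(current, prediction):
    """Sample-wise difference between the current block and its prediction."""
    return [[c - p for c, p in zip(cr, pr)]
            for cr, pr in zip(current, prediction)]


def forward_core_transform(block):
    """Compute CF * block * CF^T using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


# A block whose prediction is off by a constant 3 in every sample.
current = [[103] * 4 for _ in range(4)]
prediction = [[100] * 4 for _ in range(4)]
coeffs = forward_core_transform(residual(current, prediction))
print(coeffs[0])  # [48, 0, 0, 0]; the remaining rows are all zero
```

A good prediction leaves a small, smooth residual, and the transform concentrates it in very few coefficients, which is where the two halves of the hybrid scheme reinforce each other.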
Intra-Frame Prediction
Each macroblock can be transmitted in one of several coding
types depending on the slice-coding type. In all slice-coding
types, the following types of intra coding are supported, which
are denoted as Intra_4×4 or Intra_16×16 together with chroma
prediction and I_PCM prediction modes.
The Intra_4×4 mode is based on predicting each 4×4 luma
block separately and is well suited for coding of parts of a
picture with significant detail. The Intra_16×16 mode, on the
other hand, performs prediction of the whole 16×16 luma
block and is more suited for coding very smooth areas of a
picture. In addition to these two types of luma prediction, a
separate chroma prediction is conducted. As an alternative to
Intra_4×4 and Intra_16×16, the I_PCM coding type allows the
encoder to simply bypass the prediction and transform coding
processes and instead directly send the values of the encoded
samples. The I_PCM mode serves the following purposes.
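
To make the Intra_4×4 idea described above more concrete, the sketch below predicts one 4×4 luma block from its already decoded neighbors using three of the prediction directions defined in the H.264/AVC specification (vertical, horizontal, and DC). The excerpt itself does not list the modes. Choosing the mode by sum of absolute differences (SAD) is an assumption made for illustration; the standard leaves mode decision to the encoder. All function names are hypothetical.

```python
def predict_vertical(above):
    """Copy the four samples above the block down every row."""
    return [list(above) for _ in range(4)]


def predict_horizontal(left):
    """Copy the sample to the left of each row across that row."""
    return [[left[r]] * 4 for r in range(4)]


def predict_dc(above, left):
    """Fill the block with the rounded mean of the eight neighbors."""
    dc = (sum(above) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p)
               for brow, prow in zip(block, pred)
               for b, p in zip(brow, prow))


def best_mode(block, above, left):
    """Pick the candidate mode with the smallest SAD (illustrative rule)."""
    candidates = {
        "vertical": predict_vertical(above),
        "horizontal": predict_horizontal(left),
        "dc": predict_dc(above, left),
    }
    return min(candidates, key=lambda name: sad(block, candidates[name]))


# A block of vertical stripes is matched exactly by vertical prediction.
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_mode(block, above=[10, 20, 30, 40], left=[10, 10, 10, 10]))
# vertical
```

Predicting each 4×4 block separately lets the direction change every four samples, which is why this mode suits detailed regions, whereas a single prediction for the whole 16×16 block costs fewer mode bits in smooth areas.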
HISTORY AND STANDARDIZATION PROCESS
In this section, we illustrate the history of the standard. The
development of H.264/AVC is characterized by improvements
in small steps over the last 3–4 years as can be seen from Fig. 18.
In Fig. 18, the coding performance is shown for two example
progressive-scan video sequences, when enabling the typical
coding options for the various versions of the standard from
August 1999 until completion in April 2003. The dates and creation
of the various versions are shown in Table I. The document
and the software versions have been called test model long-term
(TML) when being developed in VCEG and joint model (JM)
when the development was continued in the joint video team
(JVT) as a partnership between MPEG and VCEG. The development
took place in small steps between each version of the
design as can be seen from Fig. 18.
CONCLUSIONS
The emerging H.264/AVC video coding standard has
been developed and standardized collaboratively by both the
ITU-T VCEG and ISO/IEC MPEG organizations. H.264/AVC
represents a number of advances in standard video coding
technology, in terms of both coding efficiency enhancement
and flexibility for effective use over a broad variety of network
types and application domains. Its VCL design is based on
conventional block-based motion-compensated hybrid video
coding concepts, but with some important differences relative
to prior standards.