07-12-2012, 01:15 PM
Low-Complexity Tracking-Aware H.264 Video Compression for Transportation Surveillance
Abstract
In centralized transportation surveillance systems,
video is captured and compressed at remote nodes with low
processing power and transmitted to a central location for processing. Such
compression can reduce the accuracy of centrally run automated
object tracking algorithms. In typical systems, the majority of
communications bandwidth is spent on encoding temporal pixel
variations such as acquisition noise or local changes to lighting.
We propose a tracking-aware, H.264-compliant compression
algorithm that removes temporal components of low tracking
interest and optimizes the quantization of frequency coefficients,
particularly those that most influence trackers, significantly reducing
bitrate while maintaining comparable tracking accuracy.
We utilize tracking accuracy as our compression criterion in lieu
of mean squared error metrics. Our proposed system is designed
with low processing power and memory requirements in mind,
and as such can be deployed on remote nodes. Using H.264/AVC
video coding and a commonly used state-of-the-art tracker, we
show that our algorithm allows for over 90% bitrate savings
while maintaining comparable tracking accuracy.
Introduction
Video imaging sensors are commonly used in transportation
monitoring and surveillance. Such sensors are
a cost-effective solution that yields information on a large
field of view, allowing for real-time monitoring of video
feeds and video archiving for forensic, surveillance, and traffic
analysis applications. Other vehicular monitoring solutions,
such as embedded inductor cables or radars, can only identify
and count vehicles and measure instantaneous speed without
providing any further information. Video imaging is the only
existing modality that observes a vehicle’s complete trajectory,
opening the door to a completely different set of applications
[1]. Possible applications include the remote surveillance of
transportation hubs, automatic detection of anomalies, and
study of transportation phenomena such as driver behavior and
its possible effects on safety and congestion [2]–[4].
Proposed System
We propose a system using application-specific video compression
to minimize the bandwidth requirement for links
connecting central and remote nodes. This is done by minimizing
bits spent coding components of low tracking interest,
specifically: 1) temporal pixel variations, such as local changes
to illumination, which are not useful to trackers in general,
and 2) frequency components that are less valuable to the
specific tracker being used. Item 1) is achieved in real time
by estimating pixel-level statistics in the input video and filtering
out small pixel-level fluctuations (details presented in Section
IV), while 2) is achieved by optimizing a quantization table specific
to the automated tracker used (details presented in Section
V). This allows us to implicitly affect rate-distortion-level
decisions without requiring the presence of a tracker at the
encoder.
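As a rough illustration of item 1), the sketch below keeps running per-pixel statistics and holds a pixel at its previously emitted value whenever the new sample stays within a few standard deviations of it, so the encoder sees no temporal change there. This is not the method of Section IV, which is not reproduced in this excerpt. The class name `PixelFluctuationFilter`, the exponential running statistics, and the parameters `alpha`, `k`, and `min_std` are all assumptions made for the example.

```python
from math import sqrt


class PixelFluctuationFilter:
    """Illustrative temporal filter: hold pixels whose change is small.

    Frames are lists of rows of grayscale values. All parameters are
    assumptions for this sketch, not values taken from the paper.
    """

    def __init__(self, first_frame, alpha=0.05, k=2.0, min_std=1.0):
        self.alpha = alpha      # assumed update rate of the running statistics
        self.k = k              # assumed threshold, in standard deviations
        self.min_std = min_std  # assumed floor so static pixels tolerate noise
        self.mean = [[float(v) for v in row] for row in first_frame]
        self.var = [[0.0 for _ in row] for row in first_frame]
        # Last value emitted to the encoder for each pixel.
        self.held = [[float(v) for v in row] for row in first_frame]

    def apply(self, frame):
        """Return the filtered frame and update the per-pixel statistics."""
        out = []
        for y, row in enumerate(frame):
            out_row = []
            for x, v in enumerate(row):
                std = max(sqrt(self.var[y][x]), self.min_std)
                if abs(v - self.held[y][x]) > self.k * std:
                    # Large change: pass the new value through.
                    self.held[y][x] = float(v)
                # Otherwise the previously emitted value is repeated.
                out_row.append(self.held[y][x])
                # Exponentially weighted running mean and variance.
                d = v - self.mean[y][x]
                self.mean[y][x] += self.alpha * d
                self.var[y][x] = (1 - self.alpha) * (
                    self.var[y][x] + self.alpha * d * d
                )
            out.append(out_row)
        return out


filt = PixelFluctuationFilter([[100, 100]])
filtered = filt.apply([[101, 180]])  # first pixel is held, second passes through
```

Holding the last emitted value, rather than smoothing toward the mean, is one simple way to make small fluctuations cost almost nothing after motion-compensated prediction. Other filters would serve the same purpose.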
Measuring Automated Tracking Efficiency
The field of video object tracking is quite active, with various
solutions offering strength/weakness combinations suitable
for different applications. For urban transportation video
tracking, most applications involve a background subtraction
component for target acquisition such as the one developed in
[20], and an inter-frame object association component such
as those developed in [21] and [22]. Most such tracking
algorithms account only for the native statistics of video
objects, and as a result distortion of these statistics by sources
such as compression may severely degrade their accuracy.
To optimize tracking quality, a metric that measures
tracking accuracy is required. A review of the state of the art
in video surveillance performance metrics is presented in [23].
While more complex metrics such as the ones presented in
[24] may be used, we choose the overlap, precision, and sensitivity
metrics presented in [23] because of their pertinence to
transportation surveillance, with the ground truth defined as
tracking results generated using uncompressed video.
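The sketch below shows one common way such metrics can be computed for a single frame. It assumes that overlap is the intersection-over-union of two bounding boxes and that a tracked box counts as a true positive when it overlaps a reference box by at least `min_overlap`. The exact definitions in [23] may differ. The box format `(x1, y1, x2, y2)`, the greedy matching, and the 0.5 threshold are assumptions for the example.

```python
def overlap(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def precision_sensitivity(reference, tracked, min_overlap=0.5):
    """Greedily match tracked boxes to reference boxes for one frame.

    `reference` holds the boxes produced by the tracker on uncompressed
    video (the ground truth used in the text); `tracked` holds the boxes
    produced on compressed video.
    """
    unmatched = list(reference)
    tp = 0
    for box in tracked:
        best = max(unmatched, key=lambda r: overlap(r, box), default=None)
        if best is not None and overlap(best, box) >= min_overlap:
            unmatched.remove(best)
            tp += 1
    fp = len(tracked) - tp   # tracked boxes with no reference match
    fn = len(unmatched)      # reference boxes that were missed
    # Convention assumed here: an empty denominator yields a score of 1.0.
    precision = tp / (tp + fp) if tracked else 1.0
    sensitivity = tp / (tp + fn) if reference else 1.0
    return precision, sensitivity


reference = [(0, 0, 10, 10), (20, 20, 30, 30)]
tracked = [(1, 0, 11, 10), (50, 50, 60, 60)]
scores = precision_sensitivity(reference, tracked)  # one match, one miss, one false alarm
```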
Iterative Quantization Table Search
In this section, we propose an iterative greedy search
algorithm that automatically identifies the frequencies useful
to tracking and concentrates bit allocation on them. During each
iteration, the encoder quantization scheme of each individual
frequency is modified, and tracking accuracy is measured for
a sample clip of the video. From these results, only the
frequencies that provide the best tradeoff between bitrate and
tracking accuracy are kept, and subsequent iterations proceed
cumulatively. This algorithm aims to make encoder quantization
decisions based on tracking accuracy as opposed to the
traditional rate-distortion method.
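A minimal sketch of such a greedy loop follows. It is one interpretation of the description above, not the paper's algorithm. The quantization table is assumed to be a flat list of per-frequency step sizes, and `evaluate` is a hypothetical callback standing in for "encode the sample clip, run the tracker, and measure accuracy". The selection rule (bits saved per unit of accuracy lost) and the parameters `step`, `keep`, `min_accuracy`, `max_step`, and `max_iters` are all assumptions. `toy_evaluate` is a made-up model used only so the example runs.

```python
def greedy_table_search(table, evaluate, step=2, keep=2, min_accuracy=0.9,
                        max_step=255, max_iters=10):
    """Coarsen the frequencies that save the most bits per accuracy lost."""
    table = list(table)
    base_bits, base_acc = evaluate(table)
    for _ in range(max_iters):
        candidates = []
        for i in range(len(table)):
            if table[i] * step > max_step:
                continue
            trial = list(table)
            trial[i] *= step  # modify one frequency at a time
            bits, acc = evaluate(trial)
            saved = base_bits - bits
            if saved <= 0 or acc < min_accuracy:
                continue
            lost = max(base_acc - acc, 1e-6)  # avoid division by zero
            candidates.append((saved / lost, i))
        if not candidates:
            break
        candidates.sort(reverse=True)
        trial = list(table)
        for _, i in candidates[:keep]:  # keep only the best tradeoffs
            trial[i] *= step
        bits, acc = evaluate(trial)
        if acc < min_accuracy:
            # Combined change is too costly: fall back to the single best.
            trial = list(table)
            trial[candidates[0][1]] *= step
            bits, acc = evaluate(trial)
        # Later iterations build on the accepted table (cumulative search).
        table, base_bits, base_acc = trial, bits, acc
    return table, base_bits, base_acc


def toy_evaluate(table):
    """Made-up model: frequencies 0 and 1 matter to the tracker, 2 and 3 do not."""
    cost = [400, 300, 200, 100]
    harm = [0.02, 0.01, 0.0, 0.0]
    bits = sum(c / q for c, q in zip(cost, table))
    acc = 1.0 - sum(h * (q - 1) for h, q in zip(harm, table))
    return bits, acc


best_table, best_bits, best_acc = greedy_table_search(
    [1, 1, 1, 1], toy_evaluate, min_accuracy=0.96, max_step=16
)
```

In this toy run the search coarsens the two frequencies the tracker ignores as far as `max_step` allows before touching the two it depends on, which mirrors the intent of concentrating bits on tracking-relevant frequencies.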
Conclusion
In this paper, we proposed a combined video processing
and iterative quantization table search algorithm that
removes elements of low tracking interest as part of the
video compression system. We proposed three alternatives
for system initialization, each appropriate for systems with
different requirements. Using H.264/AVC video coding and a
commonly used tracker, we showed that while maintaining
comparable tracking accuracy our system allows for over
90% bitrate savings on the video link from remote nodes in
centralized transportation surveillance systems. While in this
paper we focused on transportation surveillance applications,
the algorithms presented herein can readily be extended to
other surveillance scenarios (such as maritime, rail, and commuter
terminals), and to generalized applications wherever similar requirements
(limited resources) and assumptions (static camera,
motion of large objects, tracking awareness) are in place.