
Smart Cameras in Embedded Systems



ABSTRACT

A smart camera performs real-time analysis to recognize scene elements. Smart cameras are useful in a variety of scenarios, such as surveillance and medicine. We have built a real-time system for recognizing gestures. Our smart camera uses novel algorithms to recognize gestures based on low-level analysis of body parts as well as hidden Markov models for the moves that comprise the gestures. These algorithms run on a TriMedia processor. Our system can recognize gestures at a rate of 20 frames per second, and it can also fuse the results of multiple cameras.

Overview

Recent technological advances are enabling a new generation of smart cameras that
represent a quantum leap in sophistication. While today's digital cameras capture images,
smart cameras capture high-level descriptions of the scene and analyze what they see.
These devices could support a wide variety of applications including human and animal
detection, surveillance, motion analysis, and facial identification.

Video processing has an insatiable demand for real-time performance. Fortunately, Moore's law provides an increasing pool of available computing power to apply to real-time analysis. Smart cameras leverage very large-scale integration (VLSI) to provide such analysis in a low-cost, low-power system with substantial memory. Moving well beyond pixel processing and compression, these systems run a wide range of algorithms to extract meaning from streaming video.

Because they push the design space in so many dimensions, smart cameras are a leading-edge application for embedded system research.

Detection and Recognition Algorithms

Although there are many approaches to real-time video analysis, we chose to focus
initially on human gesture recognition—identifying whether a subject is walking,
standing, waving his arms, and so on. Because much work remains to be done on this
problem, we sought to design an embedded system that can incorporate future algorithms
as well as use those we created exclusively for this application.

Our algorithms use both low-level and high-level processing. The low-level component identifies different body parts and categorizes their movement in simple terms. The high-level component, which is application-dependent, uses this information to recognize each body part's action and the person's overall activity based on scenario parameters.
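
To make this split concrete, the following C++ sketch shows one plausible shape for the two stages; all names and types here are illustrative, not the authors' actual code.

    #include <vector>

    // Output of the low-level stage: a simple motion description for one
    // tracked body part (hypothetical representation).
    struct BodyPartMotion {
        int partId;     // e.g., head, torso, left arm
        float dx, dy;   // frame-to-frame displacement
    };

    // Low-level stage: segment the frame and describe each part's motion.
    std::vector<BodyPartMotion> lowLevelAnalyze(const unsigned char* frame,
                                                int width, int height) {
        // ... body-part segmentation and motion estimation go here ...
        return {};
    }

    // High-level, application-dependent stage: map a window of per-frame
    // motion descriptions to an activity label using scenario parameters.
    int highLevelRecognize(const std::vector<std::vector<BodyPartMotion>>& window) {
        // ... HMM-based gesture evaluation (see the sketches below) ...
        return -1;  // unknown activity
    }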

High-level processing

The high-level processing component, which can be adapted to different applications, compares the motion pattern of each body part (described as a spatiotemporal sequence of feature vectors) in a set of frames to the patterns of known postures and gestures. It then uses several hidden Markov models in parallel to evaluate the body's overall activity. We use discrete HMMs that generate eight directional code words, which capture the up, down, left, right, and circular movements of each body part.
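
The text does not give the quantization rule behind those code words; a minimal sketch, assuming each body part's frame-to-frame displacement is binned by angle into eight sectors, might look like this (the jitter threshold is an assumed parameter):

    #include <cmath>

    // Hypothetical quantizer: map a displacement (dx, dy) to one of eight
    // directional code words (0..7) used as HMM observation symbols.
    int directionCodeWord(double dx, double dy, double minMag = 1.0) {
        const double kPi = 3.14159265358979323846;
        if (std::sqrt(dx * dx + dy * dy) < minMag)
            return -1;                        // no significant motion
        double angle = std::atan2(dy, dx);    // range [-pi, pi]
        // Offset by half a sector (pi/8) so each 45-degree bin is centered
        // on a compass direction: 0 = right, 2 = up, 4 = left, 6 = down
        // (with y growing upward; image coordinates may flip up/down).
        int sector = static_cast<int>(std::floor((angle + kPi / 8) / (kPi / 4)));
        return ((sector % 8) + 8) % 8;
    }

Circular movement would presumably be detected from the sequence of code words rather than from any single displacement.
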
Human actions often involve a complex series of movements. We therefore combine each
body part's motion pattern with the one immediately following it to generate a new
pattern. Using dynamic programming, we calculate the probabilities for the original and
combined patterns to identify what the person is doing. Gaps between gestures help
indicate the beginning and end of discrete actions.
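
The text does not spell out the dynamic-programming recurrence; for discrete HMMs, the standard choice is the forward algorithm, sketched below with assumed parameter matrices (A for transitions, B for emissions, pi for initial state probabilities):

    #include <cstddef>
    #include <vector>

    // Forward algorithm: probability that a discrete HMM produced the
    // observed code-word sequence. One such model would run per candidate
    // gesture; the matrix contents are placeholders, not trained values.
    double hmmLikelihood(const std::vector<int>& obs,
                         const std::vector<std::vector<double>>& A,
                         const std::vector<std::vector<double>>& B,
                         const std::vector<double>& pi) {
        const std::size_t N = pi.size();
        std::vector<double> alpha(N);
        for (std::size_t i = 0; i < N; ++i)         // initialization
            alpha[i] = pi[i] * B[i][obs[0]];
        for (std::size_t t = 1; t < obs.size(); ++t) {
            std::vector<double> next(N, 0.0);       // induction step
            for (std::size_t j = 0; j < N; ++j) {
                for (std::size_t i = 0; i < N; ++i)
                    next[j] += alpha[i] * A[i][j];
                next[j] *= B[j][obs[t]];
            }
            alpha.swap(next);
        }
        double p = 0.0;                             // termination
        for (double a : alpha) p += a;
        return p;
    }

Scoring the original and combined patterns then amounts to calling this once per pattern and comparing the resulting likelihoods.
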
A quadratic Mahalanobis distance classifier combines HMM output with different
weights to generate reference models for various gestures. For example, a pointing
gesture could be recognized as a command to "go to the next slide" in a smart meeting
room or "open the window" in a smart car, whereas a smart security camera might
interpret the gesture as suspicious or threatening.
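
For reference, the quadratic Mahalanobis distance such a classifier evaluates is d^2 = (x - mu)^T Sigma^-1 (x - mu); the per-gesture weighting and reference-model training are not detailed here, so the sketch below shows only the distance computation itself.

    #include <cstddef>
    #include <vector>

    // Squared Mahalanobis distance between an HMM-output feature vector x
    // and one gesture's reference model (mean mu, inverse covariance invCov).
    // The model parameters are placeholders to be trained per gesture.
    double mahalanobisSq(const std::vector<double>& x,
                         const std::vector<double>& mu,
                         const std::vector<std::vector<double>>& invCov) {
        const std::size_t n = x.size();
        std::vector<double> d(n);
        for (std::size_t i = 0; i < n; ++i)
            d[i] = x[i] - mu[i];
        double dist = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                dist += d[i] * invCov[i][j] * d[j];
        return dist;
    }

The classifier would then assign the gesture whose reference model minimizes this distance.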

Components

Our development strategy calls for leveraging off-the-shelf components to process video from a standard source in real time, to debug algorithms and programs, and to connect multiple smart cameras in a networked system. We use the 100-MHz Philips TriMedia TM-1300 as our video processor. This 32-bit fixed- and floating-point processor features a dedicated image coprocessor, a variable-length decoder, an optimizing C/C++ compiler, integrated peripherals for VLIW concurrent real-time input/output, and a rich set of application library functions, including MPEG, motion JPEG, and 2D text and graphics.

Testbed Architecture

Our testbed architecture, shown in Figure 3, uses two TriMedia boards attached to a host PC for programming support. Each PCI-bus board is connected to a Hi8 camera that provides NTSC composite video. Several boards can be plugged into a single computer for simultaneous video operations. The shared-memory interface offers higher performance than the networks likely to be used in VLSI cameras, but it lets us functionally implement and debug multiple-camera systems with real video data.
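
The fusion rule for combining multiple cameras is not specified in this excerpt; one simple assumption is that each camera reports a per-gesture likelihood and the fused decision multiplies them, treating the views as independent:

    #include <cstddef>
    #include <vector>

    // Assumed fusion rule: multiply per-camera likelihoods for each gesture
    // and pick the best-scoring gesture index.
    int fuseGesture(const std::vector<std::vector<double>>& perCamera) {
        const std::size_t numGestures = perCamera[0].size();
        int best = -1;
        double bestScore = -1.0;
        for (std::size_t g = 0; g < numGestures; ++g) {
            double score = 1.0;
            for (const auto& cam : perCamera)
                score *= cam[g];
            if (score > bestScore) {
                bestScore = score;
                best = static_cast<int>(g);
            }
        }
        return best;
    }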