SEMINAR REPORT ON ANIMATRONICS SHADER LAMP AVATARS


ABSTRACT

This is a new approach for robotic avatars of real people: the use of cameras
and projectors to capture and map the dynamic motion and appearance of a real
person onto a humanoid animatronic model. These devices are called animatronic
Shader Lamps Avatars (SLAs). The prototype system comprises a camera, a tracking
system, a digital projector, and a life-sized styrofoam head mounted on a pan-tilt
unit. The system captures imagery of a moving, talking user and maps the
appearance and motion onto the animatronic SLA, delivering a dynamic, real-time
representation of the user to multiple viewers. Applications such as telepresence
and training involve the display of real or synthetic humans to multiple viewers.
When attempting to render the humans with conventional displays, non-verbal cues
such as head pose, gaze direction, body posture, and facial expression are difficult
to convey correctly to all viewers. In addition, a framed image of a human conveys
only a limited physical sense of presence, primarily through the display's location.
While progress continues on articulated robots that mimic humans, the focus has
been on the motion and behavior of the robots rather than on their appearance.

Introduction

There are numerous approaches to visually simulating the presence of a remote
person. The most common is to use 2D video imagery; however, such imagery lacks
a number of spatial and perceptual cues. Even with 3D captured or rendered imagery
and 3D or view-dependent displays, it is difficult to convey information such as body
posture and gaze direction to multiple viewers. Such information can indicate the
intended recipient of a statement, convey interest or attention, and direct facial
expressions and other non-verbal communication. To convey that information to
specific individuals, each participant must see the remote person from his or her
own viewpoint. Providing distinct, view-dependent imagery of a person to multiple
observers poses several challenges. One approach is to provide distinct tracked
and multiplexed views to each observer, such that the remote person appears in
one common location. However, approaches involving head-worn displays or stereo
glasses are usually unacceptable, given the importance of eye contact between all
(local and remote) participants.

Animatronics

Animatronics is the use of mechatronics to create machines which seem animate
rather than robotic. Animatronic creations include animals (including dinosaurs),
plants and even mythical creatures. A robot designed to be a convincing imitation
of a human is specifically known as an android.

Animatronics shader lamp avatars

The term "telepresence" describes technologies that enable activities as diverse as
remote manipulation, communication, and collaboration. Today it is a moniker
embraced by companies building commercial video teleconferencing systems and
by researchers exploring immersive collaboration between one or more participants
at multiple sites. In a collaborative telepresence system, each user needs some
way to perceive remote sites, and in turn be perceived by participants at those
sites. In this paper we focus on the latter challenge: how a user is seen by remote
participants, as opposed to how he or she sees the remote participants.

System Components

The components of our proof-of-concept system, as shown in Figure 2, are grouped
at two sites: the capture site and the display site. The capture site is where images
and motions of a human subject are captured. In addition to a designated place
for the human subject, it includes a camera and a tracker, with a tracker target
(a headband) placed onto the human's head as shown in Figure 3 (a). We currently
use a single 640x480 1/3" CCD color camera running at 15 FPS for capturing
imagery. The focus, depth of field, and field of view of the camera have been optimized
so that the subject can move comfortably while seated in a fixed-position chair. The NDI
Optotrak system is currently being used for tracking. Future systems may choose
to employ computer-vision-based tracking, obviating the need for a separate tracker
and allowing human motion to be captured without cumbersome tracker targets.
The display site includes a projector, an avatar, and a tracker with a tracker target
(a probe) mounted onto the avatar as shown in Figure 3 (b). The avatar consists
of an animatronic head made of styrofoam that serves as the projection surface,
mounted on a pan-tilt unit that allows moving the head to mimic the movements of
the human at the capture site. The 1024x768 60Hz DLP projector is mounted a few
feet away from the avatar and is configured to project only onto the maximal range
of motion of the mounted avatar; the projector's focus and depth of field are sufficient
to cover the illuminated half of the avatar. Instead of a tracker, future systems may
choose to use the position-reporting features of more sophisticated pan-tilt units to determine
the pose of the styrofoam head.

Camera and Projector Calibration

To calibrate the intrinsic and extrinsic parameters of the camera at the capture site,
we use a custom application built on top of the OpenCV library. We capture multiple
images of a physical checkerboard pattern placed at various positions and orienta-
tions inside the camera's field of view, and save them to disk. We automatically
detect the 2D coordinates of the corners in each image using the OpenCV cvFind-
ChessboardCorners function. Using the ordered lists of checkerboard corners for
each image, we compute the intrinsic parameters via the OpenCV cvCalibrateCam-
era2 function. We then compute the extrinsic parameters in the tracker coordinate
frame as follows. We first place the pattern in a single fixed position, capture an
image of it and detect the 2D corners in the image as before. Next we use the
tracker probe to capture the 3D locations corresponding to the pattern corners in
the tracker's coordinate frame. Finally, we call the cvFindExtrinsicCameraParams2
OpenCV function using the captured 3D points, the corresponding 2D corner lo-
cations, and the previously computed intrinsic matrix; this produces the camera's
extrinsic matrix in the coordinate frame of the capture-side's tracker. In the case of
this system, these techniques are capable of producing a reprojection error on the
order of a pixel or less.
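
To make this procedure concrete, the sketch below reproduces the two calibration
stages using the modern OpenCV Python API, where cv2.calibrateCamera and
cv2.solvePnP take the place of the legacy cvCalibrateCamera2 and
cvFindExtrinsicCameraParams2 calls named above. The board dimensions, square
size, and file names are illustrative assumptions, not values from the report.

    # Sketch of the two-stage calibration: intrinsics from many checkerboard
    # views, then extrinsics in the tracker frame from one fixed board whose
    # corners were also probed with the tracker.
    import glob
    import cv2
    import numpy as np

    BOARD = (9, 6)       # inner corners per row/column (assumed)
    SQUARE = 0.025       # square edge length in metres (assumed)

    # Planar board model: one 3D point per inner corner, z = 0.
    board_pts = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
    board_pts[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

    # Intrinsics: detect corners in every saved checkerboard image.
    obj_pts, img_pts, size = [], [], None
    for path in glob.glob("calib/*.png"):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        ok, corners = cv2.findChessboardCorners(gray, BOARD)
        if ok:
            obj_pts.append(board_pts)
            img_pts.append(corners)
            size = gray.shape[::-1]

    rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)

    # Extrinsics in the tracker frame:
    # corners_2d - detected corners of the single fixed board image (N x 2),
    # tracker_3d - the same corners probed with the tracker (N x 3, tracker frame).
    corners_2d = np.load("fixed_board_corners.npy").astype(np.float32)
    tracker_3d = np.load("probed_corners_tracker.npy").astype(np.float32)

    ok, rvec, tvec = cv2.solvePnP(tracker_3d, corners_2d, K, dist)
    R, _ = cv2.Rodrigues(rvec)
    extrinsic = np.eye(4)                      # maps tracker frame -> camera frame
    extrinsic[:3, :3], extrinsic[:3, 3] = R, tvec.ravel()
    print("intrinsic reprojection RMS (pixels):", rms)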

Head Model Construction

We built our 3D head models (human and animatronic) using FaceWorx, an ap-
plication that takes in two images of a person's head (front and side view), allows
manual identification of distinctive features such as eyes, nose and mouth, and pro-
duces a textured 3D model. The process consists of importing a front and a side
picture of the head to be modeled and adjusting the position of a number of given
control points overlaid on top of each image; see Figure 4 (a,e). The program pro-
vides real-time feedback by showing the resulting 3D model as shown in Figure 4
(b,f). A key property of all FaceWorx models is that they have the same topology,
only the vertex positions differ. This allows a straightforward mapping from one
head model to another. In particular, we can render the texture of a model onto
the shape of another. In Figure 4, the projection-ready model (i) is obtained using
the shape from the avatar head (h) and the texture from the human head (c).
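
Because every FaceWorx model shares the same vertex and face ordering, building
the projection-ready model is essentially a matter of pairing one model's
geometry with another model's texture. The sketch below illustrates this
shared-topology idea with a hypothetical Mesh container; FaceWorx's actual
export format and loader are not assumed.

    # Combine the shape of one head model with the appearance of another,
    # relying on the identical topology of all FaceWorx models.
    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Mesh:
        vertices: np.ndarray   # (V, 3) vertex positions
        uvs: np.ndarray        # (V, 2) texture coordinates
        faces: np.ndarray      # (F, 3) vertex indices
        texture: np.ndarray    # (H, W, 3) texture image

    def combine_shape_and_texture(shape_mesh: Mesh, texture_mesh: Mesh) -> Mesh:
        """Projection-ready model: geometry from one head, appearance from another."""
        assert shape_mesh.faces.shape == texture_mesh.faces.shape, \
            "FaceWorx models are expected to share the same topology"
        return Mesh(vertices=shape_mesh.vertices,     # avatar (styrofoam) shape
                    uvs=texture_mesh.uvs,             # human texture layout
                    faces=shape_mesh.faces,
                    texture=texture_mesh.texture)     # human appearance

    # e.g. projection_model = combine_shape_and_texture(avatar_head, human_head)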

Head Model Calibration

Capturing the human head model and rendering the animatronic head model "on top of"
the styrofoam projection surface requires finding their poses in the coordinate frames
of the trackers at each site. Both the human's and avatar's heads are assumed to
have static shape, which simplifies the calibration process. The first step in this
calibration is to find the relative pose of each head model with respect to a reference
coordinate frame which corresponds to a physical tracker target rigidly attached to
each head being modeled. We use a tracker probe to capture a number of 3D points
corresponding to salient face features on each head, and compute the offsets between
each captured 3D point and the 3D position of the reference coordinate frame. Next,
we use a custom GUI to manually associate each computed offset with a corresponding
3D vertex in the FaceWorx model. We then run an optimization process to compute
the 4x4 homogeneous transformation matrix that best characterizes (in terms of min-
imum error) the mapping between the 3D point offsets and the corresponding 3D
vertices in the FaceWorx model.
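
The report does not spell out the optimization it runs; a standard closed-form
choice for this kind of point-correspondence problem is a least-squares
similarity fit (Kabsch/Umeyama). The sketch below is one assumed way to compute
such a 4x4 matrix from the probed point offsets and the associated FaceWorx
vertices.

    # Least-squares fit of dst ~ s * R @ src + t, returned as a 4x4 matrix.
    import numpy as np

    def fit_homogeneous_transform(src: np.ndarray, dst: np.ndarray,
                                  with_scale: bool = True) -> np.ndarray:
        """src: (N, 3) probed point offsets in the tracker-target frame.
        dst: (N, 3) corresponding FaceWorx model vertices."""
        src_c = src - src.mean(axis=0)
        dst_c = dst - dst.mean(axis=0)

        # SVD of the cross-covariance gives the best-fit rotation (Kabsch).
        U, S, Vt = np.linalg.svd(dst_c.T @ src_c)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
        R = U @ D @ Vt

        # Optional uniform scale (Umeyama), useful if the model units differ
        # from the physical head.
        s = (S * np.diag(D)).sum() / (src_c ** 2).sum() if with_scale else 1.0
        t = dst.mean(axis=0) - s * R @ src.mean(axis=0)

        T = np.eye(4)
        T[:3, :3] = s * R
        T[:3, 3] = t
        return T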

Animatronic Control

Given a pose for a human head tracked in real time and a reference pose captured
as described in Section 3.3, it is possible to compute a relative orientation. This
orientation constitutes the basis for the animatronic control signals for the avatar.
The pose gathered from the tracker is a 4x4 homogeneous matrix consisting of a rotation
and a translation from the tracker's origin. We use the rotation component of the
matrix to compute the roll, pitch, and yaw of the human head. The relative pitch
and yaw of the tracked human are mapped to the pan and tilt capabilities of the
pan-tilt unit and transformed into commands issued to the pan-tilt unit. Using this
process, the avatar emulates the motions of its human "master".
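
As a rough illustration of this control step, the sketch below extracts yaw and
pitch from the relative 4x4 pose and forwards them as pan and tilt commands. The
Euler-angle convention and the pan_tilt_unit.move_to interface are assumptions;
the report does not specify the actual pan-tilt command protocol.

    # Map the tracked head orientation (relative to a reference pose) onto
    # pan/tilt commands for the avatar.
    import numpy as np

    def yaw_pitch_roll(T: np.ndarray) -> tuple[float, float, float]:
        """Z-Y-X (yaw, pitch, roll) Euler angles, in degrees, from a 4x4 pose."""
        R = T[:3, :3]
        yaw   = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
        pitch = np.degrees(np.arcsin(-R[2, 0]))
        roll  = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
        return yaw, pitch, roll

    def update_avatar(pan_tilt_unit, T_reference: np.ndarray, T_current: np.ndarray):
        """Command the pan-tilt unit with the head rotation relative to the reference."""
        T_rel = np.linalg.inv(T_reference) @ T_current       # relative head pose
        yaw, pitch, _ = yaw_pitch_roll(T_rel)                # roll is not actuated
        pan_tilt_unit.move_to(pan_deg=yaw, tilt_deg=pitch)   # hypothetical interface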

Conclusion

We have introduced animatronic Shader Lamps Avatars (SLAs), described a proof-
of-concept prototype system, and presented preliminary results. We are currently
exploring passive vision-based methods for tracking the real person's head, so that
the separate tracking system can be eliminated. We also hope to add additional
cameras and projectors. Both will involve the dynamic blending of imagery: as the
real person moves, textures from multiple cameras will have to be dynamically
blended and mapped onto the graphical model, and as the physical avatar moves,
the projector imagery will have to be dynamically blended (in intensity and perhaps
also color) as it is projected. We are also considering methods for internal projection.
In terms of the robotics, we are exploring possibilities for more sophisticated
animation, and more rigorous motion retargeting methods to address the limitations
of the animatronic components (range and speed of motion, degrees of freedom)
while still attempting human-like performance. We are also exploring the design of
the shape of the avatar's head, in terms of the acceptability of the generic head
when compared with a copy of the user's head or some principled average head.