CHAPTER 1
INTRODUCTION
Image processing is the processing of images using mathematical operations, applying any form of signal processing for which the input is an image, a series of images, or a video, such as a photograph or video frame. The output of image processing may be either an image or a set of characteristics or parameters related to the image. Closely related to image processing are computer graphics and computer vision.
1.1 COMPUTER VISION
Computer vision is a branch of artificial intelligence and image processing concerned with computer processing of images from the real world. Computer vision is a field that includes methods for acquiring, processing, analyzing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information. As a scientific discipline, computer vision is concerned with the theory and technology for building artificial systems that obtain information from images or multi-dimensional data. The goal of computer vision is to make computers efficiently perceive, process, and understand visual data such as images and videos. Some typical computer vision tasks are recognition, motion analysis, scene reconstruction, and image restoration. The typical functions found in many computer vision systems are described as follows:
1.1.1 Image Acquisition
A digital image is produced by one or several image sensors which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultrasonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence. The pixel values typically correspond to light intensity in one or several spectral bands (gray images or color images), but can also be related to various physical measures, such as depth.
1.1.2 Pre-Processing
Before a computer vision method can be applied to image data in order to extract some specific piece of information, it is usually necessary to process the data to ensure that it satisfies certain assumptions implied by the method. Examples are:
i. Re-sampling to ensure that the image coordinate system is correct.
ii. Noise reduction to ensure that sensor noise does not introduce false information.
iii. Contrast enhancement to ensure that relevant information can be detected.
iv. Scale-space representation to enhance image structures at locally appropriate scales.
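As an illustration of the first two of these steps, the following is a minimal sketch in Java using only the standard library. The class name PreProcess, the int[][] grayscale representation, and the 3x3 mean filter are illustrative choices, not part of this project's actual implementation.

import java.awt.image.BufferedImage;

/** Illustrative pre-processing helpers: grayscale conversion and 3x3 mean-filter noise reduction. */
public class PreProcess {

    /** Convert an RGB frame to an 8-bit grayscale intensity map (standard luma weighting). */
    public static int[][] toGray(BufferedImage frame) {
        int w = frame.getWidth(), h = frame.getHeight();
        int[][] gray = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = frame.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                gray[y][x] = (int) (0.299 * r + 0.587 * g + 0.114 * b);
            }
        }
        return gray;
    }

    /** Simple noise reduction: replace each interior pixel by the mean of its 3x3 neighbourhood. */
    public static int[][] meanFilter(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                int sum = 0;
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                        sum += gray[y + dy][x + dx];
                out[y][x] = sum / 9;
            }
        }
        return out;
    }
}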
1.1.3 Feature Extraction
Image features at various levels of complexity are extracted from the image data. Typical examples of such features are the following (a small edge-detection sketch follows the list):
i. Lines, edges and ridges.
ii. Localized interest points such as corners, blobs or points.
iii. More complex features may be related to texture, shape or motion.
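As an example of low-level feature extraction, below is a minimal Sobel edge-detector sketch in Java. The class name EdgeFeatures and the int[][] grayscale representation are assumptions carried over from the earlier sketch, not this project's actual code.

/** Illustrative Sobel edge extraction on an 8-bit grayscale image (int[height][width]). */
public class EdgeFeatures {

    /** Returns the gradient magnitude at each pixel; large values mark edges and ridges. */
    public static int[][] sobel(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        int[][] mag = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                // Horizontal and vertical Sobel responses.
                int gx = -gray[y - 1][x - 1] + gray[y - 1][x + 1]
                         - 2 * gray[y][x - 1] + 2 * gray[y][x + 1]
                         - gray[y + 1][x - 1] + gray[y + 1][x + 1];
                int gy = -gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1]
                         + gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1];
                mag[y][x] = Math.min(255, Math.abs(gx) + Math.abs(gy));
            }
        }
        return mag;
    }
}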
1.1.4 Detection/Segmentation
At some point in the processing, a decision is made about which image points or regions of the image are relevant for further processing. Examples are the following (a small thresholding sketch follows the list):
i. Selection of a specific set of interest points.
ii. Segmentation of one or multiple image regions which contain a specific object of interest.
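A very simple instance of segmentation is global thresholding, sketched below in Java; the threshold parameter and the boolean-mask representation are illustrative assumptions. For eye tracking, such a mask could mark the dark iris and pupil pixels within an eye region.

/** Illustrative segmentation by global thresholding: mark pixels darker than a threshold. */
public class Segment {

    /** Returns a binary mask where true marks pixels belonging to a dark region of interest. */
    public static boolean[][] darkRegions(int[][] gray, int threshold) {
        int h = gray.length, w = gray[0].length;
        boolean[][] mask = new boolean[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                mask[y][x] = gray[y][x] < threshold;   // e.g. iris/pupil pixels
        return mask;
    }
}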
1.1.5 High-Level Processing
At this step the input is typically a small set of data, for example a set of points or an image region which is assumed to contain a specific object. The remaining processing deals with, for example:
i. Verification that the data satisfy model-based and application-specific assumptions.
ii. Estimation of application-specific parameters, such as object pose or object size.
iii. Classifying a detected object into different categories.
1.2 HUMAN COMPUTER INTERACTION
HCI (human-computer interaction) is the study of how people interact with computers and to what extent computers are or are not developed for successful interaction with human beings. The goals of HCI are to produce usable, safe, and functional systems. The human–computer interface can be described as the point of communication between the human user and the computer. The flow of information between the human and the computer is defined as the loop of interaction, which has several aspects to it, including:
i. Visual Based: Visual-based human-computer interaction is probably the most widespread area in HCI research.
ii. Audio Based: Audio-based interaction between a computer and a human is another important area of HCI systems. This area deals with information acquired from different audio signals.
iii. Task environment: The conditions and goals set upon the user.
iv. Machine environment: The environment that the computer is connected to, e.g. a laptop in a college student's dorm room.
v. Areas of the interface: Non-overlapping areas involve processes of the human and computer not pertaining to their interaction. Meanwhile, the overlapping areas only concern themselves with the processes pertaining to their interaction.
vi. Input flow: The flow of information that begins in the task environment, when the user has some task that requires using their computer.
vii. Output: The flow of information that originates in the machine environment.
viii. Feedback: Loops through the interface that evaluate, moderate, and confirm processes as they pass from the human through the interface to the computer and back.
ix. Fit: This is the match between the computer design, the user and the task to optimize the human resources needed to accomplish the task.
CHAPTER 2
LITERATURE SURVEY
[1] A Human–Computer Interface Using Symmetry Between Eyes
A nonintrusive communication interface system called EyeKeys was developed, which runs on a consumer-grade computer with video input from an inexpensive Universal Serial Bus camera and works without special lighting. The system detects and tracks the person's face using multiscale template correlation. The symmetry between the left and right eyes is exploited to detect whether the person is looking at the camera or to the left or right side. The detected eye direction can then be used to control applications such as spelling programs or games.
The system consists of two main modules: 1) the face detector and tracker, and 2) the eye analysis module. The 2-D face tracker locates the scale and position of the face and a region containing the eyes. The eye analysis module then refines the estimate of the location of the eyes and determines whether the eyes are looking toward the center, to the left, or to the right of the camera. The output from the eye module can drive a simple computer control interface by simulating a key press. While a command is issued, the user is expected to keep his or her head relatively stable. The system assumes that the user is facing the camera and the computer monitor and has the ability to move his or her eyes toward the left and right. The user's head should be upright and not rotated away from the camera. Although the system can be calibrated for inter-session lighting variations, it assumes constant intra-session illumination. Illumination is also assumed to be primarily frontal or uniform on the face.
Assistive technology enables people with severe paralysis to communicate their thoughts and emotions and to exhibit their intellectual potential. To provide such communication technology, the camera-based human–computer interface EyeKeys was created as a new tool that uses gaze direction to control the computer. The EyeKeys face tracker combines existing techniques in a new way that allows the face to be tracked quickly as a means to locate the eyes. The method of mirroring and projecting the difference between the eyes is a novel approach to detecting to which side the eyes look. Experiments with EyeKeys have shown that it is an easily used computer input and control device for able-bodied people and has the potential to become a practical tool for people with severe paralysis.
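The mirror-and-difference idea can be sketched in Java as follows. This is a rough illustration of the published concept only: the projection weighting, the sign convention, and the class name EyeSymmetry are assumptions, not the EyeKeys implementation.

/** Illustrative sketch of the mirror-and-difference idea described for EyeKeys.
 *  left and right are equally sized grayscale patches of the two eyes. */
public class EyeSymmetry {

    /** A strongly positive or negative score suggests gaze to one side; a score near zero
     *  suggests the eyes are looking at the camera. The weighting is an assumption. */
    public static int sideScore(int[][] left, int[][] right) {
        int h = left.length, w = left[0].length;
        int score = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Mirror the right-eye patch horizontally, then take the signed difference.
                int diff = left[y][x] - right[y][w - 1 - x];
                // Project the difference: columns left of centre weigh negatively, right positively.
                score += Integer.signum(x - w / 2) * diff;
            }
        }
        return score;
    }
}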
[2] Vision-Based Eye-Gaze Tracking for Human Computer Interface
Eye-gaze is an input mode which has the potential for an efficient computer interface. A small 2D mark is employed as a reference to compensate for head movement. The iris center has been chosen for the purpose of measuring eye movement. The gaze point is estimated after acquiring the eye movement data. The primary goal is to detect the exact eye position. Two algorithms have been proposed for iris center detection: the Longest Line Scanning and Occluded Circular Edge Matching algorithms. The emphasis is on eye movement and not on face and eye location. In people with dark or dark-brown eyes, the pupil can hardly be differentiated from the iris in the captured images; if the image is captured from close range, however, it can still be used to detect the pupil even under ordinary lighting conditions. Because the sclera is light and the iris is dark, the boundary between them can easily be optically detected and tracked. There are, however, some issues that must be addressed: they arise from the coverage of the top and bottom of the limbus by the eyelids and, in some subjects, excessive coverage of the eyes by the eyelids.
i. Longest Line Scanning (LLS)
Human eyes have three degrees of freedom of rotation in 3D space, and the eye image is a projection of the real eye. The iris is nearly a circular plate attached to the approximately spherical eyeball, so its projection is elliptical in shape. The center of an ellipse lies at the center of the longest horizontal line inside the boundary of the ellipse; the LLS algorithm is an application of this property to the detection of the iris center.
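A minimal sketch of the LLS idea in Java, assuming a binary mask of iris pixels has already been obtained (for example by thresholding, as sketched earlier); the run-scanning details and class name are illustrative.

/** Illustrative Longest Line Scanning: inside a binary iris mask, the centre of the
 *  longest horizontal run of iris pixels approximates the centre of the elliptical iris. */
public class LongestLineScan {

    /** Returns {centerX, centerY} of the longest horizontal iris run, or null if the mask is empty. */
    public static int[] irisCenter(boolean[][] irisMask) {
        int h = irisMask.length, w = irisMask[0].length;
        int bestLen = 0, bestX = -1, bestY = -1;
        for (int y = 0; y < h; y++) {
            int runStart = -1;
            for (int x = 0; x <= w; x++) {
                boolean on = x < w && irisMask[y][x];
                if (on && runStart < 0) runStart = x;       // a run of iris pixels begins
                if (!on && runStart >= 0) {                 // the run ends at x - 1
                    int len = x - runStart;
                    if (len > bestLen) { bestLen = len; bestX = runStart + len / 2; bestY = y; }
                    runStart = -1;
                }
            }
        }
        return bestLen > 0 ? new int[] { bestX, bestY } : null;
    }
}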
ii. Occluded Circular Edge Matching (OCEM)
Although the LLS method detects the center of the iris, it is not sufficient for measuring eye-gaze precisely. The problems with the LLS algorithm are intra-iris noise, a rough iris edge, and occlusion of the longest line by the eyelids. The only clues for finding the center of the iris are the left and right edge pixels of the iris boundary, the so-called limbus. In order to estimate the original position and shape of the iris boundary, the circular edge matching (CEM) method can be adapted. The iris is naturally occluded by the eyelids to some extent, depending on the individual or the status of the subject, so CEM must be adaptively modified: only the visible portions of the edge, without the occluded portions, are processed in the matching step. The angle of rotation of the eyeball and the eccentricity of the ellipse are not large when the subject sits and operates the computer in front of the screen, which justifies a circular approximation to the ellipse. The system determines the gaze point by linear approximation. However, a small number of failures result from large head movements and errors in eye movement tracking. Some solutions are to use a camera with higher resolution, place the camera closer to the subject's face, or employ two cameras, one for head tracking and the other for eye movement tracking.
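The occlusion-aware matching step can be sketched in Java as follows; the +/-45 degree arc window, the edge-map representation, and the class name are assumptions, not the paper's exact parameters. In use, one would evaluate this score over candidate centres and radii near the LLS estimate and keep the best-scoring circle.

/** Illustrative Occluded Circular Edge Matching: fit a circle to the limbus using only the
 *  left and right arcs, since the top and bottom are often occluded by the eyelids. */
public class OcclusionAwareCircleFit {

    /** Score a candidate circle (cx, cy, r) against a binary edge map, sampling only
     *  arcs within +/-45 degrees of the horizontal axis. A higher score is a better match. */
    public static int score(boolean[][] edges, int cx, int cy, int r) {
        int h = edges.length, w = edges[0].length, hits = 0;
        for (int deg = -45; deg <= 45; deg++) {
            for (int side = 0; side < 2; side++) {          // right arc, then left arc
                double a = Math.toRadians(deg + side * 180);
                int x = cx + (int) Math.round(r * Math.cos(a));
                int y = cy + (int) Math.round(r * Math.sin(a));
                if (y >= 0 && y < h && x >= 0 && x < w && edges[y][x]) hits++;
            }
        }
        return hits;
    }
}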
[3] Communication via eye blinks and eyebrow raises: video-based human-computer interface
Two video-based human-computer interaction tools are introduced that can activate a binary switch and issue a selection command. "Blink Link," as the first tool is called, automatically detects a user's eye blinks and accurately measures their durations. The system is intended to provide an alternate input modality to allow people with severe disabilities to access a computer. Voluntary long blinks trigger mouse clicks, while involuntary short blinks are ignored. The second tool, "Eye-brow Clicker," automatically detects when a user raises his or her eyebrows and then triggers a mouse click.
The goal is to develop computer vision systems that make computers perceptive to a user's natural communicative cues such as gestures, facial expressions, and gaze direction. The system design can be broken down into four steps: motion analysis to locate the eyes, eye tracking, blink detection and duration measurement, and interpretation. The eyes are located automatically by considering motion information between two consecutive video frames and determining whether this motion is likely to be caused by a blink. A grayscale template is extracted from the blink location of one eye. The eye is tracked and constantly monitored to establish to what extent it is open or closed in each frame. A blink's duration is defined as the count of consecutive frames of closure. If at any time the eye tracker is believed to be lost, it is reinitialized by repeating the motion analysis on subsequent involuntary blinks.
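The duration-based switching logic can be sketched in Java as a small state machine. The 10-frame voluntary-blink threshold is an assumed value for illustration, not the one used by the published system.

/** Illustrative blink-duration logic: a blink is a run of consecutive "closed" frames,
 *  and only long (voluntary) blinks trigger a selection. Thresholds are assumptions. */
public class BlinkDuration {
    private static final int VOLUNTARY_MIN_FRAMES = 10;  // roughly 1/3 s at 30 fps (assumed)
    private int closedRun = 0;

    /** Feed one frame's open/closed decision; returns true when a voluntary blink completes. */
    public boolean update(boolean eyeClosed) {
        if (eyeClosed) {
            closedRun++;
            return false;
        }
        boolean voluntary = closedRun >= VOLUNTARY_MIN_FRAMES;
        closedRun = 0;          // the eye reopened: the blink (if any) has ended
        return voluntary;       // short involuntary blinks are ignored
    }
}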
[4] Real-Time Iris Detection on Coronal-Axis-Rotated Faces
Real-time face and iris detection in video sequences is important in diverse applications such as the study of eye function, drowsiness detection, virtual keyboard interfaces, face recognition, and multimedia retrieval. In this paper, a robust real-time method is developed to detect irises on faces with coronal-axis rotation within the normal range. The method allows head movements with no restrictions on the background. It is based on anthropometric templates applied to detect the face and eyes: the templates use key features of the face such as its elliptical shape and the locations of the eyebrows, nose, and lips, while for iris detection a template following the iris-sclera boundary shape is used. The anthropometric data about faces and eyes guide the face and eye detection algorithm, thus limiting the number of computations to allow real-time processing. Three different templates are built to allow face detection, limit the eye search region, and allow detection of the iris-sclera boundary.
The method for iris tracking is built in three stages requiring only gray-level information: coarse face detection, fine face detection, and iris detection. The goal of the coarse face detection stage is to find the approximate face location within the image. The fine face detection stage is designed to detect the face location, determining the face rotation angle and size. In the iris detection stage, the eyes' relative locations within the face limits are measured. Four limits for the eyes were determined using the ear tragus as a reference point representing the middle of the face and the chin representing the face's bottom limit. The four eye limits were the upper and lower eyelid limits and the external and internal corners. The region delimited by these measurements represents Ier(i, j), and a template was built with these values; thus, once fine face localization has been performed, Ier(i, j) is automatically determined. The regions where the eyes are located maintain their relative position in rotated templates. The templates are rotated by an angle f, and are created only once and stored for later use. Once Ier(i, j) has been determined, the position of the iris is detected by first eliminating the bright reflections on the pupil or iris present in most eye images.
[5] Eye Tracking and Head Movement Detection: A State-of-the-Art Survey
Eye-gaze detection and tracking have been an active research field in recent years, as they add convenience to a variety of applications and are considered a significant non-traditional method of human-computer interaction. Head movement detection has also received researchers' attention and interest, as it has been found to be a simple and effective interaction method. Both technologies are considered the easiest alternative interface methods, and they serve a wide range of severely disabled people who are left with minimal motor abilities. For both eye tracking and head movement detection, several different approaches have been proposed and used to implement algorithms for these technologies. Despite the amount of research done on both, researchers are still trying to find robust methods that work effectively in various applications. This paper presents a state-of-the-art survey of the eye tracking and head movement detection methods proposed in the literature. Examples of different fields of application for both technologies, such as human-computer interaction, driving assistance systems, and assistive technologies, are also investigated.
One surveyed system detects pupil motion in the video frames by finding the difference between the bright and dark pupil images. Head direction is detected by tracing key feature points (the nostrils), which are found as the darker areas in the bright and dark images. The information obtained is mapped into cursor motion on a display. Eye tracking and head movement detection are considered effective and reliable alternative methods for human-computer interaction and communication. Head movement detection, however, requires high computational power: a microcontroller, which is considered low computational hardware, cannot be used to implement the head movement detection algorithms reported in the literature.
CHAPTER 3
SYSTEM ANALYSIS
3.1 PROBLEM STATEMENT
This project is developed for people with severe disabilities who cannot control a computer using their hands. Earlier systems used dedicated hardware for eye tracking and were hence intrusive. Some systems used two cameras for eye tracking, which was expensive. This system resolves such problems: the only hardware used is a webcam, so the proposed system is non-intrusive, and it also eliminates the need for two cameras.
3.2 PROPOSED SYSTEM
We consider the problem of tracking faces using a video camera. The system uses a video camera to track the user's face position in 3D in order to convert it into the position of a cursor or another virtual object on a 2D screen. The concept of second-order change detection, which allows one to discriminate the local (most recent) change in the image, such as a blink of the eyes, is introduced. This concept sets the base for designing complete face-operated control systems in which, using the analogy with a mouse, "pointing" is performed by the nose and "clicking" is performed by blinking the eyes. Edge-based features create a problem in tracking objects that may rotate, since these features are not invariant to rotation and change of scale of the object. In order to select a robust facial feature, we use the pattern recognition paradigm of treating features as vectors: a feature is associated with a vector made of feature attributes, which can be pixel intensities or the parameters of geometric primitives.
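One plausible reading of second-order change detection is sketched below in Java: a pixel is flagged as belonging to the most recent change if it differs between the last two frames but did not differ between the previous pair. The tolerance parameter and the three-frame formulation are assumptions for illustration, not the report's exact method.

/** Illustrative second-order change detection: isolate a sudden local event,
 *  such as an eye blink, from ongoing motion by looking at the change of the change. */
public class SecondOrderChange {

    /** prev2, prev1, curr are consecutive grayscale frames; tol is an intensity tolerance. */
    public static boolean[][] detect(int[][] prev2, int[][] prev1, int[][] curr, int tol) {
        int h = curr.length, w = curr[0].length;
        boolean[][] recent = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean changedNow    = Math.abs(curr[y][x]  - prev1[y][x]) > tol;
                boolean changedBefore = Math.abs(prev1[y][x] - prev2[y][x]) > tol;
                recent[y][x] = changedNow && !changedBefore;   // keep only the new change
            }
        }
        return recent;
    }
}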
CHAPTER 4
SYSTEM SPECIFICATION
4.1 HARDWARE SPECIFICATION
The computer hardware required for developing and executing the project is as follows:
Processor : Intel Pentium IV
Memory : 512 MB DDR1 (minimum)
Hard Disk : 10 GB (minimum)
Camera : Webcam supporting 30 frames per second
4.2 SOFTWARE SPECIFICATION
The project was developed and executed using the following software.
Operating System : Windows XP
Java packages : JDK 1.6, JMF 2.0
CHAPTER 5
SYSTEM DESIGN
5.1 SYSTEM OVERVIEW
Our system uses a video camera to track the user's face position in 3D and convert it into the position of a cursor or another virtual object on a 2D screen. The eyes and nose are located, and we then track them by analyzing the movement of these features in subsequent frames using template matching and heuristics. The system uses the light reflection in the user's eyes to locate the pupils. The system then checks whether there is a wink in the user's eyes: if the left eye has winked, the left mouse button action is performed; otherwise, the right mouse button action is performed. The system also checks whether there is any movement of the user's pupils: if movement is detected and the eyes have moved left, the pointer is moved to the left; otherwise it is moved to the right.
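The mapping from detected events to mouse actions can be sketched with the standard java.awt.Robot class, available since early JDK versions. The boolean event flags passed in are assumed outputs of the wink and pupil-movement detectors described above, and the step size is illustrative.

import java.awt.AWTException;
import java.awt.Robot;
import java.awt.event.InputEvent;

/** Illustrative mapping of detected eye events to cursor actions using java.awt.Robot. */
public class CursorControl {
    private final Robot robot;

    public CursorControl() throws AWTException {
        robot = new Robot();
    }

    /** Issue a left or right click for a detected left-eye or right-eye wink. */
    public void onWink(boolean leftEye) {
        int mask = leftEye ? InputEvent.BUTTON1_MASK : InputEvent.BUTTON3_MASK;
        robot.mousePress(mask);
        robot.mouseRelease(mask);
    }

    /** Nudge the pointer left or right for a detected pupil movement. */
    public void onPupilMove(boolean movedLeft, int curX, int curY, int step) {
        robot.mouseMove(movedLeft ? curX - step : curX + step, curY);
    }
}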
5.2 SYSTEM ARCHITECTURE
The input to the system is a video sequence obtained from a webcam. Image frames are constructed from the video, and each frame is converted to a grayscale image. The facial candidates are located and the between-the-eyes (BTE) templates are extracted using the SSR (six-segmented rectangular) filter. The face tracker then analyses the facial movements using the SVM algorithm.
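A minimal sketch of an SSR-style filter is given below in Java, assuming the common formulation in which the filter rectangle is divided into a grid of segments whose sums are compared efficiently via an integral image. The 3x2 layout, the comparison of only the three upper segments, and the acceptance condition are simplified assumptions, not necessarily this system's exact filter.

/** Illustrative six-segmented rectangular (SSR) filter using an integral image.
 *  A between-the-eyes (BTE) candidate is a position where the bright nose-bridge
 *  segment outshines the darker eye segments on either side. */
public class SsrFilter {

    /** Build an integral image so any rectangular sum costs four lookups. */
    public static long[][] integral(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        long[][] ii = new long[h + 1][w + 1];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                ii[y + 1][x + 1] = gray[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
        return ii;
    }

    /** Sum over [x0, x1) x [y0, y1) using the integral image (exclusive upper bounds). */
    private static long sum(long[][] ii, int x0, int y0, int x1, int y1) {
        return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0];
    }

    /** Test the filter at (x, y) with cell size cw x ch; the caller must ensure
     *  x + 3*cw <= width and y + ch <= height. Only the three upper segments are
     *  compared in this simplified test; the full SSR filter also uses the lower row. */
    public static boolean bteCandidate(long[][] ii, int x, int y, int cw, int ch) {
        long eyeL   = sum(ii, x,          y, x + cw,     y + ch);  // upper-left: left eye (dark)
        long bridge = sum(ii, x + cw,     y, x + 2 * cw, y + ch);  // upper-centre: nose bridge (bright)
        long eyeR   = sum(ii, x + 2 * cw, y, x + 3 * cw, y + ch);  // upper-right: right eye (dark)
        return bridge > eyeL && bridge > eyeR;
    }
}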