23-09-2016, 12:05 PM
Hand Gesture Recognition for MP3 Player using
Image Processing Techniques and PIC16F877A
Abstract:-- The scope of this project is to control an MP3 player using
hand gestures. A gesture image is captured by a web camera and
processed on a remote interface in MATLAB. The challenging problem is
that images captured from an external device are not uniform, and
identifying the exact action from an unclear image is not an easy
task; separating gesture information from different or noisy image
sources remains difficult. The captured images are forwarded to MATLAB
and compared against a knowledge database, together with the
three-axis (x, y and z) readings of an accelerometer attached to the
moving object: when the object moves in any direction, the
corresponding values are recorded by the accelerometer. Most music
players are controlled through button-based remote controls; by
embedding a PIC16F877A microcontroller, the music player can instead
be controlled by gestures performed in the air. The three-axis
accelerometer, suitably interfaced with the PIC16F877A, together with
firmware developed in a software platform such as MPLAB IDE, allows
the system to recognise terminal input instructions and perform the
player functions PLAY, STOP, REWIND and FORWARD. Moving the
accelerometer in a particular direction selects one of these commands
and operates the songs in the music system's playlist. Additionally,
the Karhunen-Loeve (K-L) transform is used to capture the image
without noise and with accurate results, and Canny edge detection
together with principal component analysis (PCA) is used for image
segmentation and edge detection, which further improves the expected
result.
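The K-L (PCA) transform referred to above can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the paper's MATLAB implementation; the sample data and the choice of k are arbitrary.

```python
import numpy as np

def kl_transform(samples, k):
    """Karhunen-Loeve (PCA) transform: project row-vector samples onto
    the k eigenvectors of the sample covariance with the largest
    eigenvalues. This decorrelates the data and minimises the
    mean-square reconstruction error for any k-dimensional projection."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    basis = eigvecs[:, ::-1][:, :k]          # keep the top-k components
    return centered @ basis, basis, mean

# Example: 100 six-dimensional feature vectors reduced to 2 dimensions
X = np.random.default_rng(0).normal(size=(100, 6))
Y, basis, mean = kl_transform(X, 2)
```

After the transform the projected columns are uncorrelated, which is the "excellent cluster character" the abstract refers to.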
INTRODUCTION
Hand gesture recognition is one of today's growing fields of
research, providing a natural way of human-machine interaction.
Gestures are actions a person performs in order to convey information
to others without speaking. In daily life we frequently observe hand
gestures used for communication, such as thumbs up, thumbs down, the
victory sign and pointing directions. Common examples include cricket,
where the umpire uses different hand gestures to signal the events
occurring at that instant in the match, and the hand signals used by
traffic police. Early approaches to hand gesture recognition in an
MP3-control context used markers placed on the hand for the different
actions; an associated algorithm detected the presence and colours of
the markers, from which one could identify which action a gesture
matched.
The inconvenience of placing markers on the user's hand makes this
approach infeasible in practice, especially when the image is not
clear. Recent methods use more advanced computer vision techniques.
Hand gesture recognition can be performed through a curvature-space
method, which finds the boundary contours of the hand; this approach
is invariant to scaling, translation and rotation of the hand pose,
but it is computationally demanding. In our approach we first apply
skin filtering, converting the RGB image to HSV because that model
separates colour from intensity and is therefore less sensitive to
changes in lighting conditions, and then perform the K-L transform.
The advantages of the K-L transform are that it eliminates correlated
data, reduces dimensionality while keeping the mean-square error
minimal, and gives excellent cluster character after the transform.
Applications already realised in this field include hand gesture
recognition for sign language, gesture control of a robot's motion,
video games, and more. The Canny edge detection algorithm supports
this approach through five stages that improve accuracy: smoothing,
finding gradients, non-maximum suppression, double thresholding, and
edge tracking by hysteresis.
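As an illustration, the skin-filtering and edge-detection stages can be sketched in Python with NumPy. The threshold values below are illustrative assumptions, and the non-maximum suppression and hysteresis-tracking stages of Canny are omitted for brevity; the paper itself implements this pipeline in MATLAB.

```python
import numpy as np

def skin_mask(hsv):
    """Rough skin filter on an HSV image (H in [0, 180], S and V in
    [0, 255]); the threshold ranges here are assumed, not the paper's."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return (h < 25) & (s > 40) & (v > 60)

def edges(gray, low=20.0, high=60.0):
    """Simplified Canny sketch: Gaussian smoothing, central-difference
    gradients and a double threshold. Returns boolean maps of strong
    and weak edge pixels."""
    g = np.array([1, 4, 6, 4, 1], float)
    g /= g.sum()                                   # 5-tap Gaussian kernel
    sm = np.apply_along_axis(lambda r: np.convolve(r, g, "same"),
                             1, gray.astype(float))
    sm = np.apply_along_axis(lambda c: np.convolve(c, g, "same"), 0, sm)
    gx = np.zeros_like(sm)
    gy = np.zeros_like(sm)
    gx[:, 1:-1] = sm[:, 2:] - sm[:, :-2]           # horizontal gradient
    gy[1:-1, :] = sm[2:, :] - sm[:-2, :]           # vertical gradient
    mag = np.hypot(gx, gy)                         # gradient magnitude
    strong = mag >= high
    weak = (mag >= low) & ~strong
    return strong, weak
```

In the full algorithm, weak pixels are promoted to edges only when connected to strong ones (edge tracking by hysteresis).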
First, the input image is taken from the camera and processed in
MATLAB, where the input and database images are compared. After the
comparison, the result is sent to the PIC16F877A controller over an
RS232 cable. The MP3 player is interfaced to the PIC16F877A through
relay drivers; the relays act as electronic switching buttons, so when
the first input image is recognised by the PIC16F877A, the
corresponding relay switches on automatically and the MP3 player
starts playing. The MP3 player outputs audio signals only, and the
active function is displayed on the LCD. The positioners and switches
are controlled remotely by a 40-pin Microchip PIC16F877A
microcontroller at the monitor station, which receives commands via
RS232 and translates them into the hardware control logic shown in
Figure 2. The microcontroller's connections to the modules it controls
in the monitor station are shown in Figure 1. This note describes the
conventions adopted for the microcontroller and its hardware
connections, to be used when writing the microcontroller
assembly-language program; the PIC16F877A itself is described in
detail in its datasheet.
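The MATLAB-to-PIC link described above amounts to sending one command per recognised gesture over the serial line. A hypothetical sketch of that protocol follows; the command bytes and the newline framing are assumptions for illustration, not the paper's actual wire format.

```python
# Hypothetical mapping of recognised gestures to single-byte commands
# sent to the PIC over RS232. The byte values are assumed, not taken
# from the paper.
GESTURE_COMMANDS = {
    "PLAY":    b"P",
    "STOP":    b"S",
    "FORWARD": b"F",
    "REWIND":  b"R",
}

def frame_for(gesture):
    """Build the serial frame for a recognised gesture: one command
    byte terminated by a newline so the firmware can resynchronise."""
    try:
        return GESTURE_COMMANDS[gesture] + b"\n"
    except KeyError:
        raise ValueError(f"unknown gesture: {gesture!r}")

# With pyserial, the frame would then be written to the port, e.g.:
#   serial.Serial("COM1", 9600).write(frame_for("PLAY"))
```

On the PIC side, each received byte would select the relay to energise.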
LITERATURE REVIEW
We have studied many previous works done in this field by different
researchers. A variety of approaches have been followed: vision based,
data-glove based, Artificial Neural Networks, fuzzy logic, genetic
algorithms, Hidden Markov Models, Support Vector Machines, and others.
Some of the previous works are summarised below.
Many researchers used vision-based approaches to identify hand
gestures. Kapuscinski [1] located the skin-coloured region in the
captured input image; the image with the desired hand region was then
intensity normalised and its histogram computed. Feature extraction
was performed using the hit-miss transform, and the gesture was
recognised using a Hidden Markov Model (HMM); the recognition rate
obtained was 98%. Yu [2] used the YCbCr colour model to distinguish
skin-coloured pixels from the background. The required portion of the
hand was extracted using this colour model and filtered with median
and smoothing filters. The edges were detected, the extracted features
were hand perimeter, aspect ratio and hand area, and an Artificial
Neural Network (ANN) was then used as the classifier to recognise the
gesture.
The accuracy rate obtained was 97.4%. In [3] and [8], fingertip
detection was used for hand gesture recognition. Rajam [3], in his
paper on sign language recognition, first converted the captured RGB
image to binary and used Canny edge detection to extract the edge of
the palm. The fingertip positions were identified from the extracted
palm edge by measuring their distances from a reference point taken at
the bottom of the palm; the recognition rate obtained was 98.125%.
Raheja [8] scanned the skin-filtered image in all directions to find
the edges of the fingers; the tips of the edges were assigned the
highest pixel value, and in this way the fingertips were detected.
Malima [4] used hand gesture recognition for controlling a robot. The
red/green ratio was used to determine the skin-coloured regions. The
centre of gravity of the hand was found along with the farthest
distance from it, and in this way the fingertips were determined: a
circle was drawn around the centre of gravity, and the number of white
pixels beyond that circle was counted to obtain the desired finger
count. For hand-gesture recognition, some researchers have tried to
perform the early segmentation step using skin-colour histograms. Zhou
et al. [12] used overlapping sub-windows, which are useful for
extracting invariants for gesture recognition, and distinguished them
with a local orientation histogram attribute description indicating
the distance from the canonical orientation. This makes the process
relatively robust to noise, but considerably more time consuming. Kuno
and Shirai defined seven different stages of hand gesture recognition,
including the position of the fingertip. This is not practically
realistic when we have not only pointing gestures but also several
other gestures, such as grasping; however, the invariants they
considered inspired our own. In some similar approaches, the watermark
of an image is generated by modifying the invariant vector. For
example, Lizhong Gu and Jianbo Su used Zernike moments [12, 13]
together with a hierarchical classifier to classify hand gestures.
This method is not appropriate for the JAST project, since the hands
do not have a high degree of freedom owing to the limited space for
movements and actions.
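The centroid-and-circle finger counting idea attributed to Malima [4] above can be sketched as follows. This is an illustrative NumPy version; the radius fraction is an assumed tuning parameter, not a value from the cited work.

```python
import numpy as np

def centroid_circle_count(mask, radius_scale=0.7):
    """Malima-style counting sketch: find the centre of gravity of the
    binary hand mask, take a circle whose radius is a fraction
    (radius_scale, assumed here) of the farthest hand pixel's distance,
    and count the hand pixels outside it, which roughly correspond to
    the extended fingers."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()          # centre of gravity
    d = np.hypot(ys - cy, xs - cx)         # distance of each hand pixel
    r = radius_scale * d.max()             # circle radius
    return int((d > r).sum())              # white pixels beyond the circle
```

In practice the raw pixel count would be mapped to a finger count by calibrating against known poses.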