DIAGONAL BASED FEATURE EXTRACTION FOR HANDWRITTEN ALPHABETS RECOGNITION SYSTEM USING NEURAL NETWORK PROJECT REPORT


ABSTRACT

An off-line handwritten alphabetical character recognition system using a multilayer feed forward
neural network is described in this paper. A new method, called diagonal based feature extraction,
is introduced for extracting the features of the handwritten alphabets. Fifty data sets, each containing 26 alphabets
written by various people, are used for training the neural network and 570 different handwritten
alphabetical characters are used for testing. The proposed recognition system performs quite well
yielding higher levels of recognition accuracy compared to the systems employing the conventional
horizontal and vertical methods of feature extraction. This system will be suitable for converting
handwritten documents into structural text form and recognizing handwritten names.


INTRODUCTION

Handwriting recognition has been one of the most fascinating and challenging research areas in
the field of image processing and pattern recognition in recent years [1] [2]. It contributes
immensely to the advancement of an automation process and can improve the interface between
man and machine in numerous applications. Several research works have been focusing on new
techniques and methods that would reduce the processing time while providing higher
recognition accuracy [3].
In general, handwriting recognition is classified into two types: off-line and on-line
handwriting recognition. In off-line recognition, the writing is usually captured
optically by a scanner and the completed writing is available as an image. In an on-line
system, the two dimensional coordinates of successive points are represented as a function of
time, and the order of strokes made by the writer is also available. The on-line methods have
been shown to be superior to their off-line counterparts in recognizing handwritten characters
due to the temporal information available with the former [4] [5]. However, in off-line
systems, neural networks have been successfully used to yield comparably high recognition
accuracy levels. Several applications including mail sorting, bank processing, document reading
and postal address recognition require off-line handwriting recognition systems. As a result,
off-line handwriting recognition continues to be an active area for research towards exploring
the newer techniques that would improve recognition accuracy [6] [7].
The first important step in any handwriting recognition system is pre-processing, followed by
segmentation and feature extraction. Pre-processing includes the steps that are required to shape
the input image into a form suitable for segmentation [8]. In segmentation, the input image
is segmented into individual characters and each character is then resized to m x n pixels
for training the network.


THE PROPOSED RECOGNITION SYSTEM

In this section, the proposed recognition system is described. A typical handwriting recognition
system consists of pre-processing, segmentation, feature extraction, classification and
recognition, and post processing stages. The schematic diagram of the proposed recognition
system is shown in Fig.1.


Pre-processing

Pre-processing is a series of operations performed on the scanned input image. It essentially
enhances the image, rendering it suitable for segmentation. The various tasks performed on the
image in the pre-processing stage are shown in Fig.2. The binarization process converts a gray scale
image into a binary image using a global thresholding technique. Detecting edges in the
binarized image using the Sobel technique, dilating the image and filling the holes present in it are
the operations performed in the last two stages to produce the pre-processed image suitable for
segmentation [16].
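
The chain of operations described above can be sketched in plain Python as follows. This is only
an illustration under stated assumptions, not the report's Matlab code: the fixed threshold of 128,
the 3x3 structuring element, the zero padding at the borders and every function name are
hypothetical choices made for the example.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Global thresholding: pixels darker than `threshold` become foreground (1)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def _at(img, r, c):
    """Pixel value, with zero padding outside the image (assumed border rule)."""
    return img[r][c] if 0 <= r < len(img) and 0 <= c < len(img[0]) else 0

def sobel_edges(img):
    """Mark pixels where the 3x3 Sobel gradient of the binary image is non-zero."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            gx = (_at(img, r - 1, c + 1) + 2 * _at(img, r, c + 1) + _at(img, r + 1, c + 1)
                  - _at(img, r - 1, c - 1) - 2 * _at(img, r, c - 1) - _at(img, r + 1, c - 1))
            gy = (_at(img, r + 1, c - 1) + 2 * _at(img, r + 1, c) + _at(img, r + 1, c + 1)
                  - _at(img, r - 1, c - 1) - 2 * _at(img, r - 1, c) - _at(img, r - 1, c + 1))
            row.append(1 if gx or gy else 0)
        out.append(row)
    return out

def dilate(img):
    """3x3 dilation: a pixel is set if it or any of its 8 neighbours is set."""
    return [[1 if any(_at(img, r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]

def fill_holes(img):
    """Set every background pixel that cannot be reached from the image border."""
    h, w = len(img), len(img[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque((r, c) for r in range(h) for c in range(w)
                  if (r in (0, h - 1) or c in (0, w - 1)) and img[r][c] == 0)
    for r, c in queue:
        outside[r][c] = True
    while queue:  # flood fill the background from the border (4-connectivity)
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and img[nr][nc] == 0 and not outside[nr][nc]:
                outside[nr][nc] = True
                queue.append((nr, nc))
    return [[1 if img[r][c] or not outside[r][c] else 0 for c in range(w)] for r in range(h)]

def preprocess(gray, threshold=128):
    """Binarize, detect edges, dilate, then fill holes."""
    return fill_holes(dilate(sobel_edges(binarize(gray, threshold))))

# Toy input: an 11x11 white image with a dark 5x5 square in the middle.
gray = [[0 if 3 <= r <= 7 and 3 <= c <= 7 else 255 for c in range(11)] for r in range(11)]
edges = sobel_edges(binarize(gray))
clean = preprocess(gray)
for row in clean:
    print("".join("#" if px else "." for px in row))
```

On this toy input the edge map leaves the interior of the square empty, and the hole filling step
is what turns the outline back into a solid blob.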


PROPOSED FEATURE EXTRACTION METHOD

In this stage, the features of the characters that are crucial for classifying them at the recognition
stage are extracted. This is an important stage, as its effective functioning improves the
recognition rate and reduces misclassification [17]. A diagonal feature extraction scheme for
recognizing off-line handwritten characters is proposed in this work. Every character image of
size 90x60 pixels is divided into 54 equal zones, each of size 10x10 pixels (Fig.3(c)). The
features are extracted from the pixels of each zone by moving along the diagonals of its respective
10x10 pixels. Each zone has 19 diagonal lines and the foreground pixels present along each
diagonal line are summed to get a single sub-feature; thus 19 sub-features are obtained from
each zone. These 19 sub-feature values are averaged to form a single feature value, which is placed
in the corresponding zone (Fig.3(b)). This procedure is sequentially repeated for all the
zones. There could be some zones whose diagonals are empty of foreground pixels. The feature
values corresponding to these zones are zero. Finally, 54 features are extracted for each
character. In addition, 9 and 6 features are obtained by averaging the values placed in the zones
rowwise and columnwise, respectively. As a result, every character is represented by 69, that is,
54 + 15, features.
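
A minimal Python sketch of this zoning scheme is given below. It assumes a binary character image
stored as 90 rows of 60 pixels (so 9 rows and 6 columns of zones), and it assumes that the
diagonals run from top-left to bottom-right, which the text does not specify. The function names
are hypothetical.

```python
ZONE = 10  # zone side length in pixels

def zone_feature(zone):
    """Average of the 19 diagonal sums of one 10x10 binary zone."""
    n = len(zone)
    diagonal_sums = [0] * (2 * n - 1)          # 19 diagonals for n = 10
    for r in range(n):
        for c in range(n):
            diagonal_sums[r - c + n - 1] += zone[r][c]  # r - c is constant on a diagonal
    return sum(diagonal_sums) / len(diagonal_sums)

def diagonal_features(char_img, zone=ZONE):
    """Return (54 zone features, 9 row averages, 6 column averages) for a 90x60 image."""
    n_rows, n_cols = len(char_img) // zone, len(char_img[0]) // zone
    grid = [[zone_feature([row[c * zone:(c + 1) * zone]
                           for row in char_img[r * zone:(r + 1) * zone]])
             for c in range(n_cols)]
            for r in range(n_rows)]
    zone_feats = [value for row in grid for value in row]
    row_feats = [sum(row) / n_cols for row in grid]
    col_feats = [sum(grid[r][c] for r in range(n_rows)) / n_rows for c in range(n_cols)]
    return zone_feats, row_feats, col_feats

# Toy character: blank except for a completely filled top-left zone.
image = [[1 if r < ZONE and c < ZONE else 0 for c in range(60)] for r in range(90)]
zone_feats, row_feats, col_feats = diagonal_features(image)
features_54 = zone_feats
features_69 = zone_feats + row_feats + col_feats
print(len(features_54), len(features_69), zone_feats[0])
```

Because every pixel of a zone lies on exactly one diagonal, the zone feature computed this way
equals the number of foreground pixels in the zone divided by 19. A filled zone therefore gives
100/19, and an empty zone gives zero, as stated above.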


CLASSIFICATION AND RECOGNITION

The classification stage is the decision making part of a recognition system and it uses the
features extracted in the previous stage. A feed forward back propagation neural network having
two hidden layers with architecture of 54-100-100-38 is used to perform the classification. The
hidden layers use log sigmoid activation function, and the output layer is a competitive layer, as
one of the characters is to be identified. The feature vector is denoted as X, where
X = (f1, f2,…,fd), f denotes a feature and d is the number of zones into which each character
is divided. The number of input neurons is determined by the length d of the feature vector. The
total number of characters n determines the number of neurons in the output layer. The number
of neurons in the hidden layers is obtained by trial and error. The most compact network is
chosen and presented.
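
The forward pass of such a network can be sketched as follows. This is an untrained illustration
only: the weights are random, the competitive output layer is approximated by picking the neuron
with the largest activation, and the output width of 26 (one neuron per letter) is an assumption
taken from the experimental setup in the results section. All names are hypothetical.

```python
import math
import random
import string

def logsig(x):
    """Log-sigmoid activation, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_network(sizes, seed=0):
    """sizes such as [54, 100, 100, 26] gives one layer per consecutive pair."""
    rng = random.Random(seed)
    return [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def forward(x, layers):
    """Log-sigmoid hidden layers followed by a linear score per output neuron."""
    *hidden, (w_out, b_out) = layers
    for weights, biases in hidden:
        x = [logsig(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(w_out, b_out)]

def classify(x, layers, labels):
    """Winner-take-all decision: the label of the strongest output neuron."""
    scores = forward(x, layers)
    winner = max(range(len(scores)), key=scores.__getitem__)
    return labels[winner]

labels = string.ascii_uppercase                 # 26 output classes (assumed)
network = build_network([54, 100, 100, len(labels)])
sample = [0.5] * 54                             # placeholder feature vector of length d = 54
scores = forward(sample, network)
prediction = classify(sample, network, labels)
print(len(scores), prediction)
```

Switching to the 69-feature representation only changes the first entry of the size list, since
the input width follows the feature vector length.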


RESULTS AND DISCUSSION

The recognition system has been implemented using Matlab 7.1. The scanned image is taken as
the dataset/input and a feed forward architecture is used. The structure of the neural network
includes an input layer with 54/69 inputs, two hidden layers each with 100 neurons and an output
layer with 26 neurons. The gradient descent back propagation method with momentum and adaptive
learning rate, together with log-sigmoid transfer functions, is used for neural network training.
The neural network has been trained using a known dataset. A recognition system using two
different feature lengths is built. The number of input nodes is chosen based on the number of
features. After training the network, the recognition system was tested using several unknown
datasets and the results obtained are presented in this section.
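
The update rule behind gradient descent with momentum and an adaptive learning rate can be
sketched as below. It uses one common heuristic: a step that raises the error by more than a
small ratio is rejected and the learning rate is reduced, while a step that lowers the error is
kept and the learning rate is increased. All constants and names are illustrative assumptions,
not values reported here, and a toy quadratic error stands in for the network's MSE.

```python
def train(loss_and_grad, w, lr=0.01, momentum=0.9, lr_inc=1.05, lr_dec=0.7,
          max_increase=1.04, goal=1e-6, max_epochs=1000):
    """Gradient descent with momentum and a variable learning rate (illustrative)."""
    velocity = [0.0] * len(w)
    loss, grad = loss_and_grad(w)
    for epoch in range(1, max_epochs + 1):
        if loss <= goal:
            return w, loss, epoch - 1
        # Momentum step: blend the previous step with the current gradient direction.
        step = [momentum * v - lr * g for v, g in zip(velocity, grad)]
        candidate = [wi + s for wi, s in zip(w, step)]
        new_loss, new_grad = loss_and_grad(candidate)
        if new_loss > loss * max_increase:
            # Error grew too much: discard the step, shrink the rate, reset momentum.
            lr *= lr_dec
            velocity = [0.0] * len(w)
        else:
            if new_loss < loss:
                lr *= lr_inc  # error went down: allow slightly larger steps
            w, loss, grad, velocity = candidate, new_loss, new_grad, step
    return w, loss, max_epochs

def quadratic(w, target=(1.0, -2.0)):
    """Toy error surface: half the squared distance to `target`, with its gradient."""
    diff = [wi - ti for wi, ti in zip(w, target)]
    return 0.5 * sum(d * d for d in diff), diff

initial_loss, _ = quadratic([0.0, 0.0])
weights, final_loss, epochs = train(quadratic, [0.0, 0.0])
print(epochs, final_loss)
```

In a real network the `loss_and_grad` callback would run back propagation over the training set
and return the MSE together with its gradient with respect to all weights.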
Two approaches with three different ways of feature extraction are used for character
recognition in the proposed system. The three different ways of feature extraction are horizontal
direction, vertical direction and diagonal direction.
In the first approach, the feature vector size is chosen as 54, i.e. without rowwise and
columnwise features. The results obtained using the three different types of feature extraction are
summarized in Table 1. The criteria for choosing the type of feature extraction are: (i) the speed
of convergence, i.e. the number of epochs required to achieve the training goal, and (ii) training
stability. However, the most important parameter of interest is the accuracy of the recognition
system. The results presented in Table 1 show that diagonal feature extraction yields good
recognition accuracy compared to the other types of feature extraction. Fig.5 shows the Error
(MSE) vs. Training Epochs performance function of the network with 54 features obtained
through diagonal extraction. The desired performance goal has been achieved in 923 epochs.


IMPLEMENTATION ON GRAPHICAL USER INTERFACE

A user-friendly front end interface, as shown in Fig.7 and Fig.8, has been implemented for the
proposed handwritten character recognition system using a menu based GUI (Graphical User
Interface). The interface presents the user with two menus: a first menu with five
processing stages (Fig.7) and a second menu to choose the type of feature extraction (Fig.8).
The menu based GUI enables the user to perform pre-processing, select the type of feature
extraction, perform the feature extraction using the chosen method and train the network.


CONCLUSION

A simple off-line handwritten English alphabet character recognition system using a new type
of feature extraction, namely diagonal feature extraction, is proposed. Two approaches using 54
features and 69 features are chosen to build the neural network recognition system. To
compare the recognition efficiency of the proposed diagonal method of feature extraction, the
neural network recognition system is also trained using the horizontal and vertical feature extraction
methods. Six different recognition networks are built. Experimental results reveal that 69
features give better recognition accuracy than 54 features for all the types of feature extraction.
From the test results it is identified that the diagonal method of feature extraction yields the
highest recognition accuracy of 97.8% for 54 features and 98.5% for 69 features. The diagonal
method of feature extraction is verified using a number of test images. The proposed off-line
handwritten character recognition system, with its better recognition rates, will be eminently
suitable for several applications including postal/parcel address recognition, bank processing,
document reading and conversion of any handwritten document into structural text form.