13-05-2014, 01:04 PM
Offline Handwritten Devanagari Script Segmentation
Abstract
Segmentation is a vital part of any script or character recognition technique. Devanagari is a widely used script in
India for many official and banking applications. Segmentation of Devanagari script is difficult because of its large character
set, which includes vowels, consonants, compound characters and modifiers. This paper focuses on the line, word and character segmentation of
handwritten Devanagari script for efficient script recognition.
INTRODUCTION
Segmentation of handwritten script is a very important task for
the later stages of handwritten script recognition.
Segmentation is important for improving the accuracy of
handwritten script identification, since a recognition system
depends heavily on the segmentation phase. Segmentation
means subdividing a handwritten script image into particular
parts such as lines, words or characters. The segmentation
approach tries to extract a specific part of a handwritten
Devanagari script document image. The large variation in the
handwriting style of the script makes the task of segmentation
quite difficult.
CHARACTERISTICS OF DEVANAGARI
Devanagari originated from the ancient Brahmi script through
various transformations. As there is typically a letter for each
of the phonemes in Devanagari, the alphabet set tends to be
quite large. Devanagari script has 13 vowels, 34 consonants,
and 14 modifiers of vowels and of rakars. Devanagari also has
compound characters, which are formed by combining two or
more basic characters. It is a phonetic and syllabic script:
words are written exactly as they are pronounced. Apart from
the above features, another distinctive feature of Devanagari is
the presence of a horizontal line on the top of all characters.
This line is known as the header line or maatra. The script also
contains ten numerals, whose combinations can form the large
set of numbers used to describe amounts. Figure 1 shows an
example of the Devanagari script.
PROPOSED SEGMENTATION METHODOLOGY
The process of segmentation includes separating line, word, and
individual character or pseudocharacter images from a given
script image. The large variation in handwriting style makes
the task of segmentation quite difficult. Before proceeding to
segmentation, preprocessing needs to be done. The
preprocessing includes smoothing the image with a median
filter, binarizing it, and scaling it. This segmentation approach
assumes that the header line is present in the input script.
Segmentation consists of analyzing the digitized image
provided by a scanning device in order to locate the
boundaries of each character and isolate the characters from
one another. In handwritten Devanagari script the space
between words and between characters may vary, which can
cause difficulties in the segmentation process.
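
The preprocessing stage mentioned above (median smoothing followed by binarization) can be illustrated with a small sketch. This is a minimal pure-Python illustration, not the authors' Matlab code: the 3x3 window, the fixed threshold of 128, the border handling, and the function names `median_filter` and `binarize` are all assumptions made for the example, and the scaling step is left out.

```python
from statistics import median_low


def median_filter(gray, size=3):
    """Smooth a grayscale image (list of rows) with a size x size median filter.

    Windows are clipped at the image border (an assumption of this sketch).
    """
    h, w = len(gray), len(gray[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = median_low(window)
    return out


def binarize(gray, threshold=128):
    """Map dark (ink) pixels to 1 and light (paper) pixels to 0."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


# Toy 5x8 scan: a three-pixel-wide dark stroke in columns 2-4,
# plus one isolated dark speck of noise at row 2, column 6.
page = [[255, 255, 0, 0, 0, 255, 255, 255] for _ in range(5)]
page[2][6] = 0

noisy = binarize(page)                  # the speck survives as ink
clean = binarize(median_filter(page))   # the speck is smoothed away

for row in clean:
    print(row)
```

In this toy input the stroke is kept while the isolated speck disappears. A stroke only one pixel wide would also be removed by a 3x3 median, so the window size has to be chosen with the pen thickness in mind.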
For line segmentation, a connected component approach is
used: the connected components are clustered to extract each
line. To segment the handwritten Devanagari script into words,
the vertical projection profile, i.e. the histogram of the input
image, is used, where the zero-valued valleys indicate the
spaces between words and characters. The maximum
character spacing is found and used to separate the words.
For character segmentation of Devanagari, the vertical profile
is used to separate the base characters along the clear paths
between them [3]. The steps of the proposed algorithm for
segmentation of handwritten Devanagari script are as follows.
The algorithm is simulated in Matlab 7.0.
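
The line and word segmentation ideas described above can be sketched roughly as follows. This is a hedged illustration in pure Python rather than the Matlab implementation the text refers to. Several details are assumptions, because the text does not specify them: components are 8-connected, they are clustered into lines by overlapping row ranges, and the maximum character spacing is passed in as a hypothetical parameter `max_char_gap` instead of being estimated from the image.

```python
from collections import deque


def component_boxes(binary):
    """Find 8-connected ink components; return inclusive (top, bottom, left, right) boxes."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, bottom, left, right = y, y, x, x
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if binary[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, bottom, left, right))
    return boxes


def cluster_lines(boxes):
    """Merge boxes whose row ranges overlap; return inclusive (top, bottom) per text line."""
    lines = []
    for top, bottom, _, _ in sorted(boxes):
        if lines and top <= lines[-1][1]:
            lines[-1][1] = max(lines[-1][1], bottom)
        else:
            lines.append([top, bottom])
    return [tuple(line) for line in lines]


def vertical_profile(binary):
    """Count ink pixels in every column (the vertical projection profile)."""
    return [sum(column) for column in zip(*binary)]


def split_on_gaps(profile, min_gap=1):
    """Return (start, end) column spans, end exclusive, split at zero runs >= min_gap wide."""
    spans, start, last_ink = [], None, None
    for x, count in enumerate(profile):
        if count == 0:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            spans.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


# Toy binary page (1 = ink): two text lines separated by an empty row.
page = [
    [1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0],
]

lines = cluster_lines(component_boxes(page))
top, bottom = lines[0]
profile = vertical_profile(page[top:bottom + 1])

max_char_gap = 1  # hypothetical: widest gap still treated as inter-character
chars = split_on_gaps(profile, min_gap=1)
words = split_on_gaps(profile, min_gap=max_char_gap + 1)
print(lines, chars, words)
```

Every zero run in the profile separates two character candidates, while only runs wider than `max_char_gap` separate words. The sketch splits on any zero-valued gap and does not treat the header line specially, so how well it works on real handwriting depends on whether such gaps actually appear between characters.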