04-08-2012, 12:12 PM
Optical Character Recognition
OCR.docx (Size: 206.22 KB / Downloads: 40)
Preface
This seminar is a brief study of optical character recognition (OCR), covering the most common character recognition methods: template matching and correlation techniques, features derived from the statistical distribution of points, geometrical and topological features, the hybrid approach, and neural networks. It may be useful to anyone who wants basic information on the topic, since it describes the OCR implementation process and the classification process, the latter consisting of two steps: training and testing.
Some important concepts in optical character recognition
1. Concept of Optical Character Recognition (OCR)
1.1 - Optical Character Recognition (OCR)
OCR is a method of teaching computers to recognize characters using some form of recognition technique. It is sometimes applied to signature recognition, as used in banks, important companies, and other high-security buildings.
OCR is also known for its use in traffic-enforcement systems for reading speeders' license plates, among many other applications.
1.2 - How Optical Character Recognition Works
There are two basic methods used for OCR: matrix matching and feature extraction. Of the two, matrix matching is the simpler and more common.
Matrix Matching compares what the OCR scanner sees as a character with a library of character matrices or templates. When an image matches one of these prescribed matrices of dots within a given level of similarity, the computer labels that image as the corresponding ASCII character.
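The matrix matching idea described above can be sketched in a few lines. This is a minimal illustration, not a production OCR routine: the 5x5 bitmaps, the template library, and the 0.8 similarity threshold are all assumptions chosen for the example.

```python
# Minimal sketch of matrix (template) matching: an input glyph bitmap is
# compared pixel-by-pixel against a small library of character templates,
# and the best match above a similarity threshold wins.
# The 5x5 bitmaps here are illustrative, not a real OCR font.

TEMPLATES = {
    "I": ["11111", "00100", "00100", "00100", "11111"],
    "L": ["10000", "10000", "10000", "10000", "11111"],
}

def similarity(bitmap, template):
    """Fraction of pixels that agree between bitmap and template."""
    total = matches = 0
    for row_a, row_b in zip(bitmap, template):
        for a, b in zip(row_a, row_b):
            total += 1
            matches += (a == b)
    return matches / total

def classify(bitmap, threshold=0.8):
    """Label the bitmap with the best-matching template, or None."""
    best_label, best_score = None, 0.0
    for label, template in TEMPLATES.items():
        score = similarity(bitmap, template)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None

# A noisy "L" (one flipped pixel) still matches within the threshold.
noisy_l = ["10000", "10000", "10100", "10000", "11111"]
print(classify(noisy_l))  # → L
```

Note that the threshold is what makes the method tolerant of small amounts of noise while still rejecting images that match no template well.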
Feature Extraction is OCR without strict matching to prescribed templates. Also known as Intelligent Character Recognition (ICR), or Topological Feature Analysis, this method varies in how much "computer intelligence" is applied by the manufacturer. The computer looks for general features such as open areas, closed shapes, diagonal lines, line intersections, etc. This method is much more versatile than matrix matching. Matrix matching works best when the OCR system encounters a limited repertoire of type styles, with little or no variation within each style. Where the characters are less predictable, feature (topological) analysis is superior.
1.3 - OCR Fonts
A font is the term given to a set of characters, usually 0-9, A through Z, and a few special characters. Each character within a font has a defined, reproducible size and shape. For OCR, these are defined by ANSI, the American National Standards Institute.
OCR fonts, or characters, that can be read by the lower-speed, lower-cost systems discussed here require well-defined character shapes that are highly reproducible and designed to be both machine- and human-readable. These unique, well-defined character sets allow for greater accuracy.
1.4 - OCR Scanners
OCR reading devices fall into two fundamental categories: text input and data capture.
Text input devices are page readers or document scanners that scan entire documents or large portions of documents. The source data is entered with the intention of someone editing it during or after it is scanned. Text input devices have varying degrees of automation from hand fed to having automatic feeding, reading, sorting, and stacking capabilities.
Data Capture devices are designed to capture repetitive data and to perform formatting functions on the data as it is being entered. The data delivered from the scanner to the computer must be very accurate because it is entered without the intention of being edited later, so accuracy must be higher than for text input devices.
1.5 - Elements of a Successful OCR Application
The elements of a successful OCR installation include:
1.5.1 Proper Media
1.5.2 Forms Design
1.5.3 Data Integrity and Output Processing
1.5.4 OCR Reader
1.6 - Reasons for Using OCR
There are a number of reasons for choosing OCR scanning over other methods of data entry. Some of the more significant include:
- To reduce Data Entry Errors
- To Consolidate Data Entry
- To Handle Peak Loads
- Human Readable
- Can Be Used with Many Printing Techniques
- Scanning Corrections
2. Character Recognition Methods
2.1 - Template Matching and Correlation Techniques
In 1929, Tauschek obtained a patent on OCR in Germany; this is the first conceived idea of an OCR system. His approach was what is referred to in the literature as template matching. The template matching process can be roughly divided into two sub-processes: superimposing an input shape on a template, and measuring the degree of coincidence between the input shape and the template. The template that matches the unknown most closely provides the recognition. Two-dimensional template matching is very sensitive to noise and difficult to adapt to a different font. A variation of the template matching approach is to test only selected pixels and employ a decision tree for further analysis. The peephole method is one of the simplest methods based on this selected-pixel matching approach; here, the main difficulty lies in selecting an invariant, discriminating set of pixels for the alphabet. Moreover, from an artificial intelligence perspective, template matching has been ruled out as an explanation for human performance [1, 2].
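The peephole method mentioned above can be sketched as follows. The 3x3 glyphs and the three peephole positions are illustrative assumptions; the point is that a well-chosen handful of pixel positions can separate the whole alphabet without comparing full templates.

```python
# Sketch of the "peephole" method: instead of comparing whole templates,
# only a few hand-picked discriminating pixel positions are tested.
# The glyphs and peephole positions below are illustrative assumptions.

GLYPHS = {
    "T": ["111", "010", "010"],
    "L": ["100", "100", "111"],
    "I": ["010", "010", "010"],
}

# Peepholes: (row, col) positions whose values jointly separate the classes.
PEEPHOLES = [(0, 0), (0, 2), (2, 2)]

# Build a decision table: tuple of peephole values -> character label.
DECISION = {
    tuple(g[r][c] for r, c in PEEPHOLES): label
    for label, g in GLYPHS.items()
}

def classify(bitmap):
    key = tuple(bitmap[r][c] for r, c in PEEPHOLES)
    return DECISION.get(key)  # None if the peephole pattern is unseen

print(classify(["111", "010", "010"]))  # → T
```

The difficulty the text describes is visible even here: the chosen peepholes must yield a distinct value tuple for every character, and a poor choice (or a noisy pixel at a peephole position) breaks the decision table entirely.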
2.2 - Features Derived from the Statistical Distribution of Points
This technique is based on matching in feature planes or spaces distributed over an n-dimensional plane, where n is the number of features. This approach is referred to as the statistical or decision-theoretic approach, unlike template matching, where an input character is directly compared with a standard set of stored prototypes. Many samples of a pattern are used for collecting statistics; this phase is known as the training phase, and its objective is to expose the system to the natural variants of a character. The recognition process then uses these statistics for partitioning the feature space and identifying an unknown character. For instance, in the K-L (Karhunen-Loeve) expansion, one of the first attempts at statistical feature extraction, orthogonal vectors are generated from a data set: the covariance matrix is constructed and its eigenvectors are solved for, forming the coordinates of the given pattern space. Initially, the correlation was pixel-based, which led to a large number of covariance matrices. The approach was further refined to use class-based correlation instead of pixel-based correlation, which led to a more compact space. However, it was still very sensitive to noise and to variation in stroke thickness. To make the approach tolerant to variation and noise, a tree structure was used for making decisions, and multiple prototypes were stored for each class. Researchers have also used Fourier, Walsh, Haar, and Hadamard series expansions for classification.
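The K-L expansion step described above can be sketched in miniature: form the covariance matrix of some feature vectors and extract its dominant eigenvector, onto which samples are then projected. The 2-dimensional sample data below is purely illustrative (real character features would have many more dimensions), and power iteration stands in for a full eigensolver.

```python
# Minimal sketch of the K-L (Karhunen-Loeve) expansion idea: from a set
# of feature vectors, build the covariance matrix and extract its
# dominant eigenvector by power iteration; each sample is then described
# by its projection onto that axis. Data here is a toy 2-D example.

import math

samples = [(2.0, 1.9), (1.0, 1.1), (3.0, 3.2), (0.0, -0.1), (4.0, 3.9)]

# Mean-center the samples.
n = len(samples)
mx = sum(x for x, _ in samples) / n
my = sum(y for _, y in samples) / n
centered = [(x - mx, y - my) for x, y in samples]

# 2x2 covariance matrix.
cxx = sum(x * x for x, _ in centered) / n
cyy = sum(y * y for _, y in centered) / n
cxy = sum(x * y for x, y in centered) / n
cov = [[cxx, cxy], [cxy, cyy]]

# Power iteration for the dominant eigenvector (the first K-L axis).
v = [1.0, 1.0]
for _ in range(100):
    w = [cov[0][0] * v[0] + cov[0][1] * v[1],
         cov[1][0] * v[0] + cov[1][1] * v[1]]
    norm = math.hypot(w[0], w[1])
    v = [w[0] / norm, w[1] / norm]

def project(p):
    """A sample's coordinate along the principal K-L axis."""
    return (p[0] - mx) * v[0] + (p[1] - my) * v[1]

print([round(project(p), 2) for p in samples])
```

Since the toy samples lie almost on a line, the first axis captures nearly all of their variance, which is exactly why the expansion yields a compact description of the pattern space.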
2.3 - Geometrical and Topological Features
The classifier is expected to recognize the natural variants of a character while still discriminating between similar-looking characters such as ‘k’ – ‘ph’, ‘p’ - ‘Sh’, etc. These are conflicting requirements, which makes the classification task challenging. The structural approach has the capability of meeting them. Multiple prototypes are stored for each class to take care of the natural variants of the character. However, when the prototypes are generated automatically, a large number of prototypes per class is required to cover the natural variants. Alternatively, the descriptions may be handcrafted, and a suitable matching strategy incorporating expected variations is relied upon to yield the true class. The matching strategies include dynamic programming, tests for isomorphism, inexact matching, relaxation techniques, and many-to-one matching. Rocha used a conceptual model of variations and noise along with many-to-one mapping. Yet another class of structural approach is to use a phrase-structured grammar for the prototype descriptions and parse the unknown pattern syntactically using the grammar. Here the terminal symbols of the grammar are the stroke primitives and the non-terminals represent the pattern classes; the production rules give the spatial relationships of the constituent primitives.
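One concrete topological feature of the kind this section discusses is the number of enclosed loops in a glyph, which separates characters like C (0 holes), O (1 hole), and B (2 holes) regardless of font or stroke thickness. A minimal sketch, assuming binary string bitmaps and 4-connectivity:

```python
# Sketch of one topological feature: the number of enclosed holes in a
# glyph, found by flood-filling the background from the border and then
# counting the white regions left inside. The tiny bitmaps below are toy
# illustrations, not a real font.

from collections import deque

def count_holes(bitmap):
    rows, cols = len(bitmap), len(bitmap[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r0, c0):
        """Mark the 4-connected white region containing (r0, c0)."""
        q = deque([(r0, c0)])
        seen[r0][c0] = True
        while q:
            r, c = q.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols \
                        and not seen[nr][nc] and bitmap[nr][nc] == "0":
                    seen[nr][nc] = True
                    q.append((nr, nc))

    # Flood from every border background pixel: the "outside" region.
    for r in range(rows):
        for c in range(cols):
            if (r in (0, rows - 1) or c in (0, cols - 1)) \
                    and bitmap[r][c] == "0" and not seen[r][c]:
                flood(r, c)

    # Each remaining unvisited white region is an enclosed hole.
    holes = 0
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] == "0" and not seen[r][c]:
                holes += 1
                flood(r, c)
    return holes

O = ["111", "101", "111"]
C = ["111", "100", "111"]
print(count_holes(O), count_holes(C))  # → 1 0
```

Such a feature is invariant under the natural variants of a character, which is exactly the property the structural approach exploits; it must then be combined with other features to tell apart characters with the same loop count.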
2.4 - Hybrid Approach
The statistical approach and the structural approach both have their advantages and shortcomings. Statistical features are more tolerant to noise than structural descriptions (provided the sample space over which training has been performed is representative and realistic), whereas variation due to font or writing style can be more easily abstracted in structural descriptions. The two approaches are complementary in their strengths and have therefore been combined; the primitives ultimately have to be classified using a statistical approach. One way of combining them is to map variable-length, unordered sets of geometrical shapes to fixed-length numerical vectors. This hybrid approach has been used for omni-font, variable-size character recognition systems.
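The mapping from variable-length structural descriptions to fixed-length vectors can be sketched as a simple count vector over primitive types, followed by a statistical (here, nearest-neighbour) decision. The primitive names and reference descriptions below are illustrative assumptions, not a real character model.

```python
# Sketch of the hybrid idea: a variable-length, unordered set of
# structural primitives extracted from a glyph (strokes, arcs, loops...)
# is mapped to a fixed-length count vector that a statistical classifier
# can handle. Primitive names and reference vectors are assumptions.

PRIMITIVE_TYPES = ["vertical", "horizontal", "diagonal", "arc", "loop"]

def to_vector(primitives):
    """Map an unordered primitive multiset to a fixed-length count vector."""
    return [primitives.count(t) for t in PRIMITIVE_TYPES]

# Hypothetical reference descriptions for a few characters.
REFERENCES = {
    "L": to_vector(["vertical", "horizontal"]),
    "O": to_vector(["loop"]),
    "Z": to_vector(["horizontal", "diagonal", "horizontal"]),
}

def classify(primitives):
    """Nearest-neighbour decision in the fixed-length feature space."""
    v = to_vector(primitives)
    def dist(ref):
        return sum((a - b) ** 2 for a, b in zip(v, ref))
    return min(REFERENCES, key=lambda label: dist(REFERENCES[label]))

# An extra spurious primitive still lands nearest to "Z".
print(classify(["diagonal", "horizontal", "horizontal", "arc"]))  # → Z
```

The fixed-length vector is what lets noise-tolerant statistical machinery operate on font-robust structural descriptions, which is the stated motivation for the hybrid approach.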
2.5 - Neural Networks
In the beginning, character recognition was regarded as a problem that could be easily solved, but it turned out to be more challenging than most researchers in the field expected. The challenge still exists, and an unconstrained document recognition system matching human performance is still nowhere in sight. The performance of a system deteriorates very rapidly with deterioration in the quality of the input, or with the introduction of new fonts or handwriting styles; in other words, systems do not adapt easily to a changed environment. The training phase aims at exposing the system to a large number of fonts and their natural variants.
Neural networks are based on the theory of learning from known inputs. A back-propagation neural network is composed of several layers of interconnected elements, each of which computes an output that is a function of the weighted sum of its inputs. The weights are modified until the desired output is obtained. Neural networks have been employed for character recognition with varying degrees of success, and they have also been employed for integrating the results of multiple classifiers by adjusting weights to obtain the desired output.
The main weakness of systems based on neural networks is their poor generalization capability: there is always a chance of under-training or over-training the system. Besides this, a neural network does not provide a structural description, which is vital from an artificial intelligence viewpoint. The neural network approach has solved the problem of character classification no better than the earlier described approaches. Recent research results call for the use of multiple features and intelligent ways of combining them: the combination of potentially conflicting decisions by multiple classifiers should exploit the strengths of the individual classifiers, avoid their weaknesses, and improve classification accuracy.
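The back-propagation mechanism described above (each element computing a function of the weighted sum of its inputs, with weights modified until the desired output is obtained) can be sketched in pure Python. The two 3x3 toy glyphs, the network size, and the learning rate are all assumptions for illustration; a real OCR network would train on far larger data.

```python
# Minimal sketch of a back-propagation classifier: one hidden layer of
# sigmoid units, each computing a squashed weighted sum of its inputs;
# weights are adjusted by gradient descent until the desired output is
# obtained. The toy glyphs ("I" -> 0, "O" -> 1) are illustrative only.

import math, random

random.seed(1)
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))

def flatten(glyph):
    return [float(px) for row in glyph for px in row]

data = [
    (flatten(["010", "010", "010"]), 0.0),  # I
    (flatten(["111", "101", "111"]), 1.0),  # O
    (flatten(["010", "010", "011"]), 0.0),  # noisy I
    (flatten(["111", "101", "110"]), 1.0),  # noisy O
]

N_IN, N_HID = 9, 3
w_hid = [[random.uniform(-0.5, 0.5) for _ in range(N_IN + 1)]
         for _ in range(N_HID)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(N_HID + 1)]

def forward(x):
    h = [sigmoid(sum(w[i] * x[i] for i in range(N_IN)) + w[N_IN])
         for w in w_hid]
    y = sigmoid(sum(w_out[j] * h[j] for j in range(N_HID)) + w_out[N_HID])
    return h, y

lr = 1.0
for _ in range(2000):
    for x, target in data:
        h, y = forward(x)
        # Output-layer delta, then back-propagate to the hidden layer.
        d_out = (y - target) * y * (1 - y)
        d_hid = [d_out * w_out[j] * h[j] * (1 - h[j]) for j in range(N_HID)]
        for j in range(N_HID):
            w_out[j] -= lr * d_out * h[j]
            for i in range(N_IN):
                w_hid[j][i] -= lr * d_hid[j] * x[i]
            w_hid[j][N_IN] -= lr * d_hid[j]
        w_out[N_HID] -= lr * d_out

# Trained predictions on the training glyphs.
print([round(forward(x)[1], 2) for x, _ in data])
```

The weakness the text notes is visible even at this scale: the learned weights give no structural description of why a glyph is an "O", and too few or too many training passes (under- or over-training) both hurt accuracy on unseen variants.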
The intersection and union of decision regions are the two most obvious methods for classification combination.
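The two combination rules can be sketched as set operations on each classifier's candidate labels. The mocked candidate sets below are assumptions; in practice each set would come from a real classifier's decision regions.

```python
# Sketch of combining two classifiers by set operations on their
# outputs: each classifier returns a set of candidate labels,
# intersection keeps only labels both agree on, union keeps any label
# either proposes. The candidate sets here are mocked for illustration.

def combine(candidates_a, candidates_b, mode="intersection"):
    if mode == "intersection":
        return candidates_a & candidates_b   # both classifiers must agree
    return candidates_a | candidates_b       # either classifier suffices

a = {"O", "0", "Q"}   # e.g. a template matcher's candidates
b = {"O", "0", "D"}   # e.g. a feature-based classifier's candidates
print(sorted(combine(a, b)))                 # → ['0', 'O']
print(sorted(combine(a, b, mode="union")))   # → ['0', 'D', 'O', 'Q']
```

Intersection raises precision at the risk of rejecting everything; union raises the chance the true class survives but leaves more ambiguity to resolve, which matches the trade-off the text describes.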