26-07-2012, 03:38 PM
Developing proprietary OCR system
Developing proprietary OCR system is a complicated task and requires a lot of effort.docx (Size: 18.01 KB / Downloads: 21)
Developing proprietary OCR system is a complicated task and requires a lot of effort. Such systems usually are really complicated and can hide a lot of logic behind the code. The use of artificial neural network in OCR applications can dramatically simplify the code and improve quality of recognition while achieving good performance. Another benefit of using neural network in OCR is extensibility of the system – ability to recognize more character sets than initially defined. Most of traditional OCR systems are not extensible enough. Why? Because such task as working with tens of thousands Chinese characters, for example, is not as easy as working with 68 English typed character set and it can easily bring the traditional system to its knees!
Well, the Artificial Neural Network (ANN) is a wonderful tool that can help to resolve such kind of problems. The ANN is an information-processing paradigm inspired by the way the human brain processes information. Artificial neural networks are collections of mathematical models that represent some of the observed properties of biological nervous systems and draw on the analogies of adaptive biological learning. The key element of ANN is topology. The ANN consists of a large number of highly interconnected processing elements (nodes) that are tied together with weighted connections (links). Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true for ANN as well. Learning typically occurs by example through training, or exposure to a set of input/output data (pattern) where the training algorithm adjusts the link weights. The link weights store the knowledge necessary to solve specific problems.
Typewritten OCR
The accurate recognition of Latin-script, typewritten text is now considered largely a solved problem.
Recognition of hand printing, cursive handwriting, and even the printed typewritten versions of some other scripts (especially those with a very large number of characters), are still the subject of active research.
Hand print OCR
Systems for recognizing hand-printed text on the fly have enjoyed commercial success in recent years. Among these are the input device for personal digital assistants such as those running Palm OS. The Apple Newton pioneered this technology. The algorithms used in these devices take advantage of the fact that the order, speed, and direction of individual lines segments at input are known. Also, the user can be retrained to use only specific letter shapes. These methods cannot be used in software that scans paper documents, so accurate recognition of hand-printed documents is still largely an open problem. Accuracy rates of 80% to 90% on neat, clean hand-printed characters can be achieved, but that accuracy rate still translates to dozens of errors per page, making the technology useful only in very limited contexts. This variety of OCR is now commonly known in the industry as "ICR" (intelligent character recognition).
Cursive OCR
Recognition of cursive text is an active area of research, with recognition rates even lower than that of hand-printed text. Higher rates of recognition of general cursive script will likely not be possible without the use of contextual or grammatical information. For example, recognizing entire words from a dictionary is easier than trying to parse individual characters from script. Reading the Amount line of a cheque (which is always a written out number) is an example where using a smaller dictionary can increase recognition rates greatly. Knowledge of the grammar of the language being scanned can also help determine if a word is likely to be a verb or a noun, for example, allowing greater accuracy. The shapes of individual cursive characters themselves simply do not contain enough information to accurately (greater than 98%) recognize all handwritten cursive script.
Music OCR
Early research into recognition of printed sheet music was performed at the graduate level in the mid 1970's at MIT and other institutions. Successive efforts were made to localize and remove musical staff lines leaving symbols to be recognized and parsed. The first commercial music-scanning product, MIDISCAN, was released in 1991. Several commercial products are now available.
MICR
One area where accuracy and speed of computer input of character information exceeds that of humans is in the area of magnetic ink character recognition, where the error rates range around one read error for every 20,000 to 30,000 checks.
Other research areas
A particularly difficult problem for computers and humans is that of old church baptismal and marriage records containing mostly names. The pages may be damaged by age, water or fire and the names may be obsolete or contain rare spellings. Another research area is cooperative approaches, where computers assist humans and vice-versa. Computer image processing techniques can assist humans in reading extremely difficult texts such as the Archimedes Palimpsest or the Dead Sea Scrolls.