26-03-2012, 01:14 PM
New Challenges of Character Recognition
Introduction-OCR-2008.pdf (Size: 2.41 MB / Downloads: 113)
Taxonomy of OCR
An OCR (Optical Character Recognition) system translates images of text
into machine-editable text.
- Traditional OCR of scanned documents
Low-quality, low-resolution text recognition:
- Camera-based OCR
- Embedded-text OCR
- Screen-rendered text OCR
OCR
OCR needs techniques from two areas:
- Image processing
  - Improve image quality
  - Correct skew/slant errors
  - Normalize character size
  - Localize text by layout analysis
  - Separate text from background
  - Letter/word segmentation
- Pattern classification
  (One of the most successful applications)
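
To make the two areas concrete, here is a minimal sketch that is not taken from the slides. It assumes a character is already segmented and binarized into a small bitmap (1 = ink). It normalizes the character size with a nearest-neighbour rescale, then classifies it by Hamming distance to a few hand-made templates. The names `normalize_size`, `classify` and `TEMPLATES`, and the 4x4 grid size, are illustrative assumptions; real systems use far richer features and classifiers.

```python
from typing import Dict, List

Bitmap = List[List[int]]  # 1 = ink, 0 = background


def crop_to_ink(img: Bitmap) -> Bitmap:
    """Trim empty rows and columns around the character."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:  # blank image: nothing to crop
        return [[0]]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def normalize_size(img: Bitmap, size: int = 4) -> Bitmap:
    """Image-processing step: rescale the cropped glyph to size x size
    by nearest-neighbour sampling (aspect ratio is not preserved)."""
    glyph = crop_to_ink(img)
    h, w = len(glyph), len(glyph[0])
    return [[glyph[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Number of pixels on which two equally sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def classify(img: Bitmap, templates: Dict[str, Bitmap], size: int = 4) -> str:
    """Pattern-classification step: label of the closest template."""
    norm = normalize_size(img, size)
    return min(templates, key=lambda label: hamming(norm, templates[label]))


# Hypothetical 4x4 templates, invented for this illustration only.
TEMPLATES: Dict[str, Bitmap] = {
    "L": [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 1, 1]],
    "T": [[1, 1, 1, 1], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
    "O": [[1, 1, 1, 1], [1, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 1]],
}

# An 8x8 "L" drawn with thick strokes, padded with a blank border.
glyph = [[1, 1, 0, 0, 0, 0, 0, 0] for _ in range(6)] + [[1] * 8 for _ in range(2)]
sample = [[0] * 10] + [[0] + row + [0] for row in glyph] + [[0] * 10]

print(classify(sample, TEMPLATES))
```

The split mirrors the list above: everything up to `normalize_size` is image processing, and only `classify` is pattern classification, so either half can be swapped out independently.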
Scanner-Based OCR
Translate images of handwritten or printed text captured by
a scanner into machine-editable text
- Noise filtering
- Text localization (layout analysis and text segmentation)
- Binarization
- Skew detection and correction
- Slant detection and correction (handwritten text)
- Character scaling
- Segmentation into letters/words
- Pattern classification
- Contextual error correction
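
As an illustration of one step in the list above, binarization, the sketch below picks a global threshold with Otsu's method and maps dark pixels to ink. The slides do not say which binarization method is used, so Otsu is an assumption chosen because it is a common default for clean scans. The function names are hypothetical, and 8-bit greyscale input with dark text on a light background is assumed.

```python
from typing import List, Optional


def otsu_threshold(pixels: List[int]) -> int:
    """Return the grey level t (0-255) that maximizes the between-class
    variance when pixels <= t form one class and pixels > t the other."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of grey levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # light class empty: no split possible
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(image: List[List[int]],
             threshold: Optional[int] = None) -> List[List[int]]:
    """Map a greyscale image to 1 (ink, dark) / 0 (background, light)."""
    if threshold is None:
        threshold = otsu_threshold([p for row in image for p in row])
    return [[1 if p <= threshold else 0 for p in row] for row in image]


# Tiny made-up image: two dark pixels on a light background.
page = [[20, 210],
        [230, 30]]
print(binarize(page))
```

A single global threshold like this suits evenly lit scanner output; images with uneven lighting generally need a locally adaptive threshold instead.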
Camera-Based OCR
The increasing availability of high-performance,
low-priced, portable digital imaging devices
(cellular phones, PDAs, etc.) has created
tremendous opportunity for supplementing
traditional scanning for document image
acquisition.
Traditional scanner-based document analysis
techniques provide us with a good reference and
starting point, but they cannot be used directly
on camera-captured images.