22-08-2014, 10:26 AM
AN ADVANCED PREPROCESSING TECHNIQUE FOR OPTICAL CHARACTER RECOGNITION IN DEGRADED DOCUMENT IMAGE
Abstract
To segment the text from badly degraded document images
The Image Binarization Technique
- Noise reduction
- Adaptive Image contrast
- Edge detection
- Local threshold estimation
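The last step in the list above, local thresholding, can be sketched as follows. This is a minimal illustration only: the slides do not state the threshold rule, so the mean-minus-offset rule, the function name `local_mean_threshold`, and the parameters `radius` and `offset` are all assumptions.

```python
def local_mean_threshold(image, radius=1, offset=10):
    """Binarize a grayscale image given as a list of rows (values 0-255).

    A pixel is marked as text (1) when it is darker than the mean of its
    neighborhood minus `offset`; otherwise it is background (0).
    The mean-minus-offset rule is an assumption, not the slides' rule.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighborhood, clipped at the image borders.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            result[y][x] = 1 if image[y][x] < local_mean - offset else 0
    return result


page = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = local_mean_threshold(page)
```

Because the threshold is computed per neighborhood, a dark stroke is still detected when the background brightness drifts across the page, which a single global threshold may miss.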
Introduction
Document Image Binarization (DIB) segments the foreground text from the document background.
It is done in the pre-processing stage of document analysis.
It is the main component of Optical Character Recognition (OCR).
To handle high variation both between documents and within a single document
To handle different documents with minimum parameter tuning
To tolerate text and background variation by using:
- Local image contrast
- Local image gradient
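A rough sketch of how the two local measures above might be computed and combined is given below. The 3x3 window, the weighted sum, the weight `alpha`, and the function name `adaptive_contrast` are assumptions made for illustration; the slides do not give the exact formula.

```python
def adaptive_contrast(image, alpha=0.5, eps=1e-6):
    """Combine local contrast and local gradient for each pixel.

    image: list of rows with grayscale values 0-255.
    The weighted sum and the 3x3 window are assumptions, not the slides' formula.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # 3x3 neighborhood, clipped at the image borders.
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            i_max, i_min = max(window), min(window)
            gradient = i_max - i_min                      # local image gradient
            contrast = gradient / (i_max + i_min + eps)   # normalized local contrast
            # Scale the gradient to 0..1 so both terms are comparable.
            out[y][x] = alpha * contrast + (1 - alpha) * gradient / 255
    return out


stroke = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
contrast_map = adaptive_contrast(stroke)
```

The normalized contrast term compensates for uneven background brightness, while the gradient term keeps strong stroke edges prominent on bright backgrounds.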
Existing System
- Post-processing is applied to improve the document binarization quality.
- Disconnected pieces of text are connected during post-processing, so the binarized output is more reliable.
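As an illustration of the kind of post-processing described above, the sketch below bridges one-pixel gaps in a binarized image. The gap-filling rule and the function name `bridge_single_pixel_gaps` are assumptions; the slides do not say how the existing system connects disconnected text.

```python
def bridge_single_pixel_gaps(binary):
    """Fill background pixels that sit between two text pixels.

    binary: list of rows, 1 = text, 0 = background.
    A background pixel is filled when its left and right neighbors, or its
    upper and lower neighbors, are both text. This rule is an assumption.
    """
    height, width = len(binary), len(binary[0])
    result = [row[:] for row in binary]  # copy so the input is not modified
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 1:
                continue
            horizontal = (
                0 < x < width - 1
                and binary[y][x - 1] == 1
                and binary[y][x + 1] == 1
            )
            vertical = (
                0 < y < height - 1
                and binary[y - 1][x] == 1
                and binary[y + 1][x] == 1
            )
            if horizontal or vertical:
                result[y][x] = 1
    return result


broken_stroke = [[1, 0, 1, 1]]
repaired = bridge_single_pixel_gaps(broken_stroke)
```

Neighbors are read from the input image rather than from the partly filled result, so a filled pixel never triggers further filling in the same pass.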
Proposed System
A new method is proposed for segmenting text from damaged documents.
Degraded images are affected by noisy data:
- High peak signal-to-noise ratio (PSNR).
- Bottle means Sequences error
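For reference, PSNR is conventionally computed from the mean squared error between a reference image and the image under test. The sketch below uses that standard definition; the function names and the 8-bit maximum value of 255 are assumptions, since the slides do not state how the measure is evaluated.

```python
import math


def mean_squared_error(reference, candidate):
    """Average squared pixel difference between two same-sized images."""
    total = 0
    count = 0
    for ref_row, cand_row in zip(reference, candidate):
        for r, c in zip(ref_row, cand_row):
            total += (r - c) ** 2
            count += 1
    return total / count


def psnr(reference, candidate, max_value=255):
    """Peak signal-to-noise ratio in decibels (assumes 8-bit pixel values)."""
    mse = mean_squared_error(reference, candidate)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


clean = [[0, 255]]
noisy = [[255, 255]]
score = psnr(clean, noisy)
```

Higher values indicate that the tested image is closer to the reference.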
Fuzzy noise detection should be applied to remove the noise.
The noise should be removed before the binarization process.
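The slides do not describe the fuzzy noise detector itself, so the sketch below uses a plain median filter as a stand-in to show where noise removal fits before binarization. It is not the fuzzy method; the function name `median_filter` and the `radius` parameter are assumptions.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace each pixel by the median of its neighborhood.

    A simple stand-in for the noise-removal step; this is NOT the fuzzy
    noise detection named in the slides.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Neighborhood clipped at the image borders.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low always returns a value that occurs in the window.
            out[y][x] = median_low(window)
    return out


speckled = [
    [100, 100, 100],
    [100, 0, 100],
    [100, 100, 100],
]
denoised = median_filter(speckled)
```

An isolated dark speck is replaced by the surrounding background value, so it is not later mistaken for text by the thresholding step.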
Wavelet-based image enhancement should be done before document binarization.
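One simple way to picture wavelet-based enhancement is a single-level Haar decomposition of an image row in which the detail coefficients are amplified before reconstruction. The choice of the Haar wavelet, the `detail_gain` parameter, and the function name `haar_enhance_row` are assumptions; the slides do not specify the wavelet or the enhancement rule.

```python
def haar_enhance_row(row, detail_gain=1.5):
    """Sharpen one image row with a single-level Haar transform.

    Each pixel pair is split into an average (approximation) and a half
    difference (detail); the detail is scaled and the pair is rebuilt.
    Haar and the gain value are assumptions, not the slides' method.
    """
    if len(row) % 2 != 0:
        raise ValueError("row length must be even")
    enhanced = []
    for k in range(0, len(row), 2):
        a, b = row[k], row[k + 1]
        average = (a + b) / 2            # approximation coefficient
        detail = (a - b) / 2             # detail coefficient
        detail *= detail_gain            # amplify local differences
        for value in (average + detail, average - detail):
            # Clip to the valid 8-bit range.
            enhanced.append(max(0.0, min(255.0, value)))
    return enhanced


sharpened = haar_enhance_row([10, 20, 30, 30], detail_gain=2.0)
```

With `detail_gain=1.0` the row is reconstructed unchanged, so the gain alone controls how strongly local intensity differences are emphasized.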
The proposed document binarization produces more accurate results than existing methods.
Conclusion:
Simple and robust; only a few parameters are involved.
It works on different kinds of degraded images.