12-09-2013, 12:18 PM
Text Super-Resolution and Deblurring using Multiple Support Vector Regression
Introduction
In a world that is increasingly digital, there is a significant push to digitize documents of
cultural or intellectual value. A large portion of the data to be digitized is text, which poses
some challenges to massive digitization efforts. The problem has historically been solved
either with expensive or complex equipment, such as the document scanners employed by
Google, or by crowdsourcing the transcription of text that is too low-quality to be transcribed
via optical character recognition (OCR), e.g. reCAPTCHA. In order to make scanning large
quantities of text cost-effective, the price and complexity of the equipment would need to be
reduced significantly. The lower-quality equipment would produce lower-resolution images,
and thus, in order to have near-perfect transcription we would require robust methods to
super-resolve the low-resolution scans into something that could easily be interpreted by
OCR software.
Previous Work
Image super-resolution consists of producing a clearer version of a low-resolution image,
which is quite different from interpolation methods, which aim to enlarge and often smooth
an image. Linear or even cubic-spline interpolation creates a softer image that may be good
for certain applications, but is poorly suited to, for example, enhancing text images. Local
pixel information alone is not enough to produce a higher-resolution version of an image,
and in general super-resolution is an ill-posed problem: many distinct high-resolution images
could generate the same low-resolution image after blurring or
downsampling. Most machine learning approaches to super-resolution involve a training set
of low- and high-resolution image pairs to train an algorithm to predict a clearer version of
an image. Additionally, the high correlation of neighboring pixels in natural images is often
utilized by image-enhancing algorithms. One of the most successful algorithms for image
super-resolution is belief propagation, where an image’s high-resolution equivalent is treated
as a Markov network [9].
Methodology
Overview
Our algorithm for deblurring and super resolving an image is based on the one outlined in
[4]. Given a blurred, low-resolution image of text, our goal is to create a higher-resolution
image where the text has defined edges and is more comprehensible to both humans and
OCR software. More precisely, we have as input an image x ∈ R^n, and we wish to find
an estimate of a higher-resolution image y ∈ R^m, where m = 4^r n for some r ∈ N, that
contains clearly rendered text and whose blurred, downsampled counterpart could, with high
probability, have generated x.
In this project, we only deal with the case of creating an estimate of an image with 2x
the resolution of the low-resolution input image. We do this by using an N × N block of the
low-resolution image to generate an estimate of the 4 pixels of the high-resolution image
corresponding to the center of the N × N low-resolution patch, i.e. m = 4n (Figure 1).
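The patch-to-block mapping above can be sketched as follows. The project code was written in MATLAB, but this Python sketch illustrates the idea; the patch size N = 5 is an assumed value, and the MeanPredictor stub is a hypothetical stand-in for one of the four trained SVRs (anything with a scikit-learn-style predict method would slot in).

```python
import numpy as np

def extract_patches(img, N):
    """Slide an N x N window (stride 1) over the low-resolution image;
    each flattened patch becomes one regression input vector."""
    H, W = img.shape
    r = N // 2
    return np.array([img[i - r:i + r + 1, j - r:j + r + 1].ravel()
                     for i in range(r, H - r)
                     for j in range(r, W - r)])

class MeanPredictor:
    """Hypothetical stand-in for one trained SVR: predicts each output
    pixel as the mean of its input patch."""
    def predict(self, X):
        return X.mean(axis=1)

def super_resolve(img, predictors, N=5):
    """Estimate a 2x image: each N x N low-res patch maps to the 2x2
    high-res block at the patch center, so m = 4n overall.
    `predictors` holds 4 fitted regressors, one per output sub-pixel."""
    H, W = img.shape
    r = N // 2
    X = extract_patches(img, N)
    preds = [p.predict(X) for p in predictors]  # 4 vectors, one per sub-pixel
    hi = np.zeros((2 * H, 2 * W))
    k = 0
    for i in range(r, H - r):
        for j in range(r, W - r):
            hi[2 * i,     2 * j]     = preds[0][k]
            hi[2 * i,     2 * j + 1] = preds[1][k]
            hi[2 * i + 1, 2 * j]     = preds[2][k]
            hi[2 * i + 1, 2 * j + 1] = preds[3][k]
            k += 1
    return hi
```

Note that border pixels of the high-resolution estimate stay zero in this sketch; in practice the low-resolution image would be padded so every output pixel is covered.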
Parameter Search
A variety of parameters were used to train the SVRs in an attempt to quantify their effect on
the resultant super-resolved images; each combination of parameters is represented in Figure
5. In this section, all of our testing consisted of training on a single low- and high-resolution
image pair representing a ‘c’, whose low-resolution image had been blurred with a Gaussian
filter of variance 30 pixels. We then applied the four trained SVRs to predict
super-resolved versions of low-resolution images of ‘c’, ‘o’ and ‘w’, all of which had the
same level of blurring as the ‘c’ we trained on. Testing on ‘c’ logically results in very good
reconstruction. We chose ‘o’ as it is similar to ‘c’, and we therefore expected reasonably good
reconstruction, on par with what Figure 5 shows. Finally, we chose ‘w’ because it does not
resemble ‘c’: it consists entirely of diagonal edges with no curves. As expected, predicting
on ‘w’ produced the worst results.
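The experimental pipeline above (blur a glyph image, then score each reconstruction against the ground truth by PSNR) can be sketched roughly as below. This is not the project’s MATLAB code: the 3σ kernel truncation, the 2×2 block-average downsampler, and the function names are all assumptions for illustration.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Blur with a separable Gaussian kernel truncated at 3 sigma,
    standing in for the Gaussian filter used in the experiments."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    pad = np.pad(img, radius, mode='edge')
    # separable filtering: horizontal pass over rows, then vertical over columns
    tmp = np.apply_along_axis(lambda row: np.convolve(row, k, mode='valid'), 1, pad)
    return np.apply_along_axis(lambda col: np.convolve(col, k, mode='valid'), 0, tmp)

def downsample(img):
    """Average each 2x2 block to form the low-resolution counterpart."""
    H, W = img.shape
    return img.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio used to score the reconstructions."""
    mse = np.mean((ref - est) ** 2)
    return 10 * np.log10(peak ** 2 / mse)
```

With the variance of 30 pixels used in the experiments, the blur step would be called as gaussian_blur(img, 30 ** 0.5).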
Individual Contributions
Sean wrote code to break image pairs down into a format suitable for training our SVRs in
MATLAB, and he generated the data to compare the PSNR of predicted images given different
SVR parameters. Roy mainly developed and debugged the code for SVR in MATLAB.
Ross created the training set for the SVR. Roy and Ross wrote up the final report.