16-11-2012, 03:36 PM
Simple Probabilistic Predictions for Support Vector Regression
Simple Probabilistic Predictions.pdf (Size: 182.46 KB / Downloads: 39)
Abstract
Support vector regression (SVR) has been popular in the past decade, but it provides only an estimated target
value instead of predictive probability intervals. Much work has addressed this issue, but sometimes the SVR formulation
must be modified. This paper presents a rather simple and direct approach to constructing such intervals. We assume that
the conditional distribution of the target value depends on its input only through the predicted value, and propose to
model this distribution with simple functions. Experiments show that the proposed approach yields predictive intervals
whose coverage is competitive with that of Bayesian SVR methods.
INTRODUCTION
In the past decade support vector regression (SVR) [15], [12] has been popular for regression problems. SVR
provides only an estimated target value; however, the statement that the future value falls in an interval with a
specified probability is more informative. This paper aims to construct predictive intervals for the future values.
For conventional linear regression, prediction intervals are well developed; see, for example, [16] for the
Gaussian noise case and [3], [14] for the non-Gaussian case. SVR differs from conventional regression in that it maps
input data into a high-dimensional reproducing kernel Hilbert space and uses an ε-insensitive loss function. As a
result, SVR yields a sparse representation of the solution and is therefore relatively fast in training and testing. However, due to
these differences, existing methods for constructing prediction intervals cannot be applied. Recently, Bayesian
interpretations of SVR have been developed [6], [4], [2], along the lines of Bayesian techniques for neural networks
[8] and for SVM classification [13], [11]. Within a Bayesian framework, one can determine the parameters of SVR by
maximizing an evidence function and, at the same time, derive an error bar for the prediction.
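As a quick illustration of the ε-insensitive loss mentioned above (the function and parameter names here are our own, not from the paper):

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """The epsilon-insensitive loss used by SVR (illustrative sketch).

    Errors inside the epsilon tube incur zero loss; beyond it, the loss
    grows linearly. This flat region is what makes many training points
    contribute nothing to the solution, giving SVR its sparsity.
    """
    return max(0.0, abs(y_true - y_pred) - eps)
```

For example, a prediction within 0.1 of the target costs nothing, while a prediction off by 0.3 costs 0.2.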
CONCLUSIONS
In this paper, we propose a simple approach for probabilistic prediction suitable for the standard SVR. Our
approach starts with generating out-of-sample residuals by cross validation, and then fits the residuals by simple
parametric models like Gaussian and Laplace. The most powerful scale-invariant test is applied to effectively test
Gaussian against Laplace. We then compare our approach with Bayesian SVR methods by evaluating the performance of
the prediction intervals. Experiments on real-world problems show that our simple approach works fairly well and
is robust to parameter selection strategies. Moreover, in certain cases we can further improve our approach
by re-estimating the scale parameter of the Laplace family. In summary, though we assume that the distribution of
the target value depends on its input only through the predicted value, the proposed approach easily provides some
useful probability information for SVR analysis.
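The pipeline summarized above can be sketched in a few lines. This is a hedged illustration under our own choices (synthetic data, scikit-learn's SVR, 5-fold cross-validation, and model selection by log-likelihood as a simple stand-in for the paper's scale-invariant test), not the authors' exact implementation:

```python
import numpy as np
from scipy.stats import laplace, norm
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVR

# Synthetic regression data with Laplace noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.laplace(scale=0.1, size=200)

svr = SVR(kernel="rbf", C=10.0, epsilon=0.05)

# Step 1: out-of-sample residuals from 5-fold cross-validation.
resid = y - cross_val_predict(svr, X, y, cv=5)

# Step 2: fit zero-mean Gaussian and Laplace models to the residuals.
sigma = np.sqrt(np.mean(resid ** 2))   # Gaussian MLE of the scale
b = np.mean(np.abs(resid))             # Laplace MLE of the scale
ll_gauss = norm.logpdf(resid, scale=sigma).sum()
ll_laplace = laplace.logpdf(resid, scale=b).sum()
noise = laplace(scale=b) if ll_laplace > ll_gauss else norm(scale=sigma)

# Step 3: train on all data; a symmetric 95% prediction interval is the
# point prediction shifted by the fitted noise model's quantiles.
svr.fit(X, y)
pred = svr.predict(np.array([[0.5]]))
half = noise.ppf(0.975)                # half-width of the 95% interval
lower, upper = pred - half, pred + half
```

Note that the residuals come from cross-validation rather than the training fit, so the interval width reflects out-of-sample error rather than the optimistic in-sample residuals.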