21-04-2014, 12:37 PM
ESTIMATING THE SQUARE ROOT OF A DENSITY VIA COMPACTLY SUPPORTED WAVELETS
INTRODUCTION
A large body of nonparametric statistical literature is devoted to density estimation.
Overviews are given in Silverman (1986) and Izenman (1991). This paper addresses
the problem of univariate density estimation in a novel way. Our approach falls in the
class of so-called projection estimators, introduced by Cencov (1962). The orthonormal
basis used is a basis of compactly supported wavelets from Daubechies' family.
Kerkyacharian and Picard (1992, 1993), Donoho et al. (1996), and Delyon and Juditsky
(1993), among others, applied wavelets in density estimation. The local nature of
wavelet functions makes the wavelet estimator superior to projection estimators that
use classical orthonormal bases (Fourier, Hermite, etc.).
Instead of estimating the unknown density directly, we estimate the square root of
the density, which enables us to control the positivity and the L1-norm of the density
estimate. However, this approach requires a pre-estimator of the density in order to
calculate the sample wavelet coefficients. We describe VISUSTOP, a data-driven procedure
for determining the maximum number of levels in the wavelet density estimator.
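The square-root idea can be illustrated with a minimal sketch. Everything in it is an
assumption made for illustration and not the paper's implementation: it uses Haar scaling
functions (the simplest compactly supported member of Daubechies' family) at a single
level on [0, 1), a histogram as the pre-estimator, and hypothetical function names. The
coefficient of the square root of the density with respect to a basis function is an
expectation of that basis function divided by the square root of the density, so the
sample version plugs the pre-estimator into the denominator. Rescaling the coefficients
to unit sum of squares then makes the squared estimate integrate to 1.

```python
import math
import random


def haar_phi(x, level, k):
    """Haar scaling function: 2^(level/2) on [k/2^level, (k+1)/2^level), else 0."""
    width = 2.0 ** (-level)
    return 2.0 ** (level / 2.0) if k * width <= x < (k + 1) * width else 0.0


def histogram_pre_estimator(data, n_bins):
    """Crude pre-estimate of a density on [0, 1): a histogram (illustrative choice)."""
    counts = [0] * n_bins
    for x in data:
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    heights = [c * n_bins / len(data) for c in counts]
    return lambda x: heights[min(int(x * n_bins), n_bins - 1)]


def sqrt_density_estimate(data, level, pre):
    """Estimate sqrt(f) in a Haar basis, then square it to get a density estimate."""
    n = len(data)
    coefs = []
    for k in range(2 ** level):
        # Sample version of E[phi_k(X) / sqrt(f(X))], with f replaced by `pre`.
        coefs.append(sum(haar_phi(x, level, k) / math.sqrt(pre(x)) for x in data) / n)
    # Rescale to unit sum of squares, so the squared estimate integrates to 1.
    norm = math.sqrt(sum(c * c for c in coefs))
    coefs = [c / norm for c in coefs]

    def density(x):
        g = sum(c * haar_phi(x, level, k) for k, c in enumerate(coefs))
        return g * g  # a square is never negative

    return coefs, density


random.seed(0)
data = [random.betavariate(2, 5) for _ in range(200)]  # synthetic sample in (0, 1)
level = 4
pre = histogram_pre_estimator(data, n_bins=8)
coefs, density = sqrt_density_estimate(data, level, pre)
```

Because the basis is orthonormal, the integral of the squared expansion equals the sum of
the squared coefficients, which is why a single rescaling step is enough.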
Wavelet Density Estimators
We illustrate the wavelet density estimators on an astronomy example. Roeder (1990)
provides the data set and uses it to exemplify her theoretical results. Here is a brief
description of the data set.
According to the Big Bang theory, matter in the universe expanded at a tremendous
rate. Gravitational forces caused the formation of galaxies. Astronomers speculate that
gravitational pull led to clustering of galaxies, and research indicates the presence of
superclusters of galaxies surrounded by large voids (string-and-filament pattern). Measurements
have recently become available for the distances between our galaxy and others. Distance
is estimated by the red shift in the light spectrum in a fashion similar to how the Doppler
effect measures the changes in speed via changes in sound. Under the expanding-universe
paradigm, the galaxies furthest from ours must be moving at the greatest velocities,
because the distances and velocities are proportional. If, in reality, the galaxies are clumped,
the velocities should have a multimodal distribution, each mode corresponding to a cluster.
In the region of Corona Borealis the velocities of 82 galaxies were measured. The relative
measurement error is believed to be smaller than 0.5%. The data (galaxy) are given in the
table below.
Conclusion
In the paper we developed and implemented a wavelet-based algorithm for univariate density
estimation. Because we estimated the square root of the density first, we were able to guarantee
that the estimate integrates to 1 and is nonnegative. The estimator is adaptive in locality
and smoothness. An effective method for selecting the number of levels, based on the empirical
scalogram of energies in the wavelet domain, was developed.
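As a rough illustration of what a scalogram of energies is, the sketch below computes the
energy (sum of squared detail coefficients) at each level of a Haar wavelet decomposition.
It is an assumption-laden example: it works on a generic vector whose length is a power of
two, the function name is hypothetical, and it does not reproduce the level-selection rule
itself, which this excerpt does not spell out.

```python
import math


def haar_level_energies(signal):
    """Return (energies, coarse), with energies ordered from coarsest to finest level.

    `signal` must have a length that is a power of two. Each pass halves the
    approximation and records the energy of the detail coefficients it removes.
    """
    approx = list(signal)
    energies = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    energies.reverse()  # the last pass produced the coarsest level
    return energies, approx[0]


signal = [4.0, 2.0, 5.0, 5.0, 1.0, 1.0, 0.0, 2.0]
energies, coarse = haar_level_energies(signal)
```

Since the transform is orthonormal, the level energies plus the squared coarse coefficient
add up to the total energy of the input, so the scalogram shows how that total is split
across scales.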
A modification of the Daubechies-Lagarias algorithm was proposed for fast and precise
evaluation of the wavelet basis function with indices j and k at a given point. The software
S-Plus provided an excellent computational environment for the practical implementation
of the method.
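For readers unfamiliar with the Daubechies-Lagarias algorithm, here is a minimal sketch of
its standard form (not the modification proposed in the paper, which this excerpt does not
describe). It evaluates the Daubechies scaling function at a point through a product of two
small matrices selected by the binary digits of the fractional part of the point. The
choice of the 4-tap filter, the number of digits, and the names phi and phi_jk are
assumptions made for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)
# Daubechies 4-tap low-pass filter (two vanishing moments); entries sum to sqrt(2).
H = [(1 + SQRT3) / (4 * SQRT2), (3 + SQRT3) / (4 * SQRT2),
     (3 - SQRT3) / (4 * SQRT2), (1 - SQRT3) / (4 * SQRT2)]
N = len(H) - 1  # the scaling function is supported on [0, N]


def _h(k):
    """Filter tap k, zero outside the filter's index range."""
    return H[k] if 0 <= k < len(H) else 0.0


# T[0][i][j] = sqrt(2) * h(2i - j - 1) and T[1][i][j] = sqrt(2) * h(2i - j),
# with i, j running over 1..N in the formulas.
T = [[[SQRT2 * _h(2 * i - j - 1 + d) for j in range(1, N + 1)]
      for i in range(1, N + 1)] for d in (0, 1)]


def _matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(N)) for j in range(N)]
            for i in range(N)]


def phi(x, n_digits=60):
    """Approximate the scaling function at x from n_digits binary digits of frac(x)."""
    if x <= 0.0 or x >= N:
        return 0.0  # outside (or on the edge of) the support
    shift = int(math.floor(x))
    frac = x - shift
    prod = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
    for _ in range(n_digits):
        frac *= 2.0
        digit = int(frac)  # next binary digit of the fractional part
        frac -= digit
        prod = _matmul(prod, T[digit])
    # Row `shift` of the limit matrix is constant and equals phi(frac + shift);
    # averaging the row smooths out the truncation error.
    return sum(prod[shift]) / N


def phi_jk(x, j, k):
    """Dilated and translated scaling function 2^(j/2) * phi(2^j x - k)."""
    return 2.0 ** (j / 2.0) * phi(2.0 ** j * x - k)
```

The appeal of this scheme is that a value at an arbitrary point costs only a few dozen
3-by-3 matrix products, with no need to tabulate the function on a grid beforehand.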