13-06-2012, 04:46 PM
Short-term prediction models for server management in Internet-based contexts
Introduction
Runtime algorithms that manage complex Internet-based
infrastructures need adequate support for their decisions. The
common practice of characterizing the statistical properties of the
external workload reaching Internet-based systems (e.g., heavy-tailed
distributions [3,4,9], burst arrivals [19], hot spots [5]) does not help
these algorithms deal with the complexity and mostly unknown
statistics of the system internals. Indeed, models that evaluate
system performance mainly from the viewpoint of external traffic
are useless for estimating and anticipating the precise state of an
internal resource. Consequently, adequate runtime decisions require
the ability to capture the state of modern Internet applications
in terms of internal system scenarios consisting of numerous I/O
streams, timing information, and interactive concurrent tasks that
enter and leave the system in a way that is difficult to predict.
Related work
Several external and internal system factors make
the problem of supporting runtime management decisions in
Internet-based contexts interesting, difficult, and not yet deeply investigated.
While external workload factors are well known, internal
system factors have received less attention. The complexity of the
middleware and applications for the support of Internet applications
is continuously increasing, but no adequate characterization of the
internal behavior of these systems has been presented yet. In other
parallel and distributed applications [1,5,10], resource measures
are a valid basis for deciding where the system is, where the
system is going, and whether some management process needs to be
activated. While a measure offers an instantaneous view of the load
conditions of a resource, in Internet-based contexts, it is of little help
for distinguishing overload conditions from transient peaks, for
understanding load trends and for anticipating future conditions.
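To illustrate why a single instantaneous measure cannot separate a transient peak from a real overload, the following Python sketch compares a threshold applied to raw samples with the same threshold applied to the mean of the most recent samples. The trace, the 0.8 threshold and the window of 3 are hypothetical values chosen for the example and do not come from the paper.

```python
from collections import deque

def raw_alarms(samples, threshold):
    """Indices where a single instantaneous sample exceeds the threshold."""
    return [i for i, s in enumerate(samples) if s > threshold]

def smoothed_alarms(samples, threshold, window):
    """Indices where the mean of the last `window` samples exceeds the threshold."""
    recent = deque(maxlen=window)
    alarms = []
    for i, s in enumerate(samples):
        recent.append(s)
        if len(recent) == window and sum(recent) / window > threshold:
            alarms.append(i)
    return alarms

# Hypothetical utilization trace: one transient peak (index 2),
# then a sustained overload starting at index 5.
trace = [0.3, 0.35, 0.95, 0.3, 0.32, 0.9, 0.92, 0.95, 0.93, 0.94]

print(raw_alarms(trace, 0.8))          # [2, 5, 6, 7, 8, 9]
print(smoothed_alarms(trace, 0.8, 3))  # [7, 8, 9]
```

The raw threshold fires on the isolated peak at index 2, while the windowed version ignores it but reports the sustained overload two samples late. This is the same noise-versus-delay trade-off that any smoothing of the measures has to face.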
Statistical analysis
Management and coordination of Internet-based servers are
usually carried out through several algorithms that take on-the-fly
decisions on the basis of continuous information related to the state of
the internal system components. These management algorithms
could benefit greatly from evaluations and short-term predictions
about the state of internal resources. Unfortunately, it is very difficult
to model the complexity of internal server interactions without
penalizing some realistic features, for two main reasons [2]: first, the
typical Internet workload shows unstable patterns, heavy-tailed
distributions and flash crowds; second, there are unclear (from a modeling
point of view) relationships between the external arrivals and the
internal software/hardware components that are characterized by
object-oriented distributed software, concurrent accesses to application
and database servers, authentication, virtual servers, and
unpredictable mutual dependencies.
Moving filter models operating at runtime
Various filters that can work at runtime are based on moving average
models (e.g., EWMA [14,21]). Unfortunately, they are inadequate for
short-term predictions in an Internet-based context because
they tend to introduce an excessive representation delay when the size
of the raw data set is large, while they do not eliminate all noise when
the data set is small. The choice of the best data set size can
be addressed when the data values are characterized by some stability,
but this is not the case for the considered scenarios. Hence, we represent
the system resource behavior at runtime through a new moving filter
that is based on the Discrete Fourier Transform (DFT) [24]. The
Discrete Fourier Transform applied at step j to the n values of the
raw data set D_j = {d_{j−(n−1)}, …, d_j} is a sequence of complex numbers
DFT_j(w/n):
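As a rough illustration, the following Python sketch applies a DFT-based low-pass filter to a moving window of the last n samples. It assumes the standard textbook DFT definition; the excerpt does not show the paper's own formula or which frequencies its filter retains, so the `cutoff` parameter and all function names here are hypothetical.

```python
import cmath

def dft(values):
    """Standard DFT: X[w] = sum_k values[k] * exp(-2*pi*i*k*w/n)."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * k * w / n) for k in range(n))
        for w in range(n)
    ]

def lowpass_last_value(window, cutoff):
    """Filtered value for the newest sample of `window`.

    Keeps the frequency indices w with min(w, n - w) <= cutoff (the DC term
    and the slowest oscillations), drops the rest, and evaluates the inverse
    DFT at the last position k = n - 1. The cutoff rule is an assumption.
    """
    n = len(window)
    spectrum = dft(window)
    k = n - 1
    total = 0j
    for w, coeff in enumerate(spectrum):
        if min(w, n - w) <= cutoff:
            total += coeff * cmath.exp(2j * cmath.pi * k * w / n)
    # The kept set is symmetric in w and n - w, so the result is real
    # for real input up to rounding error.
    return (total / n).real

def moving_dft_filter(samples, n, cutoff):
    """Apply the filter at each step j to the window D_j of the last n samples."""
    return [
        lowpass_last_value(samples[j - n + 1 : j + 1], cutoff)
        for j in range(n - 1, len(samples))
    ]

samples = [1.0, 3.0, 2.0, 6.0, 4.0, 5.0]
print(moving_dft_filter(samples, 4, 0))  # only the DC term: the window means
print(moving_dft_filter(samples, 4, 2))  # all frequencies kept: the raw values
```

With `cutoff = 0` the filter reduces to a plain moving average, and with every frequency retained it returns the raw samples unchanged; intermediate cutoffs sit between these two extremes. A direct O(n^2) DFT is used for clarity, and an FFT would be the usual choice for larger windows.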
Adaptive prediction model
The data sets deriving from the server resources, although filtered,
are characterized by non-stationary effects. In this context, the
considered prediction models satisfy the computational constraints
but they may be affected by some inaccuracy. In the specific context of
the internal resources of Internet-based systems and runtime
prediction in a short-term horizon, we think it is necessary to propose
a novel class of prediction algorithms that is able to limit the
drawbacks of the existing models. An ideal prediction model should
combine the simplicity of the LR model, the ability of the AR and
ARIMA models to reproduce the stochastic pattern of the data set, and
the ability of EWMA to smooth some noise components.
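The adaptive algorithm itself is not given in this excerpt. As a baseline sketch of two of the ingredients named above, the following Python code smooths a series with EWMA and then extrapolates a least-squares line fitted to the most recent values, reading LR as linear regression. The function names, the smoothing factor and the sample data are assumptions made for illustration; this is not the paper's prediction model.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average with smoothing factor alpha in (0, 1]."""
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed

def lr_predict(window, k):
    """Fit a least-squares line to `window` (x = 0..n-1, n >= 2)
    and extrapolate it k steps past the last point."""
    n = len(window)
    x_mean = (n - 1) / 2
    y_mean = sum(window) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(window))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + k)

# Hypothetical noisy load samples with an upward trend.
load = [0.20, 0.26, 0.22, 0.31, 0.29, 0.36, 0.33, 0.41]
smoothed = ewma(load, alpha=0.5)

# Predict the load 2 steps ahead from the last 4 smoothed values.
print(lr_predict(smoothed[-4:], k=2))
```

Fitting the line to smoothed rather than raw values reduces the influence of single noisy samples on the slope, at the price of the representation delay that EWMA introduces.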