23-04-2014, 02:31 PM
Efficient Processing of Uncertain Events in Rule-Based Systems
Abstract
There is a growing need for systems that react automatically to events. While some events are generated externally and
deliver data across distributed systems, others need to be derived by the system itself based on available information. Event derivation
is hampered by uncertainty attributed to causes such as unreliable data sources or the inability to determine with certainty whether an
event has actually occurred, given available information. Two main challenges exist when designing a solution for event derivation
under uncertainty. First, event derivation should scale under heavy loads of incoming events. Second, the associated probabilities
must be correctly captured and represented. We present a solution to both problems by introducing a novel generic and formal
mechanism and framework for managing event derivation under uncertainty. We also provide empirical evidence demonstrating the
scalability and accuracy of our approach.
INTRODUCTION
In recent years, there has been a growing need for event-
driven (or active) systems, i.e., systems that react
automatically to events. The earliest event-driven systems
in the database realm [42] impacted both industry (triggers)
and academia (view materialization). New applications in
areas such as Business Process Management (BPM) [5];
sensor networks [11]; security applications (e.g., biohazards
and computer security); engineering applications (e.g.,
forecasting networked resource availability); and scientific
applications (e.g., utilization of grid resources) all require
sophisticated mechanisms to manage and react to events.
Some events are generated externally and deliver data
across distributed systems, while other events and their
related data need to be derived by the system itself, based on
other events and some derivation mechanism. In many cases,
such derivation is carried out based on a set of rules (e.g., rules
in active databases [16] and special purpose event derivation
rule languages such as the Situation Manager Rule Language
[2]). Carrying out such event derivation is hampered by the
gap between the actual occurrences of events, to which the
system must respond, and the ability of event-driven systems
to accurately generate events.
ILLUSTRATIVE SCENARIO
In this section, we present a concrete scenario, used
throughout to illustrate our framework, from the domain of
syndromic surveillance systems, which are designed to
detect epidemic outbreaks or bioterrorist attacks. Such
systems use data from external data sources (e.g.,
transactional databases) and expert knowledge [35] to
identify outbreaks and quantify outbreak attributes, such as
the severity of an attack. Responding quickly to such
outbreaks or attacks requires recognizing that they have
occurred, a difficult task because no direct indications of
them exist. Therefore, events must be derived from available
data sources, which often provide insufficient information
to determine hazard occurrence with certainty.
RELATED WORK
Complex event processing is supported by systems from
various domains. These include ODE [22], Snoop [10], and
others [31] for active databases and the Situation Manager
Rule Language [2], a general purpose event language. Event
management was also introduced in the area of business
process management [15], [3] and service-based systems
[14], [13], [24]. An excellent introductory book on complex
event processing is also available [30]. A recent book
introduces principles and applications of distributed event-
based systems [27]. Architectures for complex event
processing have been proposed, both generic ones (e.g., [6],
[40], [21]) and ones that extend middleware (e.g., [33], [7]).
EMPIRICAL STUDY
We conducted an extensive empirical study to examine
both the accuracy and the scalability of probabilistic event
derivation using the sampling algorithm. Our results show
that the sampling algorithm is both accurate and scalable.
With regard to accuracy, our experiments yielded better
accuracy than theoretically expected. With regard to
scalability, the algorithm scales well as the number of
possible worlds increases. In addition, our performance
results are of the same order of magnitude as those of a
high-performance, deterministic event system analyzed in
[2]. Finally, we observed that both the percentage of
relevant events and the desired precision are significant
performance factors. While the desired precision is
determined by the specific application and the accuracy
required, the impact of the percentage of relevant events
once again emphasizes the importance of the notion of
selectability.
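To make the link between desired precision and cost concrete, the
following is a minimal sketch of how a sample count could be chosen
for a target precision. It assumes independent samples and uses the
standard Hoeffding bound; the excerpt does not show the bound used in
the paper, so the formula, the function name samples_needed, and the
parameters epsilon (precision) and delta (allowed failure
probability) are illustrative assumptions rather than the authors'
method.

```python
import math


def samples_needed(epsilon: float, delta: float) -> int:
    """Samples sufficient for |estimate - true probability| <= epsilon
    with probability at least 1 - delta, by Hoeffding's inequality.

    Assumption: samples are independent and each yields a 0/1 outcome.
    This is a generic bound, not necessarily the one used in the paper.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    # Hoeffding: n >= ln(2 / delta) / (2 * epsilon^2)
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


if __name__ == "__main__":
    # Tightening the precision increases the required number of samples.
    for eps in (0.05, 0.025, 0.01):
        print(eps, samples_needed(eps, delta=0.05))
```

Under this bound the sample count grows with 1/epsilon^2, so halving
epsilon roughly quadruples the work, which is consistent with the
desired precision being a significant performance factor.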
CONCLUSIONS AND FUTURE WORK
In this work, we presented an efficient mechanism for event
derivation under uncertainty. A model for representing
derived events was presented together with a Monte Carlo
sampling algorithm that approximates the derived event
probabilities. We experimented with the sampling algorithm,
showing its performance to be comparable to that of a
deterministic event composition system. It is scalable under
an increasing number of possible worlds (and uncertain
rules), while a Bayesian network algorithm for the same
purpose does not scale well, as it is exponential in the
number of event states, as described in Section 6.1. Finally,
the sampling algorithm provides an accurate estimation of
probabilities. Our contribution can be summarized as
follows: the introduction of a novel, generic, and formal
mechanism and framework for managing and deriving events
under uncertainty.
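The idea of approximating a derived event's probability by Monte
Carlo sampling over possible worlds can be sketched as follows. This
is a simplified illustration, not the paper's algorithm: it assumes
that source events are independent, that a single rule fires with a
fixed certainty when its condition holds, and all names
(estimate_derived_probability, fever_drug_sales_spike,
er_visits_spike) and probabilities are hypothetical.

```python
import random


def estimate_derived_probability(event_probs, rule, rule_prob, n_samples, seed=0):
    """Estimate P(derived event) by sampling possible worlds.

    event_probs: dict mapping source event name -> occurrence probability
                 (assumed independent; an illustrative simplification).
    rule:        function taking a world (dict name -> bool) and returning
                 True if the rule's condition holds in that world.
    rule_prob:   probability that the derived event occurred given that
                 the condition holds (the rule's own uncertainty).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Sample one possible world: each source event occurred or not.
        world = {name: rng.random() < p for name, p in event_probs.items()}
        # The derived event occurs only if the condition holds and the
        # uncertain rule actually yields the event in this world.
        if rule(world) and rng.random() < rule_prob:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    # Hypothetical syndromic-surveillance rule: derive an outbreak event
    # when both indicator events occurred.
    probs = {"fever_drug_sales_spike": 0.8, "er_visits_spike": 0.5}
    outbreak_rule = lambda w: w["fever_drug_sales_spike"] and w["er_visits_spike"]
    print(estimate_derived_probability(probs, outbreak_rule, 0.9, 20000))
```

Under these independence assumptions the exact value is
0.8 * 0.5 * 0.9 = 0.36, so the estimate can be checked directly. The
cost of the sketch grows with the number of samples rather than with
the number of possible worlds, which is the property that makes a
sampling approach attractive when enumeration is infeasible.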