29-11-2012, 04:21 PM
Machine Learning Approaches to Network Anomaly Detection
Abstract
Networks of various kinds often experience anomalous
behaviour. Examples include attacks or large data transfers
in IP networks, presence of intruders in distributed video
surveillance systems, and an automobile accident or an untimely
congestion in a road network. Machine learning techniques
enable the development of anomaly detection algorithms that
are non-parametric, adaptive to changes in the characteristics
of normal behaviour in the relevant network, and portable
across applications. In this paper we use two different datasets,
pictures of a highway in Quebec taken by a network of webcams
and IP traffic statistics from the Abilene network, as examples
in demonstrating the applicability of two machine learning
algorithms to network anomaly detection. We investigate the
use of the block-based One-Class Neighbour Machine and the
recursive Kernel-based Online Anomaly Detection algorithms.
INTRODUCTION
A network anomaly is a sudden and short-lived deviation
from the normal operation of the network. Some anomalies are
deliberately caused by intruders with malicious intent, such
as a denial-of-service attack in an IP network, while others
are purely accidental, such as an overpass collapsing in a
busy road network. Quick detection is needed to initiate a
timely response, such as deploying an ambulance after a road
accident, or raising an alarm if a surveillance network detects
an intruder.
Network monitoring devices collect data at high rates.
Designing an effective anomaly detection system consequently
involves extracting relevant information from a voluminous
amount of noisy, high-dimensional data. It is also important
to design distributed algorithms as networks operate under
bandwidth and power constraints and communication costs
must be minimised.
Different anomalies exhibit themselves in network statistics
in different manners, so developing general models of normal
network behaviour and of anomalies is difficult. Model-based
algorithms are also not portable across applications, and even
subtle changes in the nature of network traffic or the monitored
physical phenomena can render the model inappropriate. Non-parametric
learning algorithms based on machine learning
principles are therefore desirable as they can learn the nature
of normal measurements and autonomously adapt to variations
in the structure of “normality”.
DATA
We use two different datasets to advocate the applicability
of machine learning algorithms to network anomaly detection.
1) Transports Quebec dataset: Transports Quebec maintains
a set of webcams over its major roads [18]. These
cameras record still images every five minutes. We collected
images recorded by 6 cameras over a period of four days (Sep.
30 to Oct. 03, 2006) on Quebec’s Autoroute 20. Each 5-minute
interval constitutes a timestep.
Anomaly detection in a sequence of images relies mainly on
the extraction of appropriate information from the sequence.
There are two fundamental reasons for this. First, the high
dimensionality inherent in image processing leads to a dramatic
increase in implementation costs. Second, large variation in
operating conditions such as brightness and contrast (which
are subject to the time of day and weather conditions) and
colour content in the images (which is subject to season), can
cause abrupt and undesirable changes in the raw data.
We decided to use the discrete wavelet transform (DWT)
to process the images. The DWT is known for its ability to
extract spatially localised frequency information. We perform
the two-dimensional DWT on every image and average the
energy of transformation coefficients within each subband to
achieve approximate shift invariance of the feature extractor.
We expect that the appearance of a novel image in the
sequence will manifest itself as a sudden change in how power
is distributed across the vector of subband intensities.
At each timestep, we construct a wavelet feature vector from
each image obtained by each camera node.
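The feature extraction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a greyscale image stored as a list of rows with dimensions divisible by 2 at each level, uses the Haar wavelet (the excerpt does not say which wavelet was used), and the function names haar_dwt2, mean_energy and wavelet_features are hypothetical.

```python
def haar_dwt2(image):
    """One level of the 2-D orthonormal Haar DWT on a list of rows.

    Returns four subbands: the approximation, the detail between rows,
    the detail between columns, and the diagonal detail.
    """
    approx, d_rows, d_cols, d_diag = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, r_row, c_row, g_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2.0)  # local average
            r_row.append((a + b - d - e) / 2.0)  # top row minus bottom row
            c_row.append((a - b + d - e) / 2.0)  # left column minus right column
            g_row.append((a - b - d + e) / 2.0)  # diagonal difference
        approx.append(a_row)
        d_rows.append(r_row)
        d_cols.append(c_row)
        d_diag.append(g_row)
    return approx, d_rows, d_cols, d_diag


def mean_energy(band):
    """Average squared coefficient within one subband."""
    squares = [v * v for row in band for v in row]
    return sum(squares) / len(squares)


def wavelet_features(image, levels=2):
    """Feature vector of mean subband energies (assumed layout:
    three detail energies per level, then the final approximation energy)."""
    features = []
    approx = image
    for _ in range(levels):
        approx, d_rows, d_cols, d_diag = haar_dwt2(approx)
        features.extend(mean_energy(b) for b in (d_rows, d_cols, d_diag))
    features.append(mean_energy(approx))
    return features


# Example: a 4x4 image of vertical stripes, decomposed to one level.
stripes = [[1.0, 0.0, 1.0, 0.0] for _ in range(4)]
print(wavelet_features(stripes, levels=1))
```

Averaging the energy over each subband discards the position of individual coefficients, which is what gives the feature vector its approximate insensitivity to small shifts in the image.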
Abilene Network Data Analysis
In this subsection we present the results of applying OCNM
and KOAD to the Abilene dataset. Here we also want to
detect those anomalies that cause sudden changes in the
overall distribution of traffic over the network, as opposed
to affecting a single link, during a particular timestep. Thus
in this application we implement the centralised architecture
proposed in Section IV-C. For discussions on the wide range of
anomalies seen in IP networks, refer to the works of Lakhina
et al. [4]–[6]. Here we also compare our results with those
obtained by Lakhina et al. using the PCA subspace method of
anomaly detection.
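The excerpt does not give the internals of OCNM, so the following is only a generic stand-in for a block-based, neighbour-distance detector, not the paper's algorithm. It assumes that each timestep in a block is summarised by a feature vector, that sparsity is measured by the distance to the k-th nearest neighbour, and that a fixed fraction of the sparsest points is flagged. The names knn_distance and block_outliers and the parameters k and outlier_fraction are hypothetical.

```python
import math


def knn_distance(points, index, k):
    """Distance from points[index] to its k-th nearest neighbour in the block."""
    target = points[index]
    dists = sorted(
        math.dist(target, p) for j, p in enumerate(points) if j != index
    )
    return dists[k - 1]


def block_outliers(points, k=2, outlier_fraction=0.1):
    """Flag the given fraction of points with the largest k-NN distance.

    Assumption: a large k-th neighbour distance means the point lies in a
    sparse region of the block and is therefore a candidate anomaly.
    """
    scores = [knn_distance(points, i, k) for i in range(len(points))]
    n_flag = round(outlier_fraction * len(points))
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_flag])


# Example: nine tightly clustered timesteps and one distant one.
block = [(0.1 * x, 0.1 * y) for x in range(3) for y in range(3)] + [(5.0, 5.0)]
print(block_outliers(block, k=2, outlier_fraction=0.1))
```

Because every point is compared with every other point in the block, the cost grows quadratically with the block length, which is one reason a block-based method is usually run over a bounded window of timesteps.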
CONCLUSION
Our preliminary results from applying machine
learning techniques to network anomaly detection indicate
their potential and highlight the areas where improvement is
required. The non-stationary nature of the network measurements,
be they network traffic metrics or recordings of physical
phenomena, makes it imperative that the algorithms be able to
adapt over time. To make the algorithms portable to different
applications and robust to diverse operating environments,
all parameters must be learned and autonomously set from
arriving data. The algorithms must be capable of running
in real-time despite being presented with large volumes of
high-dimensional, noisy, distributed data. This means that the
methods must perform sequential (recursive) calculations with
the complexity at each timestep being independent of time.
The computations must be distributed amongst the nodes in
the network, and communication, which consumes network
resources and introduces undesirable latency in detection, must
be minimised.
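The requirement of sequential calculations with per-timestep cost independent of time can be illustrated with a small sketch. This is not KOAD or OCNM: it is a simple exponentially weighted mean and variance tracker on a scalar measurement, chosen only because its state and update cost stay constant however long it runs. The class name RecursiveDetector and the parameters forgetting, threshold and warmup are hypothetical.

```python
import math


class RecursiveDetector:
    """Constant-memory detector based on an exponentially weighted mean
    and variance. Each update costs O(1) regardless of elapsed time."""

    def __init__(self, forgetting=0.05, threshold=3.0, warmup=10):
        self.forgetting = forgetting  # weight given to the newest sample
        self.threshold = threshold    # flag deviations beyond this many std devs
        self.warmup = warmup          # samples to observe before flagging
        self.mean = None
        self.var = 0.0
        self.count = 0

    def update(self, x):
        """Return True if x looks anomalous; otherwise fold x into the model."""
        self.count += 1
        if self.mean is None:
            self.mean = float(x)
            return False
        deviation = x - self.mean
        std = math.sqrt(self.var)
        flagged = (
            self.count > self.warmup
            and std > 0.0
            and abs(deviation) > self.threshold * std
        )
        if not flagged:
            # Recursive updates: old data is gradually forgotten, so the
            # model can follow slow drifts in normal behaviour.
            self.mean += self.forgetting * deviation
            self.var = (1.0 - self.forgetting) * (
                self.var + self.forgetting * deviation * deviation
            )
        return flagged


# Example: a stable signal followed by one large spike.
detector = RecursiveDetector()
readings = [10.0, 11.0] * 25 + [100.0]
flags = [detector.update(x) for x in readings]
print(flags[-1])
```

Flagged samples are left out of the model here so that a single spike does not inflate the variance. The trade-off is that a genuine, lasting change in level would keep being flagged, so a practical system would need some rule for accepting a new normal.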