Spatio-Temporal Network Anomaly Detection by Assessing Deviations of Empirical Measures

Abstract:

We introduce an Internet traffic anomaly detection mechanism based on large deviations results for empirical measures. Using past traffic traces, we characterize network traffic during various time-of-day intervals, assuming that it is anomaly-free. Throughout, we compare the two approaches used for this characterization (a model-free one and a model-based one), presenting their advantages and disadvantages in identifying and classifying temporal network anomalies. We also demonstrate how our framework can be used to monitor traffic from multiple network elements in order to identify both spatial and temporal anomalies. We validate our techniques by analyzing real traffic traces with time-stamped anomalies.
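The time-of-day characterization mentioned in the abstract could be sketched as follows. This is only an illustration under assumptions: the 6-hour interval length, the (hour, symbol) trace format, and the function names are hypothetical and are not taken from the original work.

```python
from collections import Counter, defaultdict

def interval_of(hour, interval_hours=6):
    """Map an hour of day (0-23) to a time-of-day interval index."""
    return hour // interval_hours

def reference_laws(trace, alphabet, interval_hours=6):
    """Build one reference probability law per time-of-day interval.

    trace: iterable of (hour, symbol) pairs from a period assumed anomaly-free.
    Returns {interval index: {symbol: probability}}.
    """
    counts = defaultdict(Counter)
    for hour, symbol in trace:
        counts[interval_of(hour, interval_hours)][symbol] += 1
    laws = {}
    for interval, c in counts.items():
        total = sum(c.values())
        laws[interval] = {a: c.get(a, 0) / total for a in alphabet}
    return laws

# Hypothetical toy trace: quantized traffic levels observed at given hours.
trace = [(1, "low"), (2, "low"), (3, "high"), (13, "high"), (14, "high")]
laws = reference_laws(trace, alphabet=["low", "high"])
print(laws)
```

Keeping a separate law per interval means that traffic observed at night is compared only with past night-time traffic, so the normal daily cycle is not itself flagged as a deviation.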



Algorithm / Technique used:

Anomaly Detection Mechanism.

Algorithm Description:

This work focuses on anomaly detection, and in particular on statistical anomaly detection, where statistical methods are used to assess deviations from normal operation. Our main contribution is the introduction of a new statistical traffic anomaly detection framework that relies on identifying deviations of the empirical measure of some underlying stochastic process characterizing system behavior.


Existing System:

Although significant progress has been made in network monitoring instrumentation, automated on-line traffic anomaly detection is still a missing component of modern network security and traffic engineering mechanisms. Network anomaly detection approaches can be broadly grouped into two classes: signature-based anomaly detection, where known patterns of past anomalies are used to identify ongoing anomalies, and anomaly detection that identifies patterns which substantially deviate from normal patterns of operation. Earlier work has shown that systems based on pattern matching had detection rates below 70%. Furthermore, such systems need constant (and expensive) updating to keep up with new attack signatures. As a result, more attention has to be drawn to the second class of methods, since they can identify even novel (unseen) types of anomalies.






Proposed System:

We present two different approaches to characterize traffic: (i) a model-free approach based on the method of types and Sanov's theorem, and (ii) a model-based approach modeling traffic using a Markov modulated process. Using these characterizations as a reference, we continuously monitor traffic and employ large deviations and decision theory results to compare the empirical measure of the monitored traffic with the corresponding reference characterization, thus identifying traffic anomalies in real time. Our experimental results show that, with our methodology, even short-lived anomalies are identified within a small number of observations.
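A minimal sketch of the model-free idea is given below. It assumes that traffic has already been quantized into a small alphabet of symbols and that an anomaly is declared when the relative entropy between the empirical measure of a window and the reference law reaches a threshold, which is the quantity that governs the decay rate in Sanov's theorem. The alphabet, the threshold value, the eps floor, and all function names are assumptions made for illustration, not the original implementation.

```python
import math
from collections import Counter

def empirical_measure(symbols, alphabet):
    """Fraction of observations that fall on each symbol of the alphabet."""
    counts = Counter(symbols)
    n = len(symbols)
    return {a: counts.get(a, 0) / n for a in alphabet}

def relative_entropy(p, q, eps=1e-9):
    """D(p || q) in nats; q is floored at eps (an assumption) to avoid log(0)."""
    total = 0.0
    for a, pa in p.items():
        if pa > 0.0:
            total += pa * math.log(pa / max(q[a], eps))
    return total

def is_anomalous(window, reference, alphabet, threshold):
    """Flag the window when its empirical measure is too far from the reference."""
    observed = empirical_measure(window, alphabet)
    return relative_entropy(observed, reference) >= threshold

# Hypothetical quantized traffic levels and toy traces.
alphabet = ["low", "mid", "high"]
reference = empirical_measure(["low"] * 60 + ["mid"] * 30 + ["high"] * 10, alphabet)
normal_window = ["low"] * 6 + ["mid"] * 3 + ["high"] * 1
attack_window = ["high"] * 8 + ["mid"] * 2

print(is_anomalous(normal_window, reference, alphabet, threshold=0.5))
print(is_anomalous(attack_window, reference, alphabet, threshold=0.5))
```

The threshold trades off false alarms against missed detections; the value 0.5 here is arbitrary and would have to be tuned on real traces.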

Modules:

• Client Model
• Server Model
• Network Model
• Empirical Measures for Anomaly Detection
• Congestion Traffic Minimization

Module Description:

• Client Model

A client is an application or system that accesses a remote service on another computer system, known as a server, by way of a network. The term was first applied to devices that were not capable of running their own stand-alone programs, but could interact with remote computers via a network. These dumb terminals were clients of the time-sharing mainframe computer.

• Server Model

In computing, a server is any combination of hardware or software designed to provide services to clients. When used alone, the term typically refers to a computer which may be running a server operating system, but is commonly used to refer to any software or dedicated hardware capable of providing services.

• Network Model

Generally, the channel quality is time-varying. For the user-AP association decision, a user performs multiple samplings of the channel quality, and only the signal attenuation that results from long-term channel condition changes is utilized. Our load model can accommodate various additive load definitions, such as the number of users associated with an AP. It can also deal with multiplicative user load contributions.


• Empirical Measures for Anomaly Detection

As was mentioned before, the size of the alphabet and the number of states of the MMP for the Abilene data set are small when only temporal information is considered. Thus, it is easy to monitor subnets of PoPs (of low dimensionality) by specifying the group of PoPs of interest and the role of each PoP (origin or destination). We present results for two case studies with different spatial characteristics. We apply our framework to: (a) flows that originate from (or end at) PoPs that are 1-hop neighbors and (b) flows that originate from (or end at) PoPs that are many hops away from each other. In the first case study, the flows originate (end) at the Sunnyvale (SNVA) PoP and are destined to (originate from) the PoPs in its vicinity. We illustrate instances of the identification of anomalies applying the model-free and the model-based methods, respectively. The values of the parameters for the two methods are obtained from the temporal anomaly detection examples. Table II reports the detection and false alarm rates we achieved. It is worth noticing that the detection rate reached 100% and the false alarm rate was very low (lower than the values obtained when only temporal anomalies were studied). This is due to two main reasons: (a) instantaneous high values in the time-series of observations that do not necessarily indicate attacks are smoothed due to time averaging, and (b) attacks may have temporal and/or spatial correlation.
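For readers who want to compute detection and false alarm rates on their own traces with time-stamped anomalies, a minimal sketch follows. It assumes a per-window definition of both rates (the original text does not spell out its exact definition), and the alarm and label sequences are made-up toy data, not results from the paper.

```python
def detection_and_false_alarm_rates(alarms, labels):
    """Compute (detection rate, false alarm rate) over observation windows.

    alarms: list of booleans, True where the detector raised an alarm.
    labels: list of booleans, True where the window is labeled anomalous.
    """
    if len(alarms) != len(labels):
        raise ValueError("alarms and labels must have the same length")
    true_pos = sum(1 for a, y in zip(alarms, labels) if a and y)
    false_pos = sum(1 for a, y in zip(alarms, labels) if a and not y)
    n_anomalous = sum(1 for y in labels if y)
    n_normal = len(labels) - n_anomalous
    detection = true_pos / n_anomalous if n_anomalous else float("nan")
    false_alarm = false_pos / n_normal if n_normal else float("nan")
    return detection, false_alarm

# Hypothetical toy data: eight windows, two of them labeled anomalous.
labels = [False, False, True, True, False, False, False, False]
alarms = [False, True, True, True, False, False, False, False]
detection, false_alarm = detection_and_false_alarm_rates(alarms, labels)
print(detection, false_alarm)
```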

• Congestion Traffic Minimization

We provided two different approaches, a model-free and a model-based one. The model-free method works on a longer time-scale, processing traces of traffic aggregates over a small time interval. Using an anomaly-free trace, it derives an associated probability law. Then it processes current traffic and quantifies whether it conforms to this probability law. The model-based method constructs a Markov modulated model of anomaly-free traffic measurements and relies on large deviations asymptotics and decision theory results to compare this model to ongoing traffic activity. We presented a rigorous framework to identify traffic anomalies, providing asymptotic thresholds for anomaly detection. In our experimental results, the model-free approach showed somewhat better performance than the model-based one. This may be because the former gains from the aggregation over a time-bucket, while the latter requires the estimation of more parameters and hence may introduce a larger modeling error. For future work, it would be interesting to analyze the robustness of the anomaly detection mechanism to various model parameters.
Since we monitor the detailed distributional characteristics of traffic and do not rely on the mean or the first few moments we are confident that our approach can be successful against new types of (emerging) temporal and spatial anomalies.
Our method is of low implementation complexity (only an additional counter is required), and is based on first principles, so it would be interesting to investigate how it can be embedded on routers or other network devices.
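The model-based comparison summarized above can be illustrated with a simplified sketch. As a loud assumption, it replaces the Markov modulated process with a plain first-order Markov chain over quantized traffic levels, estimates the reference transition probabilities with additive smoothing, and scores a window by the relative entropy of its empirical pair measure with respect to the reference chain. All names, the smoothing constant, and the toy data are hypothetical.

```python
import math
from collections import Counter

def transition_matrix(states, alphabet, smoothing=1.0):
    """Estimate P(j | i) from consecutive pairs, with additive smoothing."""
    pair_counts = Counter(zip(states, states[1:]))
    matrix = {}
    for i in alphabet:
        row_total = sum(pair_counts.get((i, j), 0) for j in alphabet)
        row_total += smoothing * len(alphabet)
        matrix[i] = {
            j: (pair_counts.get((i, j), 0) + smoothing) / row_total
            for j in alphabet
        }
    return matrix

def markov_divergence(window, reference):
    """Relative entropy of the window's empirical pair measure
    with respect to the reference transition probabilities (in nats)."""
    pair_counts = Counter(zip(window, window[1:]))
    from_counts = Counter(window[:-1])
    n_pairs = len(window) - 1
    total = 0.0
    for (i, j), c in pair_counts.items():
        q_pair = c / n_pairs          # empirical probability of the pair (i, j)
        q_cond = c / from_counts[i]   # empirical probability of j given i
        total += q_pair * math.log(q_cond / reference[i][j])
    return total

# Hypothetical quantized levels: 0 = normal load, 1 = burst.
alphabet = [0, 1]
training = ([0] * 9 + [1]) * 20            # assumed anomaly-free trace
reference = transition_matrix(training, alphabet)

normal_window = ([0] * 9 + [1]) * 2
attack_window = [1] * 20                   # sustained burst

print(markov_divergence(normal_window, reference))
print(markov_divergence(attack_window, reference))
```

Compared with the model-free sketch, this variant has one parameter per pair of states rather than one per state, which matches the remark above that the model-based method requires estimating more parameters.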



Hardware Requirements:


• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 MB.
• Monitor : 15" VGA Colour.
• Mouse : Logitech.
• RAM : 256 MB.


Software Requirements:

• Operating System : Windows XP Professional.
• Front End : Java.

Using past traffic traces, we characterize network traffic over several time-of-day intervals, assuming that it is free of anomalies. We present two different approaches to characterize traffic: (i) a model-free approach based on the method of types and Sanov's theorem, and (ii) a model-based approach that models traffic using a Markov modulated process. Using these characterizations as a reference, we continuously monitor traffic and use large deviations and decision theory results to compare the empirical measure of the monitored traffic with the corresponding reference characterization, thus identifying traffic anomalies in real time. Our experimental results show that, with our methodology, even short-lived anomalies are identified within a small number of observations.

The need for a space-time network naturally arises when it comes to problems such as voice recognition and time series prediction where the input signal has an explicit temporal aspect. We have demonstrated that certain tasks that do not have an explicit temporal aspect can also be processed advantageously with neural networks capable of handling temporal information.