22-11-2012, 04:37 PM
Improving network security using genetic algorithm approach
Abstract
With the expansion of the Internet and its importance, the types and number of attacks have also grown, making intrusion
detection an increasingly important technique. In this work we have realized a misuse detection system based on a
genetic algorithm (GA) approach. For evolving and testing new rules for intrusion detection, the KDD99Cup training
and testing datasets were used. To be able to process network data in real time, we have deployed principal component
analysis (PCA) to extract the most important features of the data. In that way we were able to keep attack detection
rates high while speeding up the processing of the data.
2007 Published by Elsevier Ltd.
Introduction
The Internet and local area networks have been expanding at an amazing rate in recent years, not just in terms of
size, but also in terms of changes in the services offered and the mobility of users, which make them more vulnerable
to various kinds of complex attacks. While we are benefiting from the convenience that new technology
has brought us, computer systems are exposed to an increasing number and complexity of security threats. Of
particular importance, thus, is the ability to rapidly apply new network security policies in order to detect
and react as quickly as possible to attacks as they occur. Different techniques have been developed and
deployed to protect computer systems against network attacks (anti-virus software, firewalls, message encryption,
secured network protocols, password protection). Despite all these efforts, it is impossible to have a completely
secured system. Therefore, intrusion detection is becoming an increasingly important technique that
monitors network traffic and identifies network intrusions such as anomalous network behaviors, unauthorized
network access, or malicious attacks on computer systems. Most of the existing solutions are developed
for well-defined networks and systems [1–3]. Nevertheless, they are not adapted to dynamic environments, or
to the increasing complexity of user behaviors.
Survey on the machine learning techniques used for intrusion detection
The large amount of network data and the large number of network attacks have imposed the use of intelligent
machine learning techniques in order to discover attacks and the way they function. The past few years have
witnessed a growing recognition of intelligent techniques for the construction of efficient and reliable intrusion
detection systems. Most of the well-known pattern recognition techniques, both supervised and unsupervised,
and their combinations resulting in meta-classifiers have been used for intrusion detection. Some of the techniques
used in the state of the art [15,17–24] and their results on the KDD99Cup dataset are presented
in Table 1.
The genetic algorithm is one of the techniques that have recently been recognized as having potential in the
intrusion detection field. Some of its applications are presented in [7,8,15,18]. The novelty of our approach consists
in the fact that we have used only three features out of 41 to describe a network connection while
maintaining high detection rates, thus giving the system the ability to perform the intrusion detection process
rapidly, in terms of both training and testing the rules for detection of intrusions, and the possibility of
application to high-speed networks. Our approach exhibits detection and false-positive rates similar to those of the
approaches presented in [7,8], but at the same time exhibits a much shorter process of training and thus of refreshing
the rule set. Frequent refreshing of the rule set is a very important characteristic considering the rate at which
new attacks emerge [25].
Genetic algorithm overview
Genetic algorithms (GA) are search algorithms based on the principles of natural selection and genetics.
The foundations of the genetic algorithm approach were given by Holland [13], and it has been deployed to solve a wide
range of problems.
A GA evolves a population of initial individuals into a population of high-quality individuals, where each individual
represents a solution to the problem to be solved. Each individual is called a chromosome, and is composed
of a predetermined number of genes [14]. The quality of each rule is measured by a fitness function as
the quantitative representation of each rule’s adaptation to a certain environment. The procedure starts from
an initial population of randomly generated individuals. The population is then evolved for a number of generations
while gradually improving the quality of the individuals, in the sense of increasing the fitness value as
the measure of quality. During each generation, three basic genetic operators are sequentially applied to each
individual with certain probabilities, i.e. selection, crossover and mutation.
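The loop described above, sketched as an added example that is not code from the paper: it evolves one detection rule over three features using selection, crossover and mutation. The rule encoding (a low/high bound per feature), the fitness function, the GA parameters and the sample connections are all assumptions made for illustration.

```python
import random

# Constructed sample data (not KDD99Cup): each connection is three feature
# values scaled to [0, 1], standing in for the three features kept after PCA.
_data_rng = random.Random(0)
ATTACKS = [[_data_rng.uniform(0.6, 1.0) for _ in range(3)] for _ in range(40)]
NORMALS = [[_data_rng.uniform(0.0, 0.5) for _ in range(3)] for _ in range(40)]

def matches(rule, conn):
    """A chromosome has six genes: a (low, high) bound for each feature."""
    return all(rule[2 * i] <= conn[i] <= rule[2 * i + 1] for i in range(3))

def fitness(rule):
    """Assumed fitness: detection rate minus false-positive rate."""
    tp = sum(matches(rule, c) for c in ATTACKS) / len(ATTACKS)
    fp = sum(matches(rule, c) for c in NORMALS) / len(NORMALS)
    return tp - fp

def evolve(seed, pop_size=30, generations=40, p_cross=0.8, p_mut=0.1):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(6)] for _ in range(pop_size)]
    history = []                             # best fitness seen per generation
    for _ in range(generations):
        best = max(pop, key=fitness)
        history.append(fitness(best))
        nxt = [best[:]]                      # elitism: best rule always survives
        while len(nxt) < pop_size:
            # Selection: binary tournament, fitter of two random individuals.
            a = max(rng.sample(pop, 2), key=fitness)
            b = max(rng.sample(pop, 2), key=fitness)
            child = a[:]
            if rng.random() < p_cross:       # one-point crossover
                cut = rng.randrange(1, 6)
                child = a[:cut] + b[cut:]
            for g in range(6):               # mutation: small clamped nudge
                if rng.random() < p_mut:
                    child[g] = min(1.0, max(0.0, child[g] + rng.gauss(0, 0.1)))
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

if __name__ == "__main__":
    rule, hist = evolve(seed=1)
    print("best rule bounds:", [round(g, 2) for g in rule])
    print("fitness first/last generation:", hist[0], hist[-1])
```

Because the best rule is copied unchanged into each new generation, the recorded best fitness never decreases. A rule whose low bound exceeds its high bound never fires and scores 0 under this fitness.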