19-08-2014, 11:50 AM
ANALYSIS AND DETECTION OF METAMORPHIC
COMPUTER VIRUSES
Abstract
Computer virus writers commonly use metamorphic techniques to produce viruses that
change their internal structure on each infection. It is generally believed that these
metamorphic viruses are extremely difficult to detect. Metamorphic virus generating kits
are readily available, so that little knowledge or skill is required to create these
potentially devastating viruses.
In this project, we first analyze four virus creation kits to determine the degree of
metamorphism provided by each. We are able to precisely quantify the degree of
metamorphism produced by these virus generators. While the best generator, the Next
Generation Virus Creation Kit (NGVCK), produces virus variants that differ greatly from
one another, the other three generators we examined are much less effective.
INTRODUCTION
“A computer virus is a program that recursively and explicitly copies a possibly evolved
version of itself” [19]. A virus copies itself to a host file or system area. Once it gets
control, it multiplies itself to form newer generations. A virus may carry out damaging
activities on the host machine such as corrupting or erasing files, overwriting the whole
hard disk, or crashing the computer. Some viruses may print text on the screen or simply
do nothing. These viruses remain harmless but keep reproducing themselves. In any case,
viruses are undesirable for computer users.
Over the past two decades, the number of viruses has been increasing rapidly. We have
seen several attacks that caused great disruption to the Internet and brought huge damage
to organizations and individuals. For example, in 1999, the infamous Melissa virus
infected thousands of computers and caused damage close to $80 million, while the Code
Red worm outbreak in 2001 affected systems running Windows NT and Windows 2000
server and caused damage in excess of $2 billion [23]. Computer virus attacks will
continue to pose a serious security threat to every computer user.
Encrypted Viruses
The simplest way to change the appearance of a virus is to use encryption. An encrypted
virus consists of a small decrypting module (a decryptor) and an encrypted virus body. If
a different encryption key is used for each infection, the encrypted virus body will look
different. Typically, the encryption method is rather simple, such as an XOR of the key
with each byte of the virus body. Simple XOR is very practical because XORing the
encrypted code with the key again gives back the original code, so a virus can use the
same routine for both encryption and decryption.
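The symmetry of XOR described above can be illustrated with a minimal sketch. The function name xor_transform, the single-byte key, and the placeholder byte string are assumptions made for illustration only; they are not taken from any particular virus or kit.

```python
def xor_transform(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key (hypothetical helper)."""
    if not 0 <= key <= 0xFF:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in data)


# Placeholder bytes standing in for a program body (assumption, not real code).
body = b"placeholder program body"

# Two different keys give two different-looking encrypted bodies.
enc_a = xor_transform(body, 0x5A)
enc_b = xor_transform(body, 0xC3)

# Applying the same routine with the same key restores the original bytes.
dec_a = xor_transform(enc_a, 0x5A)
```

Because the same routine serves for both directions, only the key has to change between infections for the encrypted body to look different, while the decryptor itself stays constant.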
FUTURE WORK
We trained our models on disassembled virus executables. The disassembling process can
take some time and the results depend on the quality of the disassembler. To speed up
virus pre-processing and to eliminate the reliance on a particular disassembler, we could
attempt to train the HMMs directly on the binary code of the viruses. Other machine
learning techniques, such as data mining or neural networks, might also work directly on
the binaries.
Training on raw executable byte sequences is more challenging as these byte sequences
are longer and contain more irrelevant parts. We can train the models using only the code
segments and perhaps the data segments, excluding headers and other kinds of
identification information, since the behavior of a program is primarily determined by its
code segments.
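As a rough sketch of this idea, the raw bytes of a code segment could be turned into an observation sequence over 256 symbols, one per byte value. The function name, the toy image, and the offset and size values below are hypothetical; in practice they would have to be read from the executable's section table, which this sketch does not attempt to parse.

```python
from typing import List


def code_bytes_to_observations(image: bytes, code_offset: int, code_size: int) -> List[int]:
    """Return the code segment's bytes as integer symbols in the range 0..255.

    code_offset and code_size are assumed to be known already (hypothetical
    inputs); everything outside that range, such as headers, is skipped.
    """
    if code_offset < 0 or code_size < 0 or code_offset + code_size > len(image):
        raise ValueError("code segment lies outside the image")
    return list(image[code_offset:code_offset + code_size])


# Toy 16-byte "image": 4 header bytes, 8 code bytes, 4 trailing bytes (assumed layout).
image = bytes(range(16))
observations = code_bytes_to_observations(image, code_offset=4, code_size=8)
```

A sequence of this form could then be passed to whatever HMM training routine is in use, in place of the opcode sequences obtained from disassembly.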
To more thoroughly evaluate the performance of the HMM approach, it would be useful
to test on a larger set of virus variants and also test on different types of viruses. Ideally,
we would like to find viruses that are similar to normal programs to a degree that the
similarity index alone cannot distinguish the viruses from normal code. Only with such
data can we evaluate the effectiveness of the HMM approach to detecting metamorphic
viruses. However, it appears that no metamorphic kit available today is capable of
producing such challenging viral code.