28-05-2013, 04:20 PM
Software reverse engineering education
Introduction
From very early on in life we engage in constant investigation of existing things
to understand how and even why they work. The practice of Software Reverse
Engineering (SRE) calls upon this investigative nature when one needs to learn how and
why an existing piece of software, whether helpful or malicious, works, often in the
absence of adequate documentation. The sections that follow cover the most popular uses of
SRE and, to some degree, the importance of imparting knowledge of them to those who
write, test, and maintain software. More formally, SRE can be described as the practice
of analyzing a software system to create abstractions that identify the individual
components and their dependencies, and, if possible, the overall system architecture [1],
[2]. Once the components and design of an existing system have been recovered, it
becomes possible to repair and even enhance them.
Events in recent history have caused SRE to become a very active area of
research. In the early nineties, the Y2K problem spurred the development of tools that
could scan large amounts of source or binary code for the 2-digit year vulnerability [2].
Shortly after the preparation for the Y2K problem, in the mid to late nineties, the
adoption of the Internet by businesses and organizations brought about the need to
understand in-house legacy systems so that the information held within them could be
made available on the Web [3]. The desire of businesses to expand to the Internet, with
its promised limitless potential for new revenue, led to the creation of many Business to
Consumer (B2C) web sites.
Reverse Engineering in Software Development
While a great deal of software that has been written is no longer in use, a
considerable amount has survived for decades and continues to run the global economy.
According to [3], 70% of the world's source code is written in COBOL. One would be
hard-pressed these days to obtain an expert education in legacy programming languages
like COBOL, PL/I, and FORTRAN. Compounding the situation, a great deal of legacy code
is poorly designed and documented [3]. As [6] states, "COBOL programs are in use
globally in governmental and military
agencies, in commercial enterprises, and on operating systems such as IBM's z/OS®,
Microsoft's Windows®, and the POSIX families (Unix/Linux etc.). In 1997, the Gartner
Group reported that 80% of the world's business ran on COBOL with over 200 billion
lines of code in existence and with an estimated 5 billion lines of new code annually."
Since it's cost-prohibitive to rip and replace billions of lines of legacy code, the only
reasonable alternative has been to maintain and evolve the code, often with the help of
concepts found in software reverse engineering. Fig. 2.1 illustrates a process a software
engineer might follow when maintaining legacy software systems.
Reverse Engineering in Software Security
From the perspective of a software company, it is highly desirable that the
company's products are difficult to pirate and reverse engineer. Making software difficult
to reverse engineer seems to be in conflict with the idea of being able to recover the
software's design later on for maintenance and evolution. Therefore, software
manufacturers usually don't apply anti-reverse engineering transformations to software
binaries until the product is packaged for shipment to customers. Software manufacturers will
typically only invest time in making software difficult to reverse engineer if there are
particularly interesting algorithms that make the product stand out from the competition.
Making software difficult to pirate or reverse engineer is often a moving target
and requires special skills and understanding on the part of the developer. Software
developers who are given the opportunity to practice anti-reversing techniques might be
in a better position to help their employer, or themselves, protect their intellectual
property. As [3] states, "to defeat a crook you have to think like one."
Control Flow Obfuscation for the Record Limit Check
We introduce some non-essential, recursive, and randomized logic to the
password limit check in PasswordVault.cpp to make it more difficult for a reverser to
perform static or live analysis. A design for obfuscated control flow logic that
ultimately implements the trial limitation check is given in Fig. 7.3. Since no standards
exist for control flow obfuscation, this algorithm was designed by the author using the
cyclomatic complexity metric defined by McCabe [24] as a general guideline for creating
a highly complex control flow graph for the trial limitation check.
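The general idea can be sketched in C++ as follows. This is not the design
shown in Fig. 7.3, which is not reproduced here; it is a minimal illustration
under stated assumptions. The names kTrialRecordLimit, limitReachedPlain,
alwaysTrue, and limitReachedObfuscated are hypothetical, as is the limit of 5
records. The sketch wraps a simple comparison in randomized recursion, a decoy
branch guarded by an opaque predicate, and a loop, each of which adds decision
points and therefore raises the cyclomatic complexity of the check.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical trial limit; the actual value used in PasswordVault.cpp
// is not stated in the text.
constexpr std::size_t kTrialRecordLimit = 5;

// The plain check: a single comparison that is easy to locate and patch.
bool limitReachedPlain(std::size_t recordCount) {
    return recordCount >= kTrialRecordLimit;
}

// Opaque predicate: x * (x + 1) is a product of consecutive integers and
// is therefore always even, including when unsigned arithmetic wraps.
// Note: an optimizing compiler may fold this away; a real obfuscator
// would have to take steps to prevent that.
bool alwaysTrue(unsigned x) {
    return ((x * (x + 1u)) % 2u) == 0u;
}

// Same result as limitReachedPlain, reached through non-essential,
// recursive, and randomized control flow.
bool limitReachedObfuscated(std::size_t recordCount, std::mt19937& rng,
                            int depth = 0) {
    assert(depth >= 0);
    const unsigned noise = static_cast<unsigned>(rng());

    // Randomized detour: recurse up to three times without changing the
    // arguments that matter, so each run follows a different path.
    if (depth < 3 && (noise & 1u)) {
        return limitReachedObfuscated(recordCount, rng, depth + 1);
    }

    // Decoy branch: never taken, because the predicate is always true.
    if (!alwaysTrue(noise)) {
        return recordCount == 0;  // dead code meant to mislead a reader
    }

    // The real comparison, spread over a loop instead of one instruction.
    std::size_t remaining = kTrialRecordLimit;
    for (std::size_t i = 0; i < recordCount && remaining > 0; ++i) {
        --remaining;
    }
    return remaining == 0;
}
```

A caller would construct a std::mt19937 once and pass it in, for example
limitReachedObfuscated(count, rng). Because the detours depend on the random
generator, two runs with the same record count may take different paths, which
is intended to complicate live analysis, while the decoy branch and the loop
are intended to complicate static analysis. The result is always the same as
that of the plain comparison.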
Conclusion
Unless something is done to include a required amount of reverse engineering
instruction in computer science and software engineering programs of study, new
engineers will remain ill-equipped to work with legacy software systems and unable to
ensure that software is secure and safe to deploy. Most large companies have existing
software systems that have underpinned their business for years. It is extremely
difficult, not to mention cost-prohibitive, to rip and replace mission-critical
software systems in response to the emergence of a new technology. As a result,
organizations are always looking for candidates who can help them understand what they
have and how it can be evolved to interact with the latest technologies. Students and
practicing engineers need reverse engineering skills to be able to help organizations, both
large and small, understand their current technology stack and recommend an integration
strategy for new technologies.