18-04-2012, 01:35 PM
IMPROVING SOFTWARE SECURITY WITH PRECISE STATIC AND RUNTIME ANALYSIS
Introduction
The security of Web applications has become increasingly important in the last
decade. Web applications are rapidly becoming the norm for a wide range of software
development projects, as client-server solutions are getting less popular. More
and more Web-based enterprise applications deal with sensitive financial and medical
data, which, if compromised, can cause significant downtime and millions of dollars
in damages. It is crucial to protect these applications from hacker attacks.
The current state of application security leaves much to be desired. The 2002 Computer
Crime and Security Survey conducted by the Computer Security Institute and
the FBI revealed that, on a yearly basis, over half of all databases experience at least
one security breach and an average episode results in close to $4 million in losses [37].
The survey also noted that Web crime has become commonplace. Web crimes range
from cyber-vandalism (e.g., Web site defacement) at the low end, to theft of sensitive
information and financial fraud at the high end.
A recent penetration testing study performed by the Imperva Application Defense
Center included more than 250 Web applications from e-commerce, online banking,
enterprise collaboration, and supply chain management sites [200]. The study's vulnerability
assessment concluded that at least 92% of Web applications are vulnerable to
some form of hacker attack. Security compliance of application vendors is especially
important in light of recent U.S. industry regulations such as the Sarbanes-Oxley Act
pertaining to information security [20, 70].
According to the 2005 E-Crime Watch survey conducted in cooperation with the
United States Secret Service, 43% of respondents reported an increase in e-crimes
and intrusions over the previous year [138]. Overall, 70% of respondents reported that at
least one e-crime or intrusion was committed against their organization. During the
first six months of 2005, malicious code that exposed confidential information represented
74% of the top 50 malicious code samples, according to Symantec’s Internet
Security Threat Report Volume VIII [40]. The report also documents 1,872 vulnerabilities
in the first half of 2005, the most ever recorded since the inception of the
report. Despite this sampling of data pointing to the increasing threat of directed
attacks, the threat is likely still understated. Many directed attacks go unreported
for the following reasons:
• Many organizations try to suppress the fact that they were attacked in the hope
of avoiding negative publicity and damage to their reputation.
• Many organizations that have been attacked simply do not know that they have
been the victim of a targeted attack.
While a great deal of attention over the last decade has been given to network-level
attacks such as port scanning, about 75% of all attacks against Web servers
target Web-based applications, according to a recent survey [89]. It is easy to underestimate
the potential level of risk associated with sensitive information within
databases accessed through Web applications until a severe security breach actually
occurs. Traditional defense strategies such as firewalls do not protect against Web
application attacks, as these attacks rely solely on HTTP traffic, which is usually
allowed to pass through firewalls unhindered. Thus, attackers typically have a direct
line to Web applications.
Many projects in the past focused on guarding against problems caused by
the unsafe nature of C, such as buffer overruns and format string vulnerabilities
[41, 177, 194]. However, in recent years, Java has emerged as the language
of choice for building large complex Web-based systems, in part because of language
safety features that disallow direct memory access and eliminate problems such as
buffer overruns. Platforms such as J2EE (Java 2 Enterprise Edition), Struts, WebWorks,
and Tapestry also helped to promote the adoption of Java as a language for
implementing e-commerce applications such as Web stores, banking sites, customer
information management sites, etc.
A typical Web application accepts input from the user browser and interacts with
a back-end database to serve user requests; J2EE libraries make these common tasks
easy to implement. However, despite the Java language’s safety, it is possible to make
logical programming errors that lead to vulnerabilities such as SQL injections [6, 7, 59]
and cross-site scripting attacks [33, 87, 179]. Discovered several years ago, these attack
techniques are now commonly used by malicious hackers to create exploits. A score
of recently discovered vulnerabilities can be attributed to these attacks. A simple
programming mistake can leave a Web application vulnerable to unauthorized data
access, unauthorized updates or deletion of data, and application crashes leading to
denial-of-service attacks. Moreover, certain types of attacks may result in the attacker
gaining complete control over the underlying system.
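To make the kind of logical programming error described above concrete, the following is a minimal sketch of a SQL injection in Java. It is an illustration under stated assumptions, not code from the thesis: the class name, the method names, and the `users` table with its `name` column are all hypothetical. The first method concatenates user input into the query text, so a crafted input changes the structure of the query; the second uses the standard JDBC `PreparedStatement` placeholder mechanism, which keeps the input as data.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch; the "users" table and "name" column are assumptions.
class SqlInjectionSketch {

    // Vulnerable: user input is pasted directly into the SQL text.
    static String buildUnsafeQuery(String userName) {
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    // Safer: the query text is fixed and the input is bound as a parameter.
    // Requires a live JDBC connection, so it is not exercised in main().
    static PreparedStatement buildSafeQuery(Connection conn, String userName)
            throws SQLException {
        PreparedStatement stmt =
                conn.prepareStatement("SELECT * FROM users WHERE name = ?");
        stmt.setString(1, userName);
        return stmt;
    }

    public static void main(String[] args) {
        // Benign input yields the query the programmer intended.
        System.out.println(buildUnsafeQuery("alice"));
        // Crafted input closes the string literal and adds a condition
        // that is always true, so the WHERE clause no longer filters rows.
        System.out.println(buildUnsafeQuery("' OR '1'='1"));
    }
}
```

With the crafted input, the unsafe method produces `SELECT * FROM users WHERE name = '' OR '1'='1'`, which would match every row in the hypothetical table. The parameterized version avoids this because the database receives the query structure and the user-supplied value separately.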
The fact that many applications are deployed on external sites greatly increases
the perimeter of the attack. Many Web sites need to be made public in order to
give the necessary access to their customers; however, this also exposes them to
malicious hackers. A good example of this is a recent attack on a government Web
site ri.gov perpetrated by a college student living in Eastern Europe [170]. The
enabling mechanism for the attack was the presence of a SQL injection vulnerability,
which allowed the hacker to discover the structure of database tables and then to
execute a shell command through the underlying SQL server database. This incident
led to the theft of hundreds of credit card numbers.
Figure 1.1: Number of vulnerabilities reported by year, 1995 to 2005 (based on NIST/DHS data).
Web Application Vulnerability Statistics
To further motivate our focus on Web application vulnerabilities in this thesis, this
section presents statistics that demonstrate how common various categories of vulnerabilities
are. While there are many cataloguing sites that collect vulnerability reports,
reliable statistics on the frequency of different vulnerability categories are hard to come
by. Below we report on the data obtained from some publicly available sources that
shed light on this matter.
NIST Study
The National Institute of Standards and Technology (NIST) and the Department of
Homeland Security have been aggregating vulnerability data for many years [156].
Statistics summarizing the total number of vulnerabilities in the NIST database are
presented in Figure 1.1. While the numbers for 2006 are not yet available at the time
of this writing, the number of vulnerabilities is projected to be higher than in 2005.
As can be seen from Figure 1.1, there is a slight decline in the number of vulnerabilities
in 2003, followed by a sharp increase in 2004 and 2005. While it is difficult to
validate these claims precisely, one interpretation of this anomaly is that the decline
in 2003 is attributable to most of the “shallow” buffer overruns having been found. The sharp
increase in 2004 can be attributed to Web application vulnerabilities becoming commonplace.
Figure: Relative frequency of vulnerabilities in the SecurityFocus.com sample. Categories shown:
Input Validation, Denial of Service, Other, File Include, Authentication Bypass,
Temp. File Manipulation, Memory Corruption, Unauthorized Access, Privilege Escalation,
Heap Overflow.
SecurityFocus.com Study
To gain insight into the relative frequencies of different vulnerability types, we considered
a sample of 500 vulnerability reports obtained from the SecurityFocus.com
database. This vulnerability sample spans a week in November 2005. The format of
the vulnerability database allows for relatively easy processing and classification of
this data. A coarse classification of these vulnerabilities is shown in the accompanying
figure. It is apparent from the figure that input and output validation vulnerabilities account
for over 50% of the sample.
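The relative-frequency computation behind such a classification can be sketched as follows. This is only an illustration of the tallying step, not the processing actually used for the SecurityFocus.com sample: the class and method names are hypothetical, and the four labels in `main` are made-up placeholder data, not figures from the study.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of tallying vulnerability reports by category.
class VulnerabilityTally {

    // Counts how many reports carry each category label.
    static Map<String, Integer> countByCategory(List<String> labels) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        return counts;
    }

    // Returns the share of one category as a percentage of all reports.
    static double percentage(Map<String, Integer> counts, String category) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        if (total == 0) {
            return 0.0; // avoid division by zero on an empty sample
        }
        return 100.0 * counts.getOrDefault(category, 0) / total;
    }

    public static void main(String[] args) {
        // Placeholder labels for illustration only; not data from the study.
        List<String> labels = Arrays.asList(
                "Input Validation", "Input Validation",
                "Denial of Service", "Other");
        Map<String, Integer> counts = countByCategory(labels);
        for (String category : counts.keySet()) {
            System.out.println(category + ": " + percentage(counts, category) + "%");
        }
    }
}
```

Each report is assumed here to carry exactly one category label; a report that fits several categories would need a different counting rule.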