22-02-2013, 09:52 AM
Dynamic Web Application Analysis for Cross Site Scripting Detection
Abstract
Though cross site scripting (XSS) is essentially a server-side problem, in most cases users
are the ones who suffer. Additionally, most Anti-XSS measures developed so far require
either a major customization effort or modifications to the Web Application. This thesis
presents a general XSS detector able to automatically derive all required Web-Application-specific
knowledge. Data-mining techniques are employed to analyse Web Applications in a
script-focused way, which only necessitates access to unencrypted HTTP traffic. Wherever
such access is available, the system can thus be used as a straightforwardly deployable, Anomaly-based
Intrusion Detection Sensor. This can help to find XSS vulnerabilities in very different application
environments, and in future, the detector may even be implemented in a browser to
form a pure client-side XSS protection.
Introduction
Pages of dynamic Web Applications often contain user-supplied parts. When these are
insufficiently filtered, malicious scripts can be injected along with them. If such scripts are
executed in a user's browser, it is possible to exploit the trust relationship between user
and Webserver by, for instance, compromising authentication credentials. This type of attack
is called Cross Site Scripting (XSS) and, although it originates in a failure in the Web
Application, clearly jeopardizes its users.
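The injection described above can be sketched in a few lines. This is a minimal illustration, not from the thesis: the function name and payload are made up, and the escaping shown is only the standard neutralization of HTML metacharacters.

```javascript
// Minimal sketch: how unescaped user input enables XSS, and how
// HTML-escaping renders the same input inert. Names are illustrative.
function escapeHtml(s) {
  // Replace the characters that let input break out of an HTML text context.
  return s.replace(/&/g, '&amp;')
          .replace(/</g, '&lt;')
          .replace(/>/g, '&gt;')
          .replace(/"/g, '&quot;');
}

const userInput = '<script>stealCookies()</script>'; // hypothetical payload

// Naive insertion: the payload becomes part of the page's markup and
// would be executed by the victim's browser.
const vulnerable = '<p>Hello, ' + userInput + '</p>';

// Escaped insertion: the payload is displayed as plain text instead.
const safe = '<p>Hello, ' + escapeHtml(userInput) + '</p>';
```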
It was discovered in 2001, and since then has become by far the most common Web Application
vulnerability [7]. A lot of research was conducted for ways to develop more secure Web
Applications and fix existing ones. All approaches, however, require modification of the application
or other complex and time-consuming efforts. Even though new applications nowadays
can use modern frameworks with integrated support for such methods, a huge number of old
applications on the Internet still has to be ported. Probably due to the situation of misplaced
incentives, many application providers are opting only to fix vulnerabilities they are notified
about, instead of investing in the security of their sites. At present, the XSS disclosure rate
is still on the rise.
In this thesis, two conceptually different approaches for detecting XSS attacks are proposed.
All necessary information is derived by applying data-mining techniques to plain HTTP traffic,
thus requiring no Web Application modifications and allowing a straightforward deployment.
Additionally, both methods can be applied outside the provider's sphere of influence, as long
as access to unencrypted HTTP traffic is available.
World Wide Web
The World Wide Web (WWW) is a hypertext system invented in 1989 by Tim Berners-Lee
at CERN. It is the most commonly used application on today's Internet. A large number
of so-called Webservers offer their clients (called Webbrowsers) documents of varying
types and contents. Every document is assigned one or more unique addresses (called URLs),
containing, amongst others, the name of a server from which the document can be queried.
Most of these documents are Webpages written in HTML, which allows references to other documents
using unidirectional hyperlinks. In this way, a world-wide network of interconnected
documents is created, hence the name WWW. A number of Webpages which are accessible
under the same domain and form a coherent set of information are referred to as a 'Website'.
The WWW (or the 'Web', as it is often called) is the technical foundation on which both
Cross Site Scripting attacks (discussed in Section 2.2) and the concepts in Chapter 4 are based.
The information in this Section is crucial for understanding how vulnerable Webservers
jeopardize their users.
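The URL structure mentioned above can be shown concretely. This is a small sketch using the WHATWG URL class available in modern browsers and Node.js; the example address is made up.

```javascript
// Sketch: a URL contains, amongst others, the name of the server
// from which the document can be queried.
const url = new URL('http://www.example.com/docs/page.html?id=42');

const server = url.hostname; // server to send the HTTP request to
const path = url.pathname;   // location of the document on that server
```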
HTTP
The HyperText Transfer Protocol (HTTP) is used by browsers to query documents from the
servers and therefore constitutes the very foundation of the WWW. It specifies a simple
request-response scheme, offering a variety of methods for adaptation to different purposes.
While the GET-method for querying documents is certainly the most commonly known and
used one, there is, for instance, also a POST-method for transmitting data to the server.
Independent of the method used, however, the client sends a URL to the server, which in
turn answers with a 3-digit response-code and a document. Both request and response can
additionally contain any number of name-value pairs in their header, used, for example, for
implementing cookies (see Subsection 2.1.4). If the request could be processed successfully,
the server answers with the response-code 200 - 'OK'. Otherwise, the code is used to
give information about what type of error occurred: 404 means 'document not found',
401 stands for 'authentication required', and 500 indicates an internal server error.
Additionally, it is possible to forward the browser to
a different URL with the response-code 301 - 'moved permanently'. It should be stressed,
though, that every response contains a document, independent of the response-code. In case
of errors, this is usually used to explain the error in human-readable form.
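The response-codes just discussed can be summarized in a small lookup table; the reason phrases follow the HTTP specification.

```javascript
// Sketch: the 3-digit HTTP response-codes discussed above and their
// standard reason phrases.
const statusText = {
  200: 'OK',                    // request processed successfully
  301: 'Moved Permanently',     // browser is forwarded to a different URL
  401: 'Unauthorized',          // authentication required or failed
  404: 'Not Found',             // document not found
  500: 'Internal Server Error', // error while generating the response
};
```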
HTTP uses TCP-connections to transmit its data but, unlike TCP, is completely stateless.
Originally, the connection was closed after transmitting each request-response pair. Even
later, when persistent connections were introduced and the reuse of SSL-connections under
HTTPS demanded them for efficiency, the statelessness was preserved for compatibility reasons.
HTML
The HyperText Markup Language (HTML) is the language of Webpages in the WWW.
It was developed by the World Wide Web Consortium (W3C) up to version 4.01 and is
rendered by all popular browsers, more or less conforming to the standard. HTML documents
consist of text, so-called tags, and their attributes. Like XML documents, they exhibit a
tree-like structure and, besides pure text, can contain markup elements (bold, italic, etc.),
hyperlinks, as well as meta-information (language, author). Other documents, like images,
animations, music or other Webpages, can be included by their URLs. When a browser
displays such a Webpage, it has to send out a number of HTTP-requests, possibly to a
number of different servers.
Dynamic Websites
In contrast to static Webpages, dynamic ones are able to change over time. After the
request has been received, their content is generated by a program and can therefore change
depending on a number of factors. It can thus happen that a browser receives two different
documents when querying the same URL twice. For this reason, dynamic Webpages are
marked with special tags that prevent the browser from caching them.
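As a rough illustration of such cache suppression, the following sketch shows the kind of markup and headers a dynamic page might emit; the exact directives an application uses vary, so this is an assumption-laden example.

```javascript
// Hypothetical sketch: markup and headers a dynamic Webpage might use so
// that two requests for the same URL are never served from the cache.
const noCacheMeta =
  '<meta http-equiv="Cache-Control" content="no-cache">' +
  '<meta http-equiv="Pragma" content="no-cache">';

const noCacheHeaders = {
  'Cache-Control': 'no-store, no-cache, must-revalidate',
  'Pragma': 'no-cache', // HTTP/1.0 backwards compatibility
};
```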
Authentication & Sessions
To allow access restrictions, a feature called HTTP-Authentication was introduced as early
as HTTP/1.0: if a browser requests a restricted document, the server responds with response-code
401 - 'unauthorized' and the information as to which 'realm' the document belongs.
Every document within this realm can only be accessed if the appropriate credentials
(which the browser requests from its user) are included in the request header.
To allow restricted realms to be used conveniently, sessions were implemented: once the
credentials have been requested from the user, the browser stores them and automatically
includes them in every request for a document of the restricted realm. After a first successful
authentication (log-in), the user therefore does not need to bother about the restriction any
more. Mechanisms working like this are referred to as 'implicit authentication'.
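The credential inclusion described above can be sketched for Basic HTTP-Authentication, where the browser Base64-encodes 'username:password' into the 'Authorization' header. The username and password here are made up for illustration; Node's Buffer is used for the encoding.

```javascript
// Sketch: how a browser encodes Basic-Authentication credentials
// into the request header. Credentials are illustrative.
const username = 'alice';
const password = 'secret';

// Base64-encode 'username:password', as the Basic scheme prescribes.
const encoded = Buffer.from(username + ':' + password).toString('base64');
const header = 'Authorization: Basic ' + encoded;
```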
Most modern Web Applications, though, are (mis)using a different mechanism to authenticate
their users and provide authenticated sessions: so-called HTTP-cookies [13]. A cookie
is a small chunk of data (max. 4096 bytes according to the cookie specification [15]),
included in the header of some HTTP-response ('Set-Cookie'-field) and assigned a domain
as well as an expiration date. Cookies are - if permitted by the user - stored in the browser
and, very much like the authentication credentials, automatically included ('Cookie'-field)
in every further request the browser sends within the validity period to a server of the
same domain (or a subdomain).
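The cookie round trip can be sketched as follows. The cookie name, value and attributes are made up, and the parsing is deliberately simplified to the name-value pair that the browser echoes back.

```javascript
// Sketch: a server sets a cookie via a 'Set-Cookie' response header; the
// browser later includes the name-value pair in its 'Cookie' request header.
const setCookie =
  'SESSIONID=abc123; Domain=example.com; Expires=Fri, 01 Mar 2013 00:00:00 GMT';

// The name-value pair is the first ';'-separated part of the header value.
const [name, value] = setCookie.split(';')[0].split('=');

// Header the browser would send in subsequent requests to example.com:
const cookieHeader = 'Cookie: ' + name + '=' + value;
```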
Active Content
To allow Web Applications to execute parts of their logic on the client side, a number of
active-content technologies have been introduced over time. In classical client-server applications,
client-side logic has been used for several years to improve the application's responsiveness.
On the Web, JavaScript, Java-applets, Flash and similar technologies are likewise mainly
used to shorten response times and interact more closely with the user. Lately, however, an
increasing number of so-called 'Asynchronous JavaScript And XML (AJAX)' applications
have appeared, which move their entire application logic to the client side and use servers
as mere data storage or in the form of Web Services.
Common to all types of active content is that it has to be integrated into HTML-code
somehow. Binary formats like Macromedia Flash or the bytecode of Java-applets are usually
stored as separate documents on the server and integrated by giving their URL. The browser
then queries them along with the Webpage and passes them to the appropriate plugin for
execution.
JavaScript
The JavaScript language was invented for the Netscape Navigator by Brendan Eich in 1995.
Since then, it has been standardized as ECMAScript and is supported by all popular browsers.
JavaScript is the first - and nowadays the most popular - client-side language of the WWW.
It can be integrated into HTML-pages in a variety of ways1:
First, it is obviously possible to include external JavaScript: an empty script-tag is added
to the document and the URL is specified in its 'src'-attribute. As in the case of Flash or
Java-applets, the document from that URL is queried and its content executed.
The second possibility is so-called inlining: the code is enclosed by script-tags and
inserted directly into the document. In both cases, the browser stops the parsing process
when encountering the script-tag, executes the code, and inserts everything that was
'written' using the command document.write() directly after the closing script-tag. After
the script's termination, parsing continues right there with what has been inserted.
Additionally, most HTML-tags have been given so-called 'events', which JavaScript is able
to handle when a so-called event-handler is specified. If a tag is given an attribute with
the event's name ('onclick', 'onmouseover', 'onload', etc.) and the code is specified as its
value, the code will be executed every time the event occurs (the element is clicked, touched
by the mouse cursor, finishes loading, etc.).
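The three integration methods above can be shown side by side as HTML fragments, held here in strings for inspection; the URL and the code inside the fragments are made up.

```javascript
// Sketch: the three ways of integrating JavaScript into HTML-pages.

// 1. External script: an empty script-tag whose 'src'-attribute gives the URL.
const external = '<script src="http://example.com/code.js"></script>';

// 2. Inlining: the code is enclosed by script-tags inside the document;
// document.write() output is inserted after the closing tag.
const inline = '<script>document.write("<b>inserted</b>");</script>';

// 3. Event handler: code in an attribute named after the event, executed
// every time the event occurs (here: the link is clicked).
const handler = '<a href="/page" onclick="alert(1)">link</a>';
```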