INTRODUCTION
1.1 THREE TIER ARCHITECTURE
WebSphere Application Server provides the application logic layer in a three tier architecture, enabling client components to interact with data resources and legacy applications. Collectively, three tier architectures are programming models that enable the distribution of application functionality across three independent systems: client components running on local workstations (tier one), processes running on remote servers (tier two), and a discrete collection of databases, resource managers, and mainframe applications (tier three). These tiers are logical tiers; they might or might not be running on the same physical server.
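The separation described above can be sketched in a few lines: the client tier never touches the data tier directly, and all access flows through the application logic tier. This is a minimal illustration; the function names and the order data are invented for the example.

```python
# Minimal sketch of the three logical tiers. Tier 1 (client) calls only
# tier 2 (application logic), which alone is permitted to reach tier 3.
def tier3_query(order_id):                  # data tier: databases, resources
    return {"order": order_id, "status": "billed"}

def tier2_place_order(order_id):            # application logic tier
    # validation, billing and shipping logic would live here
    return tier3_query(order_id)

def tier1_submit_form(order_id):            # presentation tier
    return tier2_place_order(order_id)

result = tier1_submit_form(42)
```

Because the tiers are logical, all three functions could run in one process or be split across three machines without changing the call pattern.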
1.1.1 FIRST TIER
Responsibility for presentation and user interaction resides with the first tier components. These client components enable the user to interact with the second tier processes in a secure and intuitive manner. WebSphere Application Server supports several client types. Clients do not access the third tier services directly. For example, a client component provides a form on which a customer orders products. The client component submits this order to the second tier processes, which check the product databases and perform the tasks needed for billing and shipping.
1.1.2 SECOND TIER
The second tier processes are commonly referred to as the application logic layer. These processes manage the business logic of the application and are permitted access to the third tier services. The application logic layer is where most of the processing work occurs. Multiple client components can access the second tier processes simultaneously, so this application logic layer must manage its own transactions.
1.1.3 THIRD TIER
The third tier services are protected from direct access by the client components by residing within a secure network. Interaction must occur through the second tier processes.
1.2 INTRODUCTION ABOUT THE SYSTEM
Web based attacks have recently become more diverse, as attention has shifted from attacking the front end to exploiting vulnerabilities of web applications in order to corrupt the back-end database system (e.g., SQL injection attacks). A plethora of Intrusion Detection Systems (IDS) currently examine network packets individually within both the web server and the database system. However, there is very little work being performed on multi-tiered Anomaly Detection (AD) systems that generate models of network behavior for both web and database network interactions. In such multi-tiered architectures, the back-end database server is often protected behind a firewall while the web servers are remotely accessible over the Internet. Unfortunately, though they are protected from direct remote attacks, the back-end systems are susceptible to attacks that use web requests as a means to exploit the back end. To protect multi-tiered web services, Intrusion Detection Systems (IDS) have been widely used to detect known attacks by matching misused traffic patterns or signatures. A class of IDS that leverages machine learning can also detect unknown attacks by identifying abnormal network traffic that deviates from the so-called normal behavior previously profiled during the IDS training phase.
Individually, the web IDS and the database IDS can detect abnormal network traffic sent to either of them. However, these IDS cannot detect cases wherein normal traffic is used to attack the web server and the database server. For example, if an attacker with non-admin privileges can log in to a web server using normal-user access credentials, he/she can find a way to issue a privileged database query by exploiting vulnerabilities in the web server. Neither the web IDS nor the database IDS would detect this type of attack, since the web IDS would merely see typical user login traffic and the database IDS would see only the normal traffic of a privileged user. This type of attack can be readily detected if the database IDS can identify that a privileged request from the web server is not associated with user-privileged access. Unfortunately, within the current multi-threaded web server architecture, it is not feasible to detect or profile such causal mapping between web server traffic and DB server traffic, since traffic cannot be clearly attributed to user sessions [10].
1.2.1 DOUBLE GUARD DETECTION
This project presents DoubleGuard, a system used to detect attacks in multi-tiered web services. Our approach can create normality models of isolated user sessions that include both the web front-end (HTTP) and back-end (File or SQL) network transactions. To achieve this, we employ a lightweight virtualization technique to assign each user's web session to a dedicated container, an isolated virtual computing environment. We use the container ID to accurately associate the web request with the subsequent DB queries. Thus, DoubleGuard can build a causal mapping profile by taking both the web server and DB traffic into account. We have implemented our DoubleGuard container architecture using OpenVZ, and performance testing shows that it has reasonable performance overhead and is practical for most web applications. It also provides an isolation that prevents future session-hijacking attacks. Within a lightweight virtualization environment we ran many copies of the web server instances in different containers so that each one was isolated from the rest.
Since containers can be easily instantiated and destroyed [8], we assigned each client session a dedicated container so that, even when an attacker is able to compromise a single session, the damage is confined to the compromised session; other user sessions remain unaffected. Using our prototype we show that, for websites that do not permit content modification from users, there is a direct causal relationship between the requests received by the front-end web server and those generated for the database back end. In fact, we show that this causality mapping model can be generated accurately and without prior knowledge of web application functionality.
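The container-ID bookkeeping described above can be illustrated with a short sketch. The class name `SessionRouter` and the data layout are our own invention, not DoubleGuard's actual code; the point is only that tagging every request and query with its session's container ID makes the causal pairing trivial.

```python
import itertools

class SessionRouter:
    """Illustrative sketch: assign each client session a dedicated container
    ID and record all of that session's traffic under it, so web requests
    and DB queries can be causally paired per session."""
    def __init__(self):
        self._next_id = itertools.count(1)
        self.sessions = {}   # session key -> container ID
        self.traffic = {}    # container ID -> {"requests": [...], "queries": [...]}

    def container_for(self, session_key):
        if session_key not in self.sessions:
            cid = next(self._next_id)
            self.sessions[session_key] = cid
            self.traffic[cid] = {"requests": [], "queries": []}
        return self.sessions[session_key]

    def log_request(self, session_key, http_request):
        self.traffic[self.container_for(session_key)]["requests"].append(http_request)

    def log_query(self, session_key, sql_query):
        self.traffic[self.container_for(session_key)]["queries"].append(sql_query)

router = SessionRouter()
router.log_request("alice", "GET /product?id=7")
router.log_query("alice", "SELECT * FROM products WHERE id = 7")
router.log_request("bob", "GET /login")
# Alice's request and query share one container; Bob's traffic is isolated.
```

Because each container holds exactly one session's traffic, compromising one container reveals nothing about, and cannot taint, the traffic of any other.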
1.3 CONTAINERS AND LIGHT WEIGHT VIRTUALIZATION
Virtualization is the act of making a set of processes believe that it has a dedicated system to itself. There are a number of approaches being taken to the virtualization problem, with Xen, VMware, and User-mode Linux being some of the better known options. Those are relatively heavyweight solutions, however, with a separate kernel being run for each virtual machine. Often, that is exactly the right solution to the problem; running independent kernels gives strong separation between environments and enables running multiple operating systems on the same hardware. Full virtualization and para-virtualization are not the only approaches being taken, however. An alternative is lightweight virtualization, generally based on some sort of container concept. With containers, a group of processes still appears to have its own dedicated system, but it is really running in a specially isolated environment. All containers run on top of the same kernel. With containers, the ability to run different operating systems is lost, as is the strong separation between virtual systems.
For example, one might not want to give root access to processes running within a container environment. On the other hand, containers can have considerable performance advantages, enabling large numbers of them to run on the same physical host.
1.4 OBJECTIVE
In this project, we propose an efficient IDS called the DoubleGuard system, which models the network behavior of user sessions in multilayered web applications across both front-end web (HTTP) requests and back-end database (SQL) queries.
SYSTEM ANALYSIS
2.1 EXISTING SYSTEM
Intrusion detection systems currently examine network packets individually within both the web server and the database system. However, there is very little work being performed on multi-tiered Anomaly Detection systems that generate models of network behavior for both web and database interactions. In such multi-tiered architectures, the back-end database server is often protected behind a firewall while the web servers are remotely accessible over the Internet. Unfortunately, though they are protected from direct remote attacks, the back-end systems are susceptible to attacks that use web requests as a means to exploit the back end. The web IDS would merely see typical user login traffic, and the database IDS would see only the normal traffic of a privileged user. The existing system detects intrusions or vulnerabilities by statically analyzing the source code or executable.
2.1.1 CLASSIC 3 TIER MODEL
In the existing system, the communication between the web server and the database server is not separated, and the relationships among them are hard to understand. In figure 2.1, if Client 2 is malicious and takes over the web server, all subsequent database transactions become suspect, as well as the response to the client.
2.1.2 LIMITATION OF EXISTING SYSTEM
In the existing system, both the web and the database servers are vulnerable. Attacks come from the web clients, which launch application layer attacks to compromise the web servers they are connecting to. The attackers can bypass the web server to directly attack the database server. Attackers may take over the web server after the attack, afterwards obtaining full control of it to launch subsequent attacks. Attackers could modify the application logic of the web applications, eavesdrop on or hijack other users' web requests, or intercept and modify the database queries to steal sensitive data beyond their privileges.
2.2 PROPOSED SYSTEM
DoubleGuard detection uses both front-end and back-end detection methods. Some previous approaches have detected intrusions or vulnerabilities by statically analyzing the source code or executable; others dynamically track the information flow to understand taint propagation and detect intrusions. In DoubleGuard, the new container-based web server architecture enables us to separate the different information flows of each session by using lightweight virtualization. Within a lightweight virtualization environment we ran many copies of web server instances in different containers so that each one was isolated from the rest, separating the information flow of each session. This provides a means of tracking the information flow from the web server to the database server for each session. It is possible to initialize thousands of containers on a single machine. DoubleGuard detects SQL injection attacks by examining the structure of the web requests and database queries without looking into the values of input parameters. In our DoubleGuard, we utilize the container ID to separate session traffic as a way of extracting and identifying the causal relationship between web server requests and database query events. Our approach dynamically generates new containers and recycles used ones. As a result, a single physical server can run continuously and serve all web requests. However, from a logical perspective, each session is assigned to a dedicated web server and isolated from other sessions. Since we initialize each virtualized container using a read-only clean template, we can guarantee that each session will be served with a clean web server instance at initialization [4]. This system chooses to separate communications at the session level so that a single user always deals with the same web server.
Sessions can represent different users to some extent, and we expect the communication of a single user to go to the same dedicated web server, thereby allowing us to identify suspect behavior by both session and user. If the system detects abnormal behavior in a session, it will treat all traffic within this session as tainted.
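The claim above that SQL injection can be detected from query structure alone, without inspecting parameter values, can be sketched by masking literals before comparison. The regular expressions below are a simplification of real SQL tokenization; the function name and rules are ours, not DoubleGuard's.

```python
import re

def query_structure(sql):
    """Reduce a SQL query to its skeleton by masking literal values, so
    structurally identical queries compare equal regardless of input.
    (Illustrative sketch; a real system would use a SQL tokenizer.)"""
    s = re.sub(r"'[^']*'", "?", sql)   # mask string literals
    s = re.sub(r"\b\d+\b", "?", s)     # mask numeric literals
    return re.sub(r"\s+", " ", s).strip().upper()

normal   = query_structure("SELECT * FROM users WHERE id = 42")
injected = query_structure("SELECT * FROM users WHERE id = 42 OR 1=1")
# The injected query keeps different tokens after masking, so its
# skeleton deviates from the one learned during training.
```

Two benign queries that differ only in their parameter values share one skeleton, while an injected predicate survives the masking and is flagged as a structural deviation.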
2.2.1 ADVANTAGES
The proposed system is a well correlated model that provides an effective mechanism against different types of attacks and is also more accurate. The proposed system will also create a causal mapping profile by taking both the web server and DB traffic into account. It provides a better characterization for anomaly detection with the correlation of input streams, because the intrusion sensor has a more precise normality model that detects a wider range of threats. Containers are easily assigned and destroyed, and each client session is assigned to a separate container. If an attacker compromises a single session, the damage is confined to the compromised session; other sessions are unaffected. Only the particular user session under attack is affected, and the attacker can be identified easily.
2.3 FEASIBILITY STUDY
The main objective of this study is to determine whether the proposed system is feasible or not. There are three types of feasibility study to which the proposed system is subjected, as described below. Three key considerations are involved in this feasibility. The proposed system must be evaluated from a technical viewpoint first; if technically feasible, its impact on the organization must be assessed. If compatible, the operational system can be devised, which must then be tested for economic feasibility.
2.3.1 Operational Feasibility
The proposed project is beneficial because the software is the first of its class, so users are encouraged to use it, and it is expected to serve users' needs on request. The user interface is designed in such a way that users can operate it without confusion.
2.3.2 Technical Feasibility
The assessment of technical feasibility must be based on an outline design of system requirements in terms of input, output, files, programs, procedures and staff. These can be quantified in terms of volume of data, trends, frequency of updating, etc. Having identified an outline system, the investigator must go on to suggest the type of equipment required, methods of developing the system, and methods of running the system. With regard to processing facilities, the feasibility study will need to consider the possibility of using a bureau or, if in-house equipment is available, the nature of the hardware to be used for data collection, storage, output and processing. On the system development side, the feasibility study must consider the various ways of acquiring the system. These include the purchase of a package, or the use of a consultancy organization or software house to design the system and write the programs.
The technology required for developing the system is identified. It has technical capability to initialize the system and perform data transfer. It also provides technical guarantee of assurance, reliability, easy access and security. Thus, since both software and hardware requirements are satisfied, it is technically feasible.
2.3.3 Economical Feasibility
The justification for any capital expenditure is that it will increase profit, reduce expenditure or improve quality. A proposed or developing system must be justified by cost-benefit criteria, so that effort is concentrated on projects which will give the best return at the earliest opportunity.
The cost benefit analysis is often used as a basis for assessing economic feasibility.
It involves the following analysis:
• Cost of operation of the existing and proposed system.
• Cost of development of the proposed system.
• Value of the benefits of the proposed system.
The system is developed at reasonable cost with the available hardware, software and manpower, so its benefits outweigh the cost. It is therefore economically feasible.
CHAPTER 3
METHODOLOGY
3.1 CREATE CONTAINER MODEL
All network traffic, from both legitimate users and adversaries, is received intermixed at the same web server. If an attacker compromises the web server, he/she can potentially affect all future sessions (i.e., session hijacking). Assigning each session to a dedicated web server is not a realistic option, as it would deplete the web server resources. To achieve similar confinement while maintaining a low performance and resource overhead, we use lightweight virtualization. In our design, we make use of lightweight process containers, referred to as containers, as ephemeral, disposable servers for client sessions. It is possible to initialize thousands of containers on a single physical machine, and these virtualized containers can be discarded, reverted, or quickly reinitialized to serve new sessions. A single physical web server runs many containers, each one an exact copy of the original web server. Our approach dynamically generates new containers and recycles used ones. As a result, a single physical server can run continuously and serve all web requests. However, from a logical perspective, each session is assigned to a dedicated web server and isolated from other sessions [2]. Since we initialize each virtualized container using a read-only clean template, we can guarantee that each session will be served with a clean web server instance at initialization. We choose to separate communications at the session level so that a single user always deals with the same web server. Sessions can represent different users to some extent, and we expect the communication of a single user to go to the same dedicated web server, thereby allowing us to identify suspect behavior by both session and user. If we detect abnormal behavior in a session, we will treat all traffic within this session as tainted.
If an attacker compromises a vanilla web server, other sessions' communications can also be hijacked. In our system, an attacker can only stay within the web server container that he/she is connected to, with no knowledge of the existence of other session communications. We can thus ensure that legitimate sessions will not be compromised directly by an attacker.
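The assign-and-recycle life cycle described above can be sketched as a small pool. Here a container is modeled as a plain dictionary copied from a read-only template; the class name `ContainerPool` and the fields are illustrative, not the actual OpenVZ mechanism.

```python
class ContainerPool:
    """Hedged sketch: sessions get disposable containers instantiated from
    a clean template; recycled containers are reverted before reuse."""
    def __init__(self, template):
        self.template = template   # read-only clean template
        self.free = []             # reverted containers awaiting reuse
        self.active = {}           # session id -> container

    def assign(self, session_id):
        container = self.free.pop() if self.free else dict(self.template)
        self.active[session_id] = container
        return container

    def recycle(self, session_id):
        container = self.active.pop(session_id)
        container.clear()
        container.update(self.template)   # revert to the clean template
        self.free.append(container)

pool = ContainerPool({"webroot": "/var/www", "state": "clean"})
c1 = pool.assign("sess-1")
c1["state"] = "compromised"   # damage stays inside this one container
pool.recycle("sess-1")
c2 = pool.assign("sess-2")    # reused container starts clean again
```

The key property mirrored here is that reverting to the template on recycle guarantees each new session sees a clean web server instance, even if the previous occupant was compromised.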
3.2 BUILDING NORMALITY MODEL
The normality model depicts how communications are categorized as sessions and how database transactions can be related to a corresponding session. Client 2 will only compromise VE 2, and the corresponding database transaction set T2 will be the only affected section of data within the database, as shown in figure 3.1. This container-based and session-separated web server architecture not only enhances the security performance but also provides us with isolated information flows that are separated in each container session. It allows us to identify the mapping between the web server requests and the subsequent DB queries, and to utilize such a mapping model to detect abnormal behaviors on a session/client level. In typical three-tiered web server architecture, the web server receives HTTP requests from user clients and then issues SQL queries to the database server to retrieve and update data. These SQL queries are causally dependent on the web request hitting the web server [1]. Even if we knew the application logic of the web server and were to build a correct model, it would be impossible to use such a model to detect attacks within huge amounts of concurrent real traffic unless we had a mechanism to identify the pairs of HTTP requests and SQL queries that are causally generated by each HTTP request. However, within our container-based web servers, it is a straightforward matter to identify the causal pairs of web requests and resulting SQL queries in a given session. Moreover, as traffic can easily be separated by session, it becomes possible for us to compare and analyze the requests and queries across different sessions.
The DoubleGuard system puts sensors at both sides of the servers. At the web server, our sensors are deployed on the host system and cannot be attacked directly, since only the virtualized containers are exposed to attackers. Our sensors will not be attacked at the database server either, assuming that the attacker cannot completely take control of the database server. In fact, we assume that our sensors cannot be attacked and can always capture correct traffic information at both ends. Once we build the mapping model, it can be used to detect abnormal behaviors. If there exists any request or query that violates the normality model within a session, then the session will be treated as a possible attack.
In the DoubleGuard prototype, we chose to assign each user session to a different container; however, this was a design decision. For instance, we could instead assign a new container per each new client IP address. In our prototype, we used a 15-minute timeout due to resource constraints of our test server.
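The training step that the normality model depends on can be sketched as follows: for each session's paired traffic, record which query sets have been observed for each request type. The function name and the dictionary-based session format are our own simplification of the per-container traffic capture described above.

```python
def build_mapping_model(training_sessions):
    """Illustrative training sketch: each session maps a web request to the
    list of SQL queries it caused; the model accumulates, per request type,
    the pool of query sets observed across all training sessions."""
    model = {}
    for session in training_sessions:
        for request, queries in session.items():
            model.setdefault(request, set()).add(frozenset(queries))
    return model

sessions = [
    {"GET /item": ["SELECT * FROM items WHERE id = ?"]},
    {"GET /item": ["SELECT * FROM items WHERE id = ?"],
     "GET /logo.gif": []},   # static file: no database query at all
]
model = build_mapping_model(sessions)
```

A request whose pool holds a single non-empty query set is deterministic; an always-empty pool corresponds to the Empty Query Set case discussed later.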
SYSTEM DESIGN
4.1 MODULES
DOUBLE GUARD has five main modules. They are as follows:
4.1.1 Login
In the login module, the user will log in to the web server to start their process. A username and password will be provided to every user; through this username and password, the user can log in to the web server.
4.1.2 Connecting server
After logging in to the web server, the user should make a connection with the web server to get information from it. For making a connection with the web server, every user has a unique signature to denote that they are authorized to retrieve data from the web server and database server. While connecting to the web server, the signature of every user will be checked, and the connection will be made only when the signature is valid; otherwise the connection will not be made.
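The report does not specify how the per-user signature is computed, so the sketch below assumes an HMAC over the username with a server-side secret; the secret value and function names are purely illustrative.

```python
import hashlib
import hmac

SERVER_SECRET = b"illustrative-server-secret"   # assumption, not from the report

def signature_for(username):
    """Derive the user's unique signature (assumed here to be an HMAC)."""
    return hmac.new(SERVER_SECRET, username.encode(), hashlib.sha256).hexdigest()

def connect(username, presented_signature):
    """Admit the connection only when the presented signature is valid."""
    return hmac.compare_digest(signature_for(username), presented_signature)
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading characters of a forged signature were correct.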
4.1.3 Container generation
The container will be generated for each and every session in the web server. The container will provide session id for every session. The data and the information about the query processed are stored in the container.
4.1.4 Query Mapping
In this module, the user's query will be processed. The web server checks the query for authentication purposes; after the query is authenticated, the web server processes the query, retrieves the data from the database server, and provides it to the user. Due to their diverse functionality, different web applications exhibit different characteristics. Many websites serve only static content, which is updated and often managed by a Content Management System (CMS). For a static website, we can build an accurate model of the mapping relationships between web requests and database queries, since the links are static and clicking on the same link always returns the same information. However, some websites (e.g., blogs, forums) allow regular users with non-administrative privileges to update the contents of the served data. This creates tremendous challenges for IDS training because the HTTP requests can contain variables in the passed parameters.
For example, instead of one-to-one mapping, one web request to the web server usually invokes a number of SQL queries that can vary depending on type of the request and the state of the system[3]. Some requests will only retrieve data from the web server instead of invoking database queries, meaning that no queries will be generated by these web requests. In other cases, one request will invoke a number of database queries. Finally, in some cases, the web server will have some periodical tasks that trigger database queries without any web requests driving them. The challenge is to take all of these cases into account and build the normality model in such a way that we can cover all of them.
If several SQL queries are always found within one HTTP request, then we can usually have an exact mapping. However, this is not always the case. Some requests will result in different queries based on the request parameters and the state of the web server. The probabilities for these queries are usually not the same. Since the request is at the origin of the data flow, we treat each request as the mapping source. In other words, the mappings in the model are always in the form of one request to a query set.
The mapping models are
1. Deterministic Mapping
2. Empty Query Set
3. No Matched Request
4. Non Deterministic Mapping
4.1.4.1 Deterministic Mapping
This is the most common and perfectly matched pattern. That is to say, the web request rm appears in all traffic with the SQL query set Qn. For any session in the testing phase with the request rm, the absence of a query set Qn matching the request indicates a possible intrusion. On the other hand, if Qn is present in the session traffic without the corresponding rm, this may also be the sign of an intrusion. This is shown in figure 4.1. In static websites, this type of mapping comprises the majority of cases, since the same results should be returned each time a user visits the same link.
4.1.4.2 Empty Query Set
In special cases, the SQL query set may be the empty set as shown in figure 4.2. This implies that the web request neither causes nor generates any database queries. For example, when a web request for retrieving an image GIF file from the same web server is made, a mapping relationship does not exist because only the web requests are observed. During the testing phase, we keep these web requests together in the set EQS.
4.1.4.3 No Matched Request
In some cases, the web server may periodically submit queries to the database server in order to conduct scheduled tasks, such as cron jobs for archiving or backup, as shown in figure 4.3. This is not driven by any web request, and is the reverse case of the Empty Query Set mapping pattern. These queries cannot match up with any web requests, and we keep these unmatched queries in a set NMR. During the testing phase, any query within set NMR is considered legitimate. The size of NMR depends on the web server logic, but it is typically small.
4.1.4.4 Non Deterministic Mapping
The same web request may result in different SQL query sets based on input parameters or the status of the webpage at the time the web request is received. In fact, these different SQL query sets do not appear randomly, and there exists a candidate pool of query sets. As shown in figure 4.4, each time that the same type of web request arrives, it always matches up with one (and only one) of the query sets in the pool. Therefore, it is difficult to identify traffic that matches this pattern. This happens only within dynamic websites, such as blogs or forum sites.
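The four mapping patterns above can be distinguished mechanically from the pool of query sets observed for a request. The classifier below is a hedged sketch under our own representation (each pool is a set of frozensets of query skeletons); the function names are ours.

```python
def classify_mapping(query_set_pool):
    """Classify a request's observed pool of query sets into one of the
    mapping patterns described above (illustrative sketch)."""
    if len(query_set_pool) == 1:
        (only_set,) = query_set_pool
        # One always-empty query set: the request never touches the DB.
        return "Empty Query Set" if not only_set else "Deterministic Mapping"
    # Several candidate query sets for the same request type.
    return "Non Deterministic Mapping"

def classify_query(query, mapped_queries):
    """'No Matched Request' is keyed by query, not request: it covers
    queries (e.g., cron-driven backups) seen with no web request at all."""
    return "mapped" if query in mapped_queries else "No Matched Request"

pattern = classify_mapping({frozenset({"SELECT * FROM items WHERE id = ?"})})
```

Deterministic and Empty Query Set mappings dominate on static sites, while the multi-set branch corresponds to the dynamic-site case just described.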
4.1.5 Attack Detection
There are a number of attacks performed by an attacker to retrieve data from the web server or directly from the database. The attacks performed by the attacker are:
1. Privilege Escalation Attack
2. Hijack Future Session Attack
3. Injection Attack
4. Direct DB Attack
These attacks will be detected and controlled by using the detection algorithm. In this algorithm, the structure of the query, the session ID, the session time and the user ID will be compared with the information stored in the database and the web server; the query will be processed only when every condition is satisfied, otherwise the query will be rejected. Once the model is built, it can be used to detect malicious sessions.
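The model-checking part of the detection step can be sketched as follows. This is a simplified, hypothetical rendering of the comparison described above, restricted to the mapping model; the argument names and alert tuples are our own, and the session-ID/time/user-ID checks are omitted for brevity.

```python
def detect_session(pairs, unmatched_queries, model, eqs, nmr):
    """Flag any request/query in one session's traffic that violates the
    trained mapping model (hedged sketch of the detection step)."""
    alerts = []
    for request, query_set in pairs.items():
        if request in eqs:
            if query_set:                       # EQS requests must stay query-free
                alerts.append(("unexpected queries", request))
        elif request not in model:
            alerts.append(("unknown request", request))
        elif frozenset(query_set) not in model[request]:
            alerts.append(("unmatched query set", request))
    for query in unmatched_queries:
        if query not in nmr:                    # only NMR queries may stand alone
            alerts.append(("no matched request", query))
    return alerts

model = {"GET /item": {frozenset({"SELECT * FROM items WHERE id = ?"})}}
ok  = detect_session({"GET /item": ["SELECT * FROM items WHERE id = ?"]},
                     [], model, set(), set())
bad = detect_session({"GET /item": ["DROP TABLE items"]},
                     [], model, set(), set())
```

A non-empty alert list marks the whole session as tainted, matching the session-level treatment described earlier.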
4.1.5.1 Privilege Escalation Attack
Let’s assume that the website serves both regular users and administrators. As shown in figure 4.5, for a regular user, the web request ‘ru’ will trigger the set of SQL queries Qu; for an administrator, the request ‘ra’ will trigger the set of admin level queries Qa. Now suppose that an attacker logs into the webserver as a normal user, upgrades his/her privileges, and triggers admin queries so as to obtain an administrator’s data. This attack can never be detected by either the webserver IDS or the database IDS since both ru and Qa are legitimate requests and queries. Our approach, however, can detect this type of attack since the DB query Qa does not match the request ru, according to our mapping model.
For Privilege Escalation Attacks, according to our previous discussion, the attacker visits the website as a normal user, aiming to compromise the web server process or exploit vulnerabilities to bypass authentication. At that point, the attacker issues a set of privileged (e.g., admin-level) DB queries to retrieve sensitive information. We log and process both legitimate web requests and database queries in the session traffic, but there are no mappings among them. IDSs working at either end can hardly detect this attack, since the traffic they capture appears to be legitimate. However, DoubleGuard separates the traffic by sessions. If it is a user session, then the requests and queries should all belong to normal users and match structurally. Using the mapping model, DoubleGuard can capture the unmatched cases.
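The ru/Qa mismatch above can be shown concretely. The queries below are invented placeholders standing in for Qu and Qa; the point is only that each piece of traffic is individually legitimate, yet the pairing violates the mapping model.

```python
# Hypothetical mapping model: the normal-user request ru maps to the
# user-level query set Qu, and the admin request ra maps to Qa.
model = {
    "ru": {frozenset({"SELECT * FROM posts WHERE owner = ?"})},  # Qu
    "ra": {frozenset({"SELECT * FROM users"})},                  # Qa
}

# Observed session: a normal-user request followed by admin-level queries.
observed_request = "ru"
observed_queries = frozenset({"SELECT * FROM users"})            # Qa, not Qu

# Neither a web IDS nor a database IDS alone sees anything abnormal here;
# the mapping model flags the pairing because Qa is not in model["ru"].
is_attack = observed_queries not in model[observed_request]
```

This is exactly why per-session traffic separation matters: without it, ru and Qa would be lost in the intermixed traffic and could never be paired.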
4.1.5.2 Hijack Future Session Attack
This class of attacks is mainly aimed at the web server side. An attacker usually takes over the web server and then hijacks all subsequent legitimate user sessions to launch attacks. For instance, by hijacking other user sessions, the attacker can eavesdrop, send spoofed replies, and/or drop user requests. A session-hijacking attack can be further categorized as a Spoofing/Man-in-the-Middle attack, an Exfiltration Attack, a Denial-of-Service/Packet Drop attack, or a Replay attack. This is shown in figure 4.6.
According to the mapping model, the web request should invoke certain database queries; when it does not, the abnormal situation can be detected. However, neither a conventional web server IDS nor a database IDS can detect such an attack by itself. Fortunately, the isolation property of our container-based web server architecture can also prevent this type of attack. As each user's web requests are isolated into a separate container, an attacker can never break into other users' sessions.
Out of the four classes of attacks we discuss, session hijacking is the most common, as there are many examples that exploit the vulnerabilities of Apache, IIS, PHP, ASP, and CGI, to name a few. Most of these attacks manipulate the HTTP requests to take over the web server. Here, we point out that most of these attacks are unsuccessful, and DoubleGuard captured these attacks mainly because of the abnormal HTTP requests. DoubleGuard can generate two classes of alerts. One class of alerts is generated by sessions whose traffic does not match the mapping model with abnormal database queries. The second class of alerts is triggered by sessions whose traffic violates the mapping model but only in regard to abnormal HTTP requests; there is no resulting database query. Most unsuccessful attacks, including 404 errors with no resulting database query, will trigger the second type of alert. When the number of alerts becomes overwhelming, users can choose to filter the second type of alerts because they do not have any impact on the back-end database.
4.1.5.3 Injection Attack
Attacks such as SQL injection do not require compromising the web server. Attackers can use existing vulnerabilities in the web server logic to inject data or string content that contains the exploits, and then use the web server to relay these exploits to attack the back-end database. Since our approach provides a two-tier detection, even if the exploits are accepted by the web server, the relayed contents to the DB server would not be able to take on the expected structure for the given web server request, as shown in figure 4.7 [7]. For instance, since the SQL injection attack changes the structure of the SQL queries, even if the injected data were to go through the web server side, it would generate SQL queries in a different structure that could be detected as a deviation from the SQL query structure that would normally follow such a web request.
4.1.5.4 Direct DB Attack
It is possible for an attacker to bypass the web server or firewalls and connect directly to the database. An attacker could also have already taken over the web server and be submitting such queries from the web server without sending web requests, as shown in figure 4.8. Without matched web requests for such queries, a web server IDS could not detect them. Furthermore, if these DB queries were within the set of allowed queries, then the database IDS itself would not detect them either. However, this type of attack can be caught with our approach, since we cannot match any web requests with these queries.