Data Leakage Detection simple report

**seminar addict** · 23-01-2012, 02:58 PM

Data Leakage Detection

.doc

Data_Leakage_Detection.doc (Size: 2 MB / Downloads: 443)

INTRODUCTION
In the course of doing business, sometimes sensitive data must be handed over to supposedly trusted third parties. For example, a hospital may give patient records to researchers who will devise new treatments. Similarly, a company may have partnerships with other companies that require sharing customer data. Another enterprise may outsource its data processing, so data must be given to various other companies.

PROBLEM SETUP AND NOTATION
2.1 Entities and Agents
A distributor owns a set $T = \{t_1, \ldots, t_m\}$ of valuable data objects. The distributor wants to share some of the objects with a set of agents $U_1, U_2, \ldots, U_n$, but does not wish the objects to be leaked to other third parties. The objects in $T$ could be of any type and size, e.g., they could be tuples in a relation, or relations in a database.

RELATED WORK
The guilt detection approach we present is related to the data provenance problem [3]: tracing the lineage of S objects implies essentially the detection of the guilty agents. Tutorial [4] provides a good overview on the research conducted in this field.

AGENT GUILT MODEL
To compute this $\Pr\{G_i \mid S\}$, we need an estimate for the probability that values in S can be “guessed” by the target. For instance, say that some of the objects in S are e-mails of individuals. We can conduct an experiment and ask a person with approximately the expertise and resources of the target to find the e-mail of, say, 100 individuals.

**seminar addict** · 03-02-2012, 11:07 AM

For information about the topic DATA LEAKAGE DETECTION (full report, PPT, and related topics), refer to the links below:

https://seminarproject.net/Thread-data-l...ion--22106

https://seminarproject.net/Thread-data-l...ple-report

https://seminarproject.net/Thread-digita...-detection

divyaanil · 30-05-2012, 04:38 PM

I WANT THE FULL PROJECT REPORT ON DATA LEAKAGE DETECTION

**seminar ideas** · 01-08-2012, 01:58 PM

Data Leakage Detection simple report

.ppt

Data Leakage Detection (1).ppt (Size: 1.16 MB / Downloads: 99)

Introduction

In the course of doing business, sometimes sensitive data must be handed over to supposedly trusted third parties.
Our goal is to detect when the distributor's sensitive data has been leaked by agents, and if possible to identify the agent that leaked the data.

Types of employees that put your company at risk

- The security illiterate
  - The majority of employees, with little or no knowledge of security
  - A corporate risk because of accidental breaches
- The gadget nerds
  - Introduce a variety of devices to their work PCs
  - Download software
- The unlawful residents
  - Use the company IT resources in ways they shouldn't, e.g., by storing music or movies, or playing games
- The malicious/disgruntled employees
  - Typically a minority of employees
  - Gain access to areas of the IT system that they shouldn't
  - Send corporate data (e.g., customer lists, R&D, etc.) to third parties

Issues

We develop a model for assessing the “guilt” of agents. We also present algorithms for distributing objects to agents, in a way that improves our chances of identifying a leaker.

Finally, we also consider the option of adding “fake” objects to the distributed set.

Such objects do not correspond to real entities but appear realistic to the agents.

In a sense, the fake objects act as a type of watermark for the entire set, without modifying any individual members. If it turns out that an agent was given one or more fake objects that were leaked, then the distributor can be more confident that the agent was guilty.
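The fake-object idea can be illustrated with a minimal Python sketch. It assumes the distributor keeps a record of which fake objects went to which agent; the function name `agents_implicated_by_fakes`, the agent labels, and the object identifiers are all hypothetical and chosen only for illustration.

```python
def agents_implicated_by_fakes(fake_assignments, leaked):
    """Map each agent to the fake objects of theirs that appear in the leak.

    fake_assignments: dict of agent name -> set of fake object ids given
                      to that agent (hypothetical bookkeeping structure).
    leaked:           set of object ids found in the unauthorized place.
    Agents with no fake objects in the leak are omitted from the result.
    """
    hits = {}
    for agent, fakes in fake_assignments.items():
        found = fakes & leaked  # fake objects of this agent seen in the leak
        if found:
            hits[agent] = found
    return hits


# Illustrative data: each agent received distinct fake objects.
fake_assignments = {
    "U1": {"fake-001"},
    "U2": {"fake-002", "fake-003"},
}
leaked = {"t1", "t7", "fake-002"}

implicated = agents_implicated_by_fakes(fake_assignments, leaked)
print(implicated)
```

Because a fake object corresponds to no real entity, it is unlikely to have been gathered independently, so its presence in the leak points at the agent who received it.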

Optimization:
In the Optimization Module, the distributor’s data allocation to agents has one constraint and one objective. The distributor’s constraint is to satisfy agents’ requests, by providing them with the number of objects they request or with all available objects that satisfy their conditions. His objective is to be able to detect an agent who leaks any portion of his data.

**project girl** · 16-11-2012, 12:06 PM

Data Leakage Detection

.doc

Data Leakage.doc (Size: 2 MB / Downloads: 55)

Abstract

We study the following problem: A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). Some of the data are leaked and found in an unauthorized place (e.g., on the web or somebody’s laptop). The distributor must assess the likelihood that the leaked data came from one or more agents, as opposed to having been independently gathered by other means. We propose data allocation strategies (across the agents) that improve the probability of identifying leakages. These methods do not rely on alterations of the released data (e.g., watermarks). In some cases, we can also inject “realistic but fake” data records to further improve our chances of detecting leakage and identifying the guilty party.

INTRODUCTION

In the course of doing business, sometimes sensitive data must be handed over to supposedly trusted third parties. For example, a hospital may give patient records to researchers who will devise new treatments. Similarly, a company may have partnerships with other companies that require sharing customer data. Another enterprise may outsource its data processing, so data must be given to various other companies. We call the owner of the data the distributor and the supposedly trusted third parties the agents. Our goal is to detect when the distributor’s sensitive data have been leaked by agents, and if possible to identify the agent that leaked the data.
We consider applications where the original sensitive data cannot be perturbed. Perturbation is a very useful technique where the data are modified and made “less sensitive” before being handed to agents. For example, one can add random noise to certain attributes, or one can replace exact values by ranges [18]. However, in some cases, it is important not to alter the original distributor’s data. For example, if an outsourcer is doing our payroll, he must have the exact salary and customer bank account numbers. If medical researchers will be treating patients (as opposed to simply computing statistics), they may need accurate data for the patients.

RELATED WORK

The guilt detection approach we present is related to the data provenance problem [3]: tracing the lineage of S objects implies essentially the detection of the guilty agents. Tutorial [4] provides a good overview on the research conducted in this field. Suggested solutions are domain specific, such as lineage tracing for data warehouses [5], and assume some prior knowledge on the way a data view is created out of data sources. Our problem formulation with objects and sets is more general and simplifies lineage tracing, since we do not consider any data transformation from $R_i$ sets to S.
As far as the data allocation strategies are concerned, our work is mostly relevant to watermarking that is used as a means of establishing original ownership of distributed objects. Watermarks were initially used in images [16], video [8], and audio data [6] whose digital representation includes considerable redundancy. Recently, [1], [17], [10], [7], and other works have also studied marks insertion to relational data. Our approach and watermarking are similar in the sense of providing agents with some kind of receiver identifying information. However, by its very nature, a watermark modifies the item being watermarked. If the object to be watermarked cannot be modified, then a watermark cannot be inserted. In such cases, methods that attach watermarks to the distributed data are not applicable.

AGENT GUILT MODEL

To compute this $\Pr\{G_i \mid S\}$, we need an estimate for the probability that values in S can be “guessed” by the target. For instance, say that some of the objects in S are e-mails of individuals. We can conduct an experiment and ask a person with approximately the expertise and resources of the target to find the e-mail of, say, 100 individuals. If this person can find, say, 90 e-mails, then we can reasonably guess that the probability of finding one e-mail is 0.9. On the other hand, if the objects in question are bank account numbers, the person may only discover, say, 20, leading to an estimate of 0.2. We call this estimate $p_t$, the probability that object t can be guessed by the target.
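The estimate described above is a simple success ratio, and the Python sketch below shows it together with one possible way of turning it into a guilt score. The guilt function is only an illustration under simplifying assumptions made here (objects are independent, a single guess probability p applies to all objects, and a leaked object that was not guessed came from exactly one of the agents holding it, each equally likely). The excerpt does not give the formula, so this should not be read as the authors' exact model. All function and variable names are hypothetical.

```python
def estimate_guess_probability(found, attempted):
    """Estimate p_t as the fraction of objects the tester managed to find."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return found / attempted


def guilt_probability(agent, leaked, allocations, p):
    """Illustrative guilt score for one agent (assumptions in the lead-in).

    allocations: dict of agent name -> set of objects given to that agent.
    leaked:      set of leaked objects.
    p:           probability that any single object was guessed.
    """
    prob_not_leaker = 1.0
    for t in leaked & allocations[agent]:
        # Number of agents that received object t (at least 1: this agent).
        holders = sum(1 for objs in allocations.values() if t in objs)
        # Chance that this agent is NOT the source of object t.
        prob_not_leaker *= 1.0 - (1.0 - p) / holders
    return 1.0 - prob_not_leaker


# The two experiments from the text: 90 of 100 e-mails, 20 of 100 accounts.
p_email = estimate_guess_probability(90, 100)
p_bank = estimate_guess_probability(20, 100)

allocations = {"U1": {"t1", "t2"}, "U2": {"t1"}}
leaked = {"t1", "t2"}
guilt_u1 = guilt_probability("U1", leaked, allocations, p_bank)
guilt_u2 = guilt_probability("U2", leaked, allocations, p_bank)
print(round(guilt_u1, 2), round(guilt_u2, 2))
```

With hard-to-guess objects (small p), the agent who holds every leaked object, including one that nobody else received, ends up with the higher score.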

DATA ALLOCATION PROBLEM

The main focus of this paper is the data allocation problem: how can the distributor “intelligently” give data to agents in order to improve the chances of detecting a guilty agent? As illustrated in Fig. 2, there are four instances of this problem we address, depending on the type of data requests made by agents and whether “fake objects” are allowed.
The two types of requests we handle were defined in Section 2: sample and explicit. Fake objects are objects generated by the distributor that are not in set T. The objects are designed to look like real objects, and are distributed to agents together with T objects, in order to increase the chances of detecting agents that leak data. We discuss fake objects in more detail in Section 6.1.

Optimization Problem

The distributor’s data allocation to agents has one constraint and one objective. The distributor’s constraint is to satisfy agents’ requests, by providing them with the number of objects they request or with all available objects that satisfy their conditions. His objective is to be able to detect an agent who leaks any portion of his data.
We consider the constraint as strict. The distributor may not deny serving an agent request as in [13] and may not provide agents with different perturbed versions of the same objects as in [1]. We consider fake object distribution as the only possible constraint relaxation.
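The strict constraint can be expressed as a check on a proposed allocation. The Python sketch below is a minimal illustration that ignores fake objects; it assumes a sample request is a requested count and an explicit request is a predicate over objects, and the function names are hypothetical.

```python
def satisfies_sample_request(allocated, requested_count, available):
    """Sample request: the agent gets the requested number of objects
    (or every available object if fewer exist), all drawn from `available`."""
    expected = min(requested_count, len(available))
    return allocated <= available and len(allocated) == expected


def satisfies_explicit_request(allocated, condition, available):
    """Explicit request: the agent gets all available objects that
    satisfy the condition, and nothing else."""
    return allocated == {t for t in available if condition(t)}


# Illustrative data: objects are just integers 0..9.
available = set(range(10))

def is_even(t):
    return t % 2 == 0

print(satisfies_sample_request({1, 4, 7}, 3, available))
print(satisfies_explicit_request({0, 2, 4, 6, 8}, is_even, available))
print(satisfies_explicit_request({0, 2, 4}, is_even, available))
```

An allocation algorithm is then free to choose which objects to hand out, as long as every agent's check passes.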

Our detection objective is ideal and intractable. Detection would be assured only if the distributor gave no data object to any agent (Mungamuru and Garcia-Molina [11] discuss that to attain “perfect” privacy and security, we have to sacrifice utility). We use instead the following objective: maximize the chances of detecting a guilty agent that leaks all his data objects.

Approximate Sum-Objective Minimization

The last example showed that we can minimize the sum-objective, and therefore, increase the chances of detecting a guilty agent, on average, by providing agents who have small requests with the objects shared among the fewest agents. This way, we improve our chances of detecting guilty agents with small data requests, at the expense of reducing our chances of detecting guilty agents with large data requests. However, this expense is small, since the probability to detect a guilty agent with many objects is less affected by the fact that other agents have also received his data (see Section 5.2). In [14], we provide an algorithm that implements this intuition and we denote it by s-sum. Although we evaluate this algorithm in Section 8, we do not present the pseudocode here due to the space limitations.
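Since the pseudocode for s-sum is not reproduced here, the following Python sketch is not that algorithm. It is only a simple greedy heuristic, written for illustration, that follows the stated intuition: serve agents with small sample requests first, and give each agent the objects currently shared among the fewest agents. The function name and the example data are hypothetical.

```python
def greedy_sample_allocation(objects, requests):
    """Greedy illustration of the 'fewest sharers first' intuition.

    objects:  list of object ids available for distribution.
    requests: dict of agent name -> number of objects requested.
    Returns a dict of agent name -> set of allocated objects.
    """
    share_count = {t: 0 for t in objects}  # how many agents hold each object
    allocation = {}
    # Serve the smallest requests first; break ties by agent name.
    for agent in sorted(requests, key=lambda a: (requests[a], a)):
        m = min(requests[agent], len(objects))
        # Prefer the objects that are currently held by the fewest agents.
        chosen = sorted(objects, key=lambda t: (share_count[t], t))[:m]
        for t in chosen:
            share_count[t] += 1
        allocation[agent] = set(chosen)
    return allocation


objects = ["t1", "t2", "t3"]
requests = {"U1": 2, "U2": 1, "U3": 1}
allocation = greedy_sample_allocation(objects, requests)
print(allocation)
```

In this small example the two single-object agents receive different objects, so a leak of either object narrows the candidates more than it would if both had received the same one.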

CONCLUSIONS

In a perfect world, there would be no need to hand over sensitive data to agents that may unknowingly or maliciously leak it. And even if we had to hand over sensitive data, in a perfect world, we could watermark each object so that we could trace its origins with absolute certainty. However, in many cases, we must indeed work with agents that may not be 100 percent trusted, and we may not be certain if a leaked object came from an agent or from some other source, since certain data cannot admit watermarks.
In spite of these difficulties, we have shown that it is possible to assess the likelihood that an agent is responsible for a leak, based on the overlap of his data with the leaked data and the data of other agents, and based on the probability that objects can be “guessed” by other means. Our model is relatively simple, but we believe that it captures the essential trade-offs. The algorithms we have presented implement a variety of data distribution strategies that can improve the distributor’s chances of identifying a leaker. We have shown that distributing objects judiciously can make a significant difference in identifying guilty agents, especially in cases where there is large overlap in the data that agents must receive.

23-03-2013, 09:16 AM

plz send me d report of data leakage detection
