23-01-2012, 02:58 PM
Data Leakage Detection
INTRODUCTION
In the course of doing business, sometimes sensitive data must be handed over to supposedly trusted third parties. For example, a hospital may give patient records to researchers who will devise new treatments. Similarly, a company may have partnerships with other companies that require sharing customer data. Another enterprise may outsource its data processing, so data must be given to various other companies.
PROBLEM SETUP AND NOTATION
2.1 Entities and Agents
A distributor owns a set $T = \{t_1, \ldots, t_m\}$ of valuable data objects. The distributor wants to share some of the objects with a set of agents $U_1, U_2, \ldots, U_n$, but does not wish the objects be leaked to other third parties. The objects in T could be of any type and size, e.g., they could be tuples in a relation, or relations in a database.
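The setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names Agent, distribute and holders are hypothetical, and so is the idea of recording the subset of T that each agent received (the text so far only says that "some of the objects" are shared).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """An agent U_i together with the objects it was given (assumed bookkeeping)."""
    name: str
    received: frozenset


def distribute(T, requests):
    """Hand each agent its requested objects, refusing anything outside T."""
    agents = []
    for name, wanted in requests.items():
        wanted = frozenset(wanted)
        unknown = wanted - T
        if unknown:
            raise ValueError(f"{name} requested objects not in T: {sorted(unknown)}")
        agents.append(Agent(name, wanted))
    return agents


def holders(agents, obj):
    """Names of the agents that were given a particular object."""
    return sorted(a.name for a in agents if obj in a.received)


# Hypothetical example: three objects shared between two agents.
T = frozenset({"t1", "t2", "t3"})
agents = distribute(T, {"U1": {"t1", "t2"}, "U2": {"t2", "t3"}})
```

Here holders(agents, "t2") lists both U1 and U2, while holders(agents, "t1") lists only U1. Keeping a per-agent record of this kind is what lets the distributor later ask which agents could have been the source of a given object.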
RELATED WORK
The guilt detection approach we present is related to the data provenance problem [3]: tracing the lineage of S objects implies essentially the detection of the guilty agents. Tutorial [4] provides a good overview of the research conducted in this field.
AGENT GUILT MODEL
To compute this $Pr\{G_i \mid S\}$, we need an estimate for the probability that values in S can be “guessed” by the target. For instance, say that some of the objects in S are e-mails of individuals. We can conduct an experiment and ask a person with approximately the expertise and resources of the target to find the e-mail of, say, 100 individuals