06-05-2012, 07:12 PM
I want the full document for "Anonymous Publication of Sensitive Transactional Data" for my review.
ABSTRACT
Existing research on privacy-preserving data publishing focuses on relational data. Existing techniques work well for fixed-schema data with low dimensionality. However, certain applications require privacy-preserving publishing of transactional data, which involve hundreds or even thousands of dimensions, rendering existing methods unusable. The objective is to enforce privacy-preserving paradigms, such as k-anonymity and l-diversity, while minimizing the information loss incurred in the anonymization process (i.e., maximizing data utility). We propose two categories of novel anonymization methods for sparse high-dimensional data. The first category is based on approximate nearest-neighbor (NN) search in high-dimensional spaces, which is efficiently performed through locality-sensitive hashing (LSH). In the second category, we propose two data transformations that capture the correlation in the underlying data: (i) reduction to a band matrix and (ii) Gray encoding-based sorting. Of the proposed methods, the data transformation based on Gray code sorting performs best in terms of both data utility and execution time.
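To make the Gray encoding-based sorting idea more concrete, here is a minimal Python sketch. It is not the paper's implementation. It assumes each transaction is a 0/1 vector over a fixed item order, treats that vector as a Gray codeword, and sorts transactions by the position of that codeword in the reflected Gray sequence, so that neighbors in the sorted order tend to differ in few items. The names gray_rank, gray_sort and group_consecutive are hypothetical, and the fixed-size grouping at the end is only a placeholder: it ignores sensitive items and the diversity requirement that the abstract mentions.

```python
from typing import List, Sequence, Tuple

Transaction = Tuple[int, ...]  # assumed encoding: 1 if the item is present, else 0


def gray_rank(bits: Sequence[int]) -> int:
    """Position of a 0/1 vector in the reflected Gray code sequence."""
    # Pack the bits into an integer, first item = most significant bit.
    g = 0
    for b in bits:
        g = (g << 1) | (1 if b else 0)
    # Standard Gray-to-binary decoding: XOR of all right shifts of g.
    rank = 0
    while g:
        rank ^= g
        g >>= 1
    return rank


def gray_sort(transactions: Sequence[Transaction]) -> List[Transaction]:
    """Sort transactions by their Gray code rank."""
    return sorted(transactions, key=gray_rank)


def group_consecutive(rows: Sequence[Transaction], k: int) -> List[List[Transaction]]:
    """Placeholder grouping (an assumption, not the paper's heuristic):
    cut the sorted list into runs of k rows, merging a short tail
    into the previous group so every group has at least k rows."""
    groups = [list(rows[i:i + k]) for i in range(0, len(rows), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


if __name__ == "__main__":
    data = [(1, 0, 0), (0, 1, 1), (0, 0, 1), (1, 1, 1)]
    ordered = gray_sort(data)
    print(ordered)
    print(group_consecutive(ordered, 2))
```

With three items, the vectors 000, 001, 011, 010, 110, 111, 101, 100 get ranks 0 through 7, so transactions that are adjacent after sorting often differ in a single item. That closeness is what makes groups of consecutive rows cheap to anonymize in terms of information loss.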