Privacy Preserving Data Mining PPT

**project girl** · 03-12-2012, 05:42 PM

Privacy Preserving Data Mining

Difference between security and privacy

Data security, according to the common definition, is the “confidentiality, integrity and availability” of data.
Privacy, on the other hand, is the appropriate use of information.

Data Mining

Data mining is a recently emerging field, connecting the three worlds of Databases, Artificial Intelligence and Statistics.
The information age has enabled many organizations to gather large volumes of data. However, the usefulness of this data is negligible if “meaningful information” or “knowledge” cannot be extracted from it.
Data mining, otherwise known as knowledge discovery, attempts to answer this need.

Privacy Preserving data mining

Privacy preserving data mining has become increasingly popular because it allows sharing of privacy-sensitive data for analysis purposes. Without such protection, people have become increasingly unwilling to share their data, frequently resulting in individuals either refusing to share their data or providing incorrect data.
In recent years, privacy preserving data mining has been studied extensively, because of the wide proliferation of sensitive information on the internet.
The problem of privacy-preserving data mining has become more important in recent years because of the increasing ability to store personal data about users, and the increasing sophistication of data mining algorithms to leverage this information.

Method of anonymization

When releasing micro data for research purposes, one needs to limit disclosure risks to an acceptable level while maximizing data utility.
To limit disclosure risk, Samarati and Sweeney introduced the k-anonymity privacy requirement, which requires each record in an anonymized table to be indistinguishable from at least k-1 other records within the dataset, with respect to a set of quasi-identifier attributes.
To achieve the k-anonymity requirement, they used both generalization and suppression for data anonymization.
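
As a rough illustration of generalization and suppression, here is a minimal Python sketch. The records, the attribute names (`age`, `zip`, `disease`), and the generalization rules (10-year age bands, 3-digit ZIP prefixes) are all made up for this example and are not taken from the presentation.

```python
from collections import Counter

def generalize(record):
    """Coarsen the quasi-identifiers: age to a 10-year band, ZIP to a 3-digit prefix."""
    low = (record["age"] // 10) * 10
    return (f"{low}-{low + 9}", record["zip"][:3] + "**")

def anonymize(records, k):
    """Generalize every record, then suppress records whose group has fewer than k members."""
    keys = [generalize(r) for r in records]
    sizes = Counter(keys)
    released = []
    for rec, key in zip(records, keys):
        if sizes[key] >= k:  # records in smaller groups are suppressed (dropped)
            released.append({"age": key[0], "zip": key[1], "disease": rec["disease"]})
    return released

def is_k_anonymous(table, k):
    """True if every quasi-identifier combination occurs at least k times."""
    sizes = Counter((row["age"], row["zip"]) for row in table)
    return all(n >= k for n in sizes.values())

# Hypothetical micro data; "disease" is the sensitive attribute.
records = [
    {"age": 23, "zip": "47677", "disease": "flu"},
    {"age": 27, "zip": "47602", "disease": "cancer"},
    {"age": 29, "zip": "47678", "disease": "flu"},
    {"age": 41, "zip": "47905", "disease": "asthma"},
    {"age": 45, "zip": "47909", "disease": "flu"},
    {"age": 62, "zip": "47302", "disease": "cancer"},
]
released = anonymize(records, k=2)
```

In this sketch the single record in the 60-69 band has no partner and is suppressed, so the released table has five rows and satisfies 2-anonymity but not 3-anonymity.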

ANONYMIZATION TECHNIQUE

Merits:

This method protects respondents' identities while releasing truthful information. k-anonymity protects against identity disclosure.

Demerits:

k-anonymity does not provide sufficient protection against attribute disclosure. Two attacks exploit this: the homogeneity attack and the background knowledge attack. The limitations of the k-anonymity model stem from two assumptions. First, it may be very hard for the owner of a database to determine which of the attributes are or are not available in external tables.
Second, the k-anonymity model assumes a certain method of attack, while in real scenarios there is no reason why the attacker should not try other methods.
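
The homogeneity attack can be shown with a small Python sketch. The table and its attribute names are hypothetical. The table is 3-anonymous, yet one group shares a single sensitive value, so knowing a person's quasi-identifiers is enough to learn their diagnosis.

```python
from collections import defaultdict

def homogeneous_groups(table, quasi_identifiers, sensitive):
    """Return the quasi-identifier groups in which every record has the same sensitive value."""
    groups = defaultdict(set)
    for row in table:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].add(row[sensitive])
    return [key for key, values in groups.items() if len(values) == 1]

# Hypothetical 3-anonymous release: each (age, zip) combination occurs three times.
table = [
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "asthma"},
    {"age": "40-49", "zip": "479**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
]
leaky = homogeneous_groups(table, ["age", "zip"], "disease")
```

Here `leaky` contains only the group `("20-29", "476**")`: anyone known to fall in that group must have the flu, even though no individual record can be singled out.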

Perturbation approach

The perturbation approach works under the requirement that the data service is not allowed to learn or recover precise records. This restriction naturally leads to some challenges. Since the method does not reconstruct the original data values but only their distributions, new algorithms need to be developed that use these reconstructed distributions to perform mining of the underlying data.
This means that for each individual data mining problem, such as classification, clustering, or association rule mining, a new distribution-based data mining algorithm needs to be developed.
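
A simplified Python sketch of the idea follows. It assumes additive uniform noise with a publicly known range, and it recovers only the mean and variance of the original values rather than the full distribution, so it is a much weaker stand-in for a real reconstruction algorithm. The data and parameters are invented for illustration.

```python
import random
import statistics

rng = random.Random(0)           # fixed seed so the sketch is repeatable
NOISE_HALF_WIDTH = 20.0          # assumed noise: Uniform(-20, 20), known to the miner
NOISE_VARIANCE = (2 * NOISE_HALF_WIDTH) ** 2 / 12  # variance of that uniform noise

def perturb(value):
    """Each data owner adds independent noise before releasing a value."""
    return value + rng.uniform(-NOISE_HALF_WIDTH, NOISE_HALF_WIDTH)

# Hypothetical private values (for example, ages); never seen by the miner.
original = [rng.gauss(40.0, 10.0) for _ in range(20000)]
perturbed = [perturb(x) for x in original]

# The miner sees only `perturbed` plus the noise parameters.
# Zero-mean noise leaves the mean unchanged; independent noise adds its variance.
est_mean = statistics.fmean(perturbed)
est_variance = statistics.pvariance(perturbed) - NOISE_VARIANCE
```

Individual perturbed values can be off by up to 20 units, but the aggregate estimates should land close to the statistics of the original data, which is why mining has to operate on distributions rather than on records.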

Condensation approach

The condensation approach constructs constrained clusters in the data set and then generates pseudo-data from the statistics of these clusters. The technique is called condensation because it uses condensed statistics of the clusters to generate pseudo-data.
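
The following Python sketch gives a deliberately simplified picture of condensation. It assumes groups are formed by sorting and chunking records into groups of at least k, and that pseudo-data is drawn independently per attribute from a Gaussian with the group's mean and standard deviation. Published condensation methods use richer cluster statistics, and the records here are hypothetical.

```python
import random
import statistics

def condense(records, k):
    """Group sorted records into clusters of at least k and keep only per-cluster statistics."""
    ordered = sorted(records)
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()          # fold an undersized last group into its neighbour
        groups[-1].extend(tail)
    stats = []
    for group in groups:
        columns = list(zip(*group))  # one tuple of values per attribute
        stats.append({
            "size": len(group),
            "mean": [statistics.fmean(c) for c in columns],
            "stdev": [statistics.pstdev(c) for c in columns],
        })
    return stats

def generate_pseudo_data(stats, rng):
    """Draw as many pseudo-records per cluster as it originally held."""
    pseudo = []
    for s in stats:
        for _ in range(s["size"]):
            pseudo.append(tuple(rng.gauss(m, sd) for m, sd in zip(s["mean"], s["stdev"])))
    return pseudo

# Hypothetical (age, income) records.
records = [(23, 31000.0), (25, 29000.0), (27, 35000.0), (41, 52000.0),
           (44, 58000.0), (46, 61000.0), (63, 47000.0)]
stats = condense(records, k=3)
pseudo = generate_pseudo_data(stats, random.Random(0))
```

Only `stats` needs to be retained; the released `pseudo` records follow the per-cluster statistics but do not correspond to any real individual.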

Distributed Privacy Preserving Data Mining

The key goal in most distributed methods for privacy-preserving data mining (PPDM) is to allow computation of useful aggregate statistics over the entire data set without compromising the privacy of the individual data sets held by the different participants. Thus, the participants may wish to collaborate in obtaining aggregate results, but may not fully trust each other when it comes to sharing their own data sets. For this purpose, the data sets may be either horizontally or vertically partitioned. In horizontally partitioned data sets, the individual records are spread out across multiple entities, each of which has the same set of attributes. In vertically partitioned data sets, the individual entities may have different attributes (or views) of the same set of records.
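
The presentation does not name a specific protocol. As one possible illustration of computing an aggregate over horizontally partitioned data, the Python sketch below uses a secure sum built on additive secret sharing. It assumes honest-but-curious parties and non-negative values below the modulus; the function names and the counts are hypothetical.

```python
import secrets

MODULUS = 2 ** 61 - 1  # all arithmetic is done modulo this value (an assumption of the sketch)

def make_shares(value, n_parties):
    """Split a value into n random-looking shares that add up to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(local_values):
    """Simulate the protocol: each party shares its value, then only partial sums are published."""
    n = len(local_values)
    received = [[] for _ in range(n)]           # received[j] holds the shares sent to party j
    for value in local_values:
        for j, share in enumerate(make_shares(value, n)):
            received[j].append(share)
    partial_sums = [sum(shares) % MODULUS for shares in received]
    return sum(partial_sums) % MODULUS          # equals the total of all local values

# Hypothetical local counts held by three sites with the same schema.
local_counts = [120, 75, 310]
total = secure_sum(local_counts)
```

Each party sees only random shares and partial sums, so the total of 505 is obtained without any site revealing its own count. A vertically partitioned setting would need a different protocol, since the parties hold different attributes of the same records.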
