Data Mining: Characterization ppt

**study tips** · 28-06-2013, 02:13 PM

What is Concept Description?

Descriptive vs. predictive data mining
Descriptive mining: describes concepts or task-relevant data sets in concise, summarized, informative, and discriminative forms
Predictive mining: constructs models from the data and its analysis, and uses them to predict trends and properties of unknown data
Concept description:
Characterization: provides a concise summary of a given collection of data
Comparison: provides descriptions that compare two or more collections of data

Attribute-Oriented Induction

Proposed in 1989 (KDD ‘89 workshop)
Not confined to categorical data or to particular measures
How is it done?
Collect the task-relevant data (the initial relation) using a relational database query
Perform generalization by attribute removal or attribute generalization
Apply aggregation by merging identical generalized tuples and accumulating their counts
Present the results interactively to the user
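
The generalization and aggregation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation, the concept hierarchies, and the function name attribute_oriented_induction are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical initial relation (name, major, birth_place). It stands in for
# the result of a relational query that collects the task-relevant data.
initial_relation = [
    ("Jim", "CS", "Vancouver"),
    ("Ann", "Physics", "Seattle"),
    ("Lee", "Math", "Toronto"),
    ("Sue", "CS", "Seattle"),
]

# Hypothetical concept hierarchies: attribute index -> {value: higher-level concept}.
hierarchies = {
    1: {"CS": "Science", "Physics": "Science", "Math": "Science"},
    2: {"Vancouver": "Canada", "Toronto": "Canada", "Seattle": "USA"},
}


def attribute_oriented_induction(relation, hierarchies, removed):
    """Generalize each tuple, then merge identical tuples and count them."""
    counts = Counter()
    for row in relation:
        generalized = tuple(
            # Attribute generalization: climb one level if a hierarchy exists.
            hierarchies.get(i, {}).get(value, value)
            for i, value in enumerate(row)
            # Attribute removal: drop attributes listed in `removed`.
            if i not in removed
        )
        # Aggregation: identical generalized tuples share one count.
        counts[generalized] += 1
    return counts


# Remove attribute 0 (name): it has many distinct values and no hierarchy.
result = attribute_oriented_induction(initial_relation, hierarchies, removed={0})
for generalized_tuple, count in result.items():
    print(generalized_tuple, count)
```

Each key of the returned counter is one generalized tuple and each value is the number of original tuples merged into it.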

Mining Data Dispersion Characteristics

Motivation
To better understand the data: central tendency, variation and spread
Data dispersion characteristics
median, max, min, quantiles, outliers, variance, etc.
Numerical dimensions correspond to sorted intervals
Data dispersion: analyzed with multiple granularities of precision
Boxplot or quantile analysis on sorted intervals
Dispersion analysis on computed measures
Folding measures into numerical dimensions
Boxplot or quantile analysis on the transformed cube
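
As a rough illustration of the dispersion measures listed above, the sketch below computes the median, sample variance, and outliers for a small list of numbers using Python's standard statistics module. The sample values and the function name dispersion_summary are hypothetical, and the 1.5 x IQR outlier rule is a common convention assumed here, not something the slides specify.

```python
import statistics


def dispersion_summary(values):
    """Return a few dispersion measures for a list of numbers."""
    data = sorted(values)
    # Quartile conventions differ between tools; "inclusive" is one choice.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": statistics.median(data),
        "variance": statistics.variance(data),  # sample variance
        "iqr": iqr,
        "outliers": [v for v in data if v < low or v > high],
    }


# Hypothetical sample with one unusually large value.
summary = dispersion_summary([2, 4, 4, 5, 6, 7, 8, 9, 40])
print(summary)
```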

Boxplot Analysis

Five-number summary of a distribution:
Minimum, Q1 (first quartile), M (median), Q3 (third quartile), Maximum
Boxplot
Data is represented with a box
The ends of the box are at the first and third quartiles, i.e., the height of the box is the interquartile range (IQR = Q3 - Q1)
The median is marked by a line within the box
Whiskers: two lines outside the box extend to Minimum and Maximum
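
The five-number summary behind a boxplot can be computed as in the minimal sketch below. It assumes the "inclusive" quartile method of Python's statistics.quantiles; other tools may use a different quartile convention and give slightly different Q1 and Q3. The function name five_number_summary and the sample data are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical sample data.
minimum, q1, median, q3, maximum = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
iqr = q3 - q1  # height of the box in the boxplot
print(minimum, q1, median, q3, maximum, iqr)
```

The box spans q1 to q3 with a line at median, and the whiskers run out to minimum and maximum.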
