18-10-2010, 11:50 AM
Mining Complex Types of Data
Multidimensional analysis and descriptive mining of complex data objects
Mining spatial databases
Mining time-series and sequence data
Mining the World-Wide Web (to be covered Dec. 4, if time permits)
Summary
Mining Complex Data Objects: Generalization of Structured Data
Set-valued attribute
Generalization of each value in the set into its corresponding higher-level concepts
Derivation of the general behavior of the set, such as the number of elements in the set, the types or value ranges in the set, or the weighted average for numerical data
E.g., hobby = {tennis, hockey, chess, violin, nintendo_games} generalizes to {sports, music, video_games}
List-valued or a sequence-valued attribute
Same as set-valued attributes except that the order of the elements in the sequence should be observed in the generalization
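The two kinds of generalization above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the HIERARCHY table is hypothetical (chess is mapped to sports purely so that the result matches the hobby example above), and merging adjacent repeats in the sequence case is one possible way to respect element order, not a method given in the slides.

```python
from collections import Counter

# Hypothetical concept hierarchy: low-level value -> higher-level concept.
# "chess" -> "sports" is an assumption made to reproduce the hobby example.
HIERARCHY = {
    "tennis": "sports",
    "hockey": "sports",
    "chess": "sports",
    "violin": "music",
    "nintendo_games": "video_games",
}


def generalize_set(values, hierarchy):
    """Replace each value by its higher-level concept; duplicates collapse."""
    return {hierarchy.get(v, v) for v in values}


def summarize_set(values, hierarchy):
    """Derive general behavior of the set: its size and a count per concept."""
    counts = Counter(hierarchy.get(v, v) for v in values)
    return {"size": len(values), "counts": dict(counts)}


def generalize_sequence(values, hierarchy):
    """Generalize a list-valued attribute while preserving element order.

    Assumption: only adjacent repeats of the same concept are merged.
    """
    result = []
    for v in values:
        concept = hierarchy.get(v, v)
        if not result or result[-1] != concept:
            result.append(concept)
    return result


hobby = {"tennis", "hockey", "chess", "violin", "nintendo_games"}
print(generalize_set(hobby, HIERARCHY))
print(summarize_set(hobby, HIERARCHY))
print(generalize_sequence(["tennis", "violin", "hockey", "chess"], HIERARCHY))
```

Note that the set version loses the fact that a sports hobby appears both before and after music, while the sequence version keeps it.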
Generalizing Spatial and Multimedia Data
Spatial data:
Generalize detailed geographic points into clustered regions, such as business, residential, industrial, or agricultural areas, according to land usage
Requires merging a set of geographic areas by spatial operations
Image data:
Extracted by aggregation and/or approximation
Size, color, shape, texture, orientation, and relative positions and structures of the contained objects or regions in the image
Music data:
Summarize its melody: based on the approximate patterns that repeatedly occur in the segment
Summarize its style: based on its tone, tempo, or the major musical instruments played
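For the spatial case, the merge of geographic areas by land usage can be illustrated with a toy grid. This is a minimal sketch, assuming the map is already rasterized into cells that each carry a land-use label; the grid contents and the function name merge_regions are hypothetical, and a real spatial database would use its own spatial operators rather than this flood fill.

```python
from collections import deque


def merge_regions(grid):
    """Merge 4-adjacent cells with the same land-use label into regions.

    Assumes a non-empty rectangular grid. Returns a list of (label, cells)
    pairs, where cells is a sorted list of (row, col) coordinates.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    regions = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen:
                continue
            label = grid[r][c]
            queue = deque([(r, c)])
            seen.add((r, c))
            cells = []
            # Breadth-first flood fill over neighbors sharing the label.
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and (nr, nc) not in seen and grid[nr][nc] == label:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            regions.append((label, sorted(cells)))
    return regions


# Hypothetical 3x3 land-use map.
grid = [
    ["residential", "residential", "business"],
    ["residential", "industrial", "business"],
    ["agricultural", "agricultural", "business"],
]
regions = merge_regions(grid)
for label, cells in regions:
    print(label, len(cells))
```

Nine detailed cells are generalized into four regions, each described only by its land usage and extent.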
An Example: Plan Mining by Divide and Conquer
Plan: a variable sequence of actions
E.g., Travel (flight): <traveler, departure, arrival, d-time, a-time, airline, price, seat>
Plan mining: extraction of important or significant generalized (sequential) patterns from a planbase (a large collection of plans)
E.g., discover travel patterns in an air flight database, or find significant patterns from the sequences of actions in the repair of automobiles
Method
Attribute-oriented induction on sequence data
A generalized travel plan: <small-big*-small>
Divide & conquer: Mine characteristics for each subsequence
E.g., big*: same airline, small-big: nearby region
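The generalization step of the method can be sketched as follows. It is a rough illustration only: the airport names, the AIRPORT_SIZE table, the planbase, and the 50% support threshold are all hypothetical, and the sketch covers only the attribute-oriented generalization of each plan, not the later mining of characteristics per subsequence. Here big* stands for a run of one or more big airports.

```python
from collections import Counter
from itertools import groupby

# Hypothetical lookup: airport -> size class (the generalized attribute).
AIRPORT_SIZE = {
    "S1": "small", "S2": "small", "S3": "small",
    "B1": "big", "B2": "big", "B3": "big",
}


def generalize_plan(airports, size_of):
    """Generalize a travel plan (sequence of airports) to a size pattern.

    Each airport is replaced by its size class; any run of consecutive
    big airports is collapsed into the single token 'big*'.
    """
    sizes = [size_of[a] for a in airports]
    tokens = []
    for size, run in groupby(sizes):
        if size == "big":
            tokens.append("big*")
        else:
            tokens.extend(size for _ in run)
    return "-".join(tokens)


# Hypothetical planbase: each plan is the sequence of airports visited.
planbase = [
    ["S1", "B1", "B2", "S2"],
    ["S2", "B1", "S3"],
    ["S1", "B2", "B1", "B3", "S3"],
    ["B1", "B2"],
]

support = Counter(generalize_plan(plan, AIRPORT_SIZE) for plan in planbase)

# Keep patterns that occur in at least half of the plans (assumed threshold).
min_support = 0.5
frequent = {p: n for p, n in support.items() if n / len(planbase) >= min_support}
print(frequent)
```

Once a frequent generalized plan such as small-big*-small is found, each of its subsequences (small-big, big*, big-small) can be examined separately, which is the divide-and-conquer part of the method.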
For more information about this article, please follow the link:
http://www.googleurl?sa=t&source=web&cd=...m_mct1.ppt&ei=VOa7TLWaDISycNGB4cIM&usg=AFQjCNGk3GjWb40JdBGWClNnbV41-NgqvA