29-05-2012, 01:23 PM
Decision Trees
Decision Trees.ppt (Size: 555 KB / Downloads: 49)
Introduction
Decision Trees
Powerful/popular for classification & prediction
Represent rules
Rules can be expressed in English
IF Age <= 43 & Sex = Male & Credit Card Insurance = No THEN Life Insurance Promotion = No
Rules can be expressed using SQL for query
Useful to explore data to gain insight into relationships of a large number of candidate input variables to a target (output) variable
You use mental decision trees often!
Game: “I’m thinking of…” “Is it …?”
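As a rough illustration of the two forms mentioned above, the example rule can be written both as a plain function and as a SQL query. This is only a sketch: the `customers` table, its column names, and the sample rows are invented for the example and are not from the slides. Only the rule itself (Age <= 43, Sex = Male, Credit Card Insurance = No) comes from the slides.

```python
import sqlite3


def life_insurance_promotion(age, sex, credit_card_insurance):
    """Return "No" when the slide's rule fires, else None (the rule says nothing)."""
    if age <= 43 and sex == "Male" and credit_card_insurance == "No":
        return "No"
    return None


# The same rule as a SQL query over a hypothetical `customers` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(id INTEGER, age INTEGER, sex TEXT, credit_card_insurance TEXT)"
)
sample_rows = [  # made-up records, for illustration only
    (1, 35, "Male", "No"),
    (2, 50, "Male", "No"),
    (3, 40, "Female", "No"),
    (4, 43, "Male", "No"),
]
conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", sample_rows)

RULE_SQL = """
SELECT id FROM customers
WHERE age <= 43 AND sex = 'Male' AND credit_card_insurance = 'No'
ORDER BY id
"""
# ids of the customers for whom the rule predicts Life Insurance Promotion = No
matched_ids = [row[0] for row in conn.execute(RULE_SQL)]
```

Each path from the root of a tree to a leaf corresponds to one rule of this kind, which is why a tree translates so directly into IF/THEN statements or WHERE clauses.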
Decision Tree – What is it?
A structure that can be used to divide up a large collection of records into successively smaller sets of records by applying a sequence of simple decision rules
A decision tree model consists of a set of rules for dividing a large heterogeneous population into smaller, more homogeneous groups with respect to a particular target variable
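A minimal sketch of this idea, assuming a tiny hand-written tree: each record is passed through a sequence of simple yes/no tests until it lands in a leaf, and the leaves are the smaller sets of records. The tree shape, leaf names, and sample records are hypothetical; only the Age <= 43 threshold is borrowed from the rule shown earlier.

```python
from collections import defaultdict

# A node is either a leaf name (str) or a tuple (test, yes_branch, no_branch).
tree = (
    lambda r: r["age"] <= 43,
    (lambda r: r["sex"] == "Male", "young_male", "young_female"),
    "older",
)


def leaf_for(record, node):
    """Follow the tests from the root until a leaf name is reached."""
    while not isinstance(node, str):
        test, yes_branch, no_branch = node
        node = yes_branch if test(record) else no_branch
    return node


def divide(records, node):
    """Group the records by the leaf they end up in."""
    groups = defaultdict(list)
    for record in records:
        groups[leaf_for(record, node)].append(record)
    return dict(groups)


records = [  # made-up records, for illustration only
    {"age": 30, "sex": "Male"},
    {"age": 41, "sex": "Female"},
    {"age": 60, "sex": "Male"},
    {"age": 25, "sex": "Male"},
]
groups = divide(records, tree)
```

Note that the "older" branch stops after one test while the other branch applies a second one, so the groups sit at different depths.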
Decision Tree Types
Binary trees – only two choices in each split. Can be non-uniform (uneven) in depth
N-way trees (ternary and higher) – three or more choices in at least one split (3-way, 4-way, etc.)
Scoring
Often it is useful to show the proportion of the data in each of the desired classes
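One simple way to produce such a score, sketched here with made-up labels, is to count the target classes among the records that reach a node and divide by the total. The function name `class_proportions` is just a name chosen for this example.

```python
from collections import Counter


def class_proportions(labels):
    """Return the share of each class among the labels at one node."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical target values of the records that ended up in one leaf.
leaf_labels = ["No", "No", "No", "Yes"]
scores = class_proportions(leaf_labels)
```

Here the leaf would be scored as 75% "No" and 25% "Yes", so a new record reaching this leaf could be assigned "No" with a score of 0.75.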
Decision Tree Splits
The best split at root or child nodes is defined as one that does the best job of separating the data into groups where a single class predominates in each group
Example: US population data, where the categorical input variables/attributes include:
Zip code
Gender
Age
Split the records on these attributes according to the “best split” rule above
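The slides do not say how "a single class predominates" is measured. The sketch below assumes Gini impurity, one common choice, and picks the attribute whose groups have the lowest size-weighted impurity. The attribute names, the `buys` target, and the records are invented for the example.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity: 0.0 when one class fills the group, higher when mixed."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def weighted_gini(records, attribute, target):
    """Impurity after splitting on `attribute`, weighted by group size."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record[target])
    total = len(records)
    return sum(len(g) / total * gini(g) for g in groups.values())


def best_split(records, attributes, target):
    """Pick the attribute whose split leaves the purest groups."""
    return min(attributes, key=lambda a: weighted_gini(records, a, target))


records = [  # made-up records, for illustration only
    {"gender": "M", "age_band": "young", "buys": "No"},
    {"gender": "F", "age_band": "young", "buys": "No"},
    {"gender": "M", "age_band": "old", "buys": "Yes"},
    {"gender": "F", "age_band": "old", "buys": "Yes"},
]
chosen = best_split(records, ["gender", "age_band"], "buys")
```

In this toy data, splitting on `age_band` gives two groups that each contain a single class, while splitting on `gender` leaves both groups evenly mixed, so `age_band` is chosen. An attribute with very many distinct values, such as zip code, tends to look pure under a measure like this simply because its groups are tiny, so it usually needs grouping or a penalty before being compared.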
Decision Tree Advantages
Easy to understand
Map nicely to a set of business rules
Applied to real problems
Make no prior assumptions about the data
Able to process both numerical and categorical data