01-06-2012, 02:55 PM
WEKA
What is WEKA?
Waikato Environment for Knowledge Analysis
It’s a data mining/machine learning tool developed by the Department of Computer Science at the University of Waikato, New Zealand.
Weka is also a bird found only on the islands of New Zealand.
Main GUI
Three graphical user interfaces
“The Explorer” (exploratory data analysis)
“The Experimenter” (experimental environment)
“The KnowledgeFlow” (a newer interface inspired by process models)
Explorer: pre-processing the data
Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
Data can also be read from a URL or from an SQL database (using JDBC)
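For readers who have not seen ARFF, the sketch below writes a tiny dataset in that format: a header with "@relation" and one "@attribute" line per column, then an "@data" section with comma-separated rows. The helper name to_arff and the weather data are made up for illustration; this is not part of WEKA, and it skips details such as quoting values that contain spaces.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text (illustrative helper, not a WEKA API).

    attributes: list of (name, spec) pairs, where spec is either the
    string "numeric" or a list of the allowed nominal values.
    """
    lines = ["@relation " + relation, ""]
    for name, spec in attributes:
        if isinstance(spec, str):
            # Numeric (or other built-in) attribute type.
            lines.append("@attribute " + name + " " + spec)
        else:
            # Nominal attribute: the allowed values go inside braces.
            lines.append("@attribute " + name + " {" + ",".join(spec) + "}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical example data.
example = to_arff(
    "weather",
    [("outlook", ["sunny", "rainy"]), ("temperature", "numeric"),
     ("play", ["yes", "no"])],
    [["sunny", 85, "no"], ["rainy", 65, "yes"]],
)
```

Saving the string in example to a file with an .arff extension should give something the Explorer can open, assuming the values need no quoting.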
Pre-processing tools in WEKA are called “filters”
WEKA contains filters for:
Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
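To make two of these filter types concrete, here is a minimal sketch of what normalization and discretization do to a single numeric attribute. It assumes min-max scaling to [0, 1] and equal-width binning; the function names are hypothetical and this is not WEKA's own implementation, which offers more options.

```python
def normalize(values):
    """Min-max scale a list of numbers into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize_equal_width(values, bins):
    """Replace each number with the index of its equal-width bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    # The maximum value would land in bin `bins`, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]
```

For example, normalize([10, 20, 30]) scales the values to 0.0, 0.5 and 1.0, and discretize_equal_width([0, 5, 10], 2) assigns them to bins 0, 1 and 1.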
Explorer: building “classifiers”
Classifiers in WEKA are models for predicting nominal or numeric quantities
Implemented learning schemes include:
Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
Algorithm for Decision Tree Induction
Basic algorithm (a greedy algorithm)
Tree is constructed in a top-down recursive divide-and-conquer manner
At start, all the training examples are at the root
Attributes are categorical (if continuous-valued, they are discretized in advance)
Examples are partitioned recursively based on selected attributes
Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain)
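The steps above can be sketched in a few lines. This is a minimal ID3-style illustration under stated assumptions: all attributes are categorical, rows are plain dicts, information gain is the selection measure, and there is no pruning. The function names and the toy data are invented for this example; it is not the code behind WEKA's tree learners.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by partitioning the examples on attr."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, attrs):
    """Greedy top-down induction: returns a class label or a nested dict."""
    if len(set(labels)) == 1:
        return labels[0]                      # pure node: stop
    if not attrs:
        return Counter(labels).most_common(1)[0][0]  # majority vote
    # Greedy step: pick the attribute with the highest information gain.
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    tree = {best: {}}
    # Divide and conquer: recurse on each partition of the examples.
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        tree[best][value] = build_tree(sub_rows, sub_labels, remaining)
    return tree


# Toy data (hypothetical): the class depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
tree = build_tree(rows, labels, ["outlook", "windy"])
```

On this toy data "outlook" separates the classes perfectly while "windy" gives no gain, so the tree splits once on "outlook" and stops. Continuous attributes would first need discretizing, as noted above.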