24-07-2012, 12:09 PM
Robust Query Processing through Progressive Optimization
progressive-query-opt.ppt (Size: 770.5 KB)
Motivation
Current optimizers depend heavily on cardinality estimates
What if there are errors in those estimates?
Errors can occur due to …
Inaccurate statistics
Invalid assumptions (e.g. attribute independence)
Progressive Query Optimization
Idea: lazily trigger reoptimization during execution if cardinality counts indicate current plan is suboptimal
Introduces a checkpoint (CHECK) operator to compare actual vs. estimated cardinality
Key idea: precompute the cardinality ranges for which the current plan remains optimal
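The CHECK idea can be sketched in a pull-based iterator model. This is a minimal illustrative sketch, not the paper's actual interface: a counter sits in the pipeline and signals for re-optimization when the observed cardinality leaves the precomputed validity range.

```python
class ReoptimizeSignal(Exception):
    """Raised when the observed cardinality leaves the plan's validity range."""
    def __init__(self, actual):
        super().__init__(f"observed cardinality {actual} outside validity range")
        self.actual = actual  # observed count, fed back to the optimizer


class CheckOperator:
    """Counts rows flowing past a checkpoint and signals for re-optimization
    when the actual cardinality leaves [lo, hi], the precomputed range for
    which the current plan stays optimal (names are illustrative)."""
    def __init__(self, child, lo, hi):
        self.child = child         # upstream iterator producing rows
        self.lo, self.hi = lo, hi  # validity range from the optimizer
        self.count = 0

    def __iter__(self):
        for row in self.child:
            self.count += 1
            if self.count > self.hi:   # estimate already exceeded: replan now
                raise ReoptimizeSignal(self.count)
            yield row
        if self.count < self.lo:       # far fewer rows than the plan assumed
            raise ReoptimizeSignal(self.count)
```

On a signal, the executor would hand the observed count back to the optimizer and re-plan, ideally reusing already-computed partial results rather than redoing the work.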
Evaluating a re-optimization scheme
Risk vs. Opportunity
Risk:
Extent to which re-optimization can hurt rather than help
reoptimization chooses another bad plan
work redone
cardinality errors may even cancel, and fixing one may give an even worse plan!
Opportunity:
Refers to how aggressively re-optimization is attempted
more CHECK operators mean more chances to catch cardinality errors
Background
Redbrick
Star schema with fact table and multiple dimension tables
First apply selections on dimension tables
Then decide what plan to use
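The Redbrick approach can be sketched as follows, assuming an in-memory star schema (function and variable names are illustrative): evaluate the dimension-table selections first, then pick the join order from the actual filtered sizes, so no cardinality estimation is needed for them.

```python
def plan_star_join(dims, predicates):
    """Redbrick-style planning sketch for a star schema: apply selections
    on the dimension tables first, then order the joins by the *observed*
    filtered sizes, most selective first. (Illustrative only; a real
    system would use indexes, not Python lists.)"""
    filtered = {
        name: [row for row in rows
               if predicates.get(name, lambda r: True)(row)]
        for name, rows in dims.items()
    }
    # join the smallest (most selective) filtered dimension first
    order = sorted(filtered, key=lambda name: len(filtered[name]))
    return order, filtered
```

The point is that by the time the join order is chosen, the dimension cardinalities are exact rather than estimated.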
Kabra & DeWitt 98 (KD98)
Introduced idea of mid-query reoptimization
Allow partial results to be used like materialized views
But uses an ad-hoc cardinality threshold, and only reoptimizes fully materialized sub-plans
Tukwila data integration system
optimizer may have no statistics at all for remote sources
interleave optimization and query execution
partial query plans
Fragment: fully pipelined tree with doubly pipelined hash join
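The doubly pipelined hash join mentioned above is commonly sketched as a symmetric hash join; the version below is a minimal illustration of that general technique, not Tukwila's actual implementation.

```python
def symmetric_hash_join(left, right, key):
    """Doubly pipelined (symmetric) hash join sketch: both inputs are
    consumed incrementally; each incoming row first probes the other
    side's hash table, then is inserted into its own, so matches stream
    out without waiting for either input to finish."""
    tables = ({}, {})                      # one hash table per input
    streams = (iter(left), iter(right))
    done = [False, False]
    while not all(done):
        for side in (0, 1):                # alternate between the inputs
            if done[side]:
                continue
            row = next(streams[side], None)
            if row is None:
                done[side] = True
                continue
            k = row[key]
            # probe the opposite table for matches seen so far
            for other in tables[1 - side].get(k, []):
                yield (row, other) if side == 0 else (other, row)
            # insert into this side's table for future probes
            tables[side].setdefault(k, []).append(row)
```

Because neither input blocks the other, this operator suits data integration, where a source may stall mid-stream.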
Query Scrambling
reorders the query plan on the fly to cope with delayed sources