03-07-2012, 12:43 PM
File Organization and Indexing
FileOrganization.ppt (Size: 243 KB / Downloads: 25)
In this lecture you will learn
How DBMS physically organizes data
Different file organizations or access methods
What is Indexing?
Different indexing methods
How to create indexes using SQL
Introduction
DBMS has to store data somewhere
Choices:
Main memory
Expensive – compared to secondary and tertiary storage
Fast – in memory operations are fast
Volatile – not possible to save data from one run to its next
Used for storing current data
Secondary storage (hard disk)
Less expensive – compared to main memory
Slower – compared to main memory, faster compared to tapes
Persistent – data from one run can be saved to the disk to be used in the next run
Used for storing the database
Tertiary storage (tapes)
Cheapest
Slowest – sequential data access
Used for data archives
Basics of Data storage on hard disk
A disk is organized into a number of blocks or pages
A page is the unit of exchange between the disk and the main memory
A collection of pages is known as a file
DBMS stores data in one or more files on the hard disk
Database Tables on Hard Disk
Database tables are made up of one or more tuples (rows)
Each tuple has one or more attributes
One or more tuples from a table are written into a page on the hard disk
Larger tuples may need more than one page!
Tuples on the disk are known as records
Records are separated by record delimiter
Attributes on the hard disk are known as fields
Fields are separated by field delimiter
Secondary Indexes
An index file that uses a non primary field as an index e.g. City field in the branch table
They improve the performance of queries that use attributes other than the primary key
You can use a separate index for every attribute you wish to use in the WHERE clause of your select query
But there is the overhead of maintaining a large number of these indexes