RAID TECHNOLOGY
Introduction:
RAID, an acronym for Redundant Array of Independent Disks (changed from its original expansion, Redundant Array of Inexpensive Disks), is a technology that improves storage reliability, and often performance, through redundancy. This is achieved by combining multiple disk drive components into a logical unit, where data is distributed across the drives in one of several ways called "RAID levels". The concept was first defined by David A. Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987 as Redundant Arrays of Inexpensive Disks. Marketers representing RAID manufacturers later recast the term as a redundant array of "independent" disks to dissociate the low-cost expectation from RAID technology.
RAID is now used as an umbrella term for computer data storage schemes that divide and replicate data among multiple disk drives. The schemes or architectures are named by the word RAID followed by a number (RAID 0, RAID 1, and so on). The various RAID designs pursue two key goals: increased data reliability and increased input/output performance. When multiple physical disks are set up to use RAID technology, they are said to be in a RAID array. The array distributes data across multiple disks, but it is addressed by the operating system as one single disk. RAID can be set up to serve several different purposes.
Disk Mirroring:
Disks are an inherently unreliable component of computer systems. Mirroring is a technique that lets a system automatically maintain multiple copies of data so that, in the event of a disk hardware failure, the system can continue to operate or quickly recover its data. Mirroring may be done locally, specifically to compensate for disk unreliability; remotely, as part of a more sophisticated disaster recovery scheme; or both locally and remotely, especially for high-availability systems. Normally data is mirrored onto physically identical drives, though the process can be applied to logical drives where the underlying physical format is hidden from the mirroring process.
Typically mirroring is provided either in hardware, such as disk arrays, or in software within the operating system. There are several scenarios for what happens when a disk fails. In a hot-swap system, the system typically diagnoses the disk failure and signals it. Sophisticated systems may automatically activate a hot standby disk and use the remaining active disk to copy live data onto it. Alternatively, a new disk is installed and the data is copied to it. In less sophisticated systems, operation continues on the remaining disk until a spare disk can be installed with minimum disruption.
Copying data from one disk of a mirrored pair to the other is sometimes called resilvering, though more commonly it is simply known as rebuilding. During the rebuild, system performance is usually degraded because the disk system is fully occupied copying data from one disk to the other.
Mirroring is often mistaken for a substitute for regular backups, on the incorrect assumption that disk failure is the only cause of data loss. In fact, the most trivial user action can delete data that then needs to be recovered, and in commercial operations it is far more likely that backups are used to recover from processing errors, user mistakes, or vandalism, none of which mirroring protects against.
Mirroring can be performed site to site, either over rapid data links such as fibre-optic links, which over distances of 500 m or so can maintain adequate performance for real-time mirroring, or over longer distances and slower links using an asynchronous copying system. For remote disaster recovery, this mirroring may not be done by integrated systems but simply by additional applications on master and slave machines. It differs from a snapshot in that no links remain between the original (or source) and the copy.
In addition to providing a second copy of the data for redundancy in case of hardware failure, disk mirroring allows each disk to be accessed separately for reads. Under certain circumstances this can significantly improve performance, since for each read the system can choose the disk that can seek to the required data most quickly. This is especially significant when several tasks compete for data on the same disk, as it reduces thrashing (where switching between tasks takes up more time than the tasks themselves), and it is an important consideration in hardware configurations that access the disk frequently. In some implementations, the mirrored disk can be split off and used for data backup while the first disk remains active. However, merging the two disks may then require a synchronization period if any write I/O activity has occurred on the mirrored disk.
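To make the read-balancing idea concrete, here is a minimal Python sketch of a mirrored pair that writes to both disks and serves each read from the less busy one. The class and method names (MirroredPair, write, read) are inventions for this illustration, not any real controller's API, and in-memory bytearrays stand in for physical drives.

class MirroredPair:
    def __init__(self, size):
        # Two identical "disks" kept in lockstep by write().
        self.disks = [bytearray(size), bytearray(size)]
        self.pending = [0, 0]  # outstanding reads per disk

    def write(self, offset, data):
        # Every write goes to both disks so the copies stay identical.
        for disk in self.disks:
            disk[offset:offset + len(data)] = data

    def read(self, offset, length):
        # Either disk can serve a read; pick the less busy one.
        # A real controller might also weigh head position and seek time.
        i = 0 if self.pending[0] <= self.pending[1] else 1
        self.pending[i] += 1
        try:
            return bytes(self.disks[i][offset:offset + length])
        finally:
            self.pending[i] -= 1

pair = MirroredPair(1024)
pair.write(0, b"hello")
print(pair.read(0, 5))  # b'hello', from whichever disk was idle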
Disk Striping:
This is a method of combining multiple drives into one logical storage unit. Striping partitions the storage space of each drive into stripes, which can be as small as one sector (512 bytes) or as large as several megabytes.
These stripes are then interleaved in a rotating sequence, so that the combined space is composed alternately of stripes from each drive. To maximize throughput for the disk subsystem, the I/O load must be balanced across all the drives so that each drive is kept as busy as possible. This allows all drives to work concurrently on different I/O operations, maximizing the number of simultaneous I/O operations the array can perform.
Data Striping
In computer data storage, data striping is the technique of segmenting logically sequential data, such as a file, so that consecutive segments are placed on different physical storage devices. Striping is useful when a processing device requests data more quickly than a single storage device can supply it. Because the segments sit on multiple devices, several segments can be accessed concurrently, providing greater data-access throughput and keeping the processor from idly waiting for data.
One method of striping interleaves sequential segments across the storage devices in round-robin fashion from the beginning of the data sequence. This works well for streaming data, but subsequent random accesses require knowing which device contains the data. If the data is stored such that each segment's physical address has a one-to-one mapping to a particular device, the device holding any requested segment can be calculated from the address alone, without knowing the segment's offset within the full sequence, as the sketch below illustrates.
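As a rough illustration of that address arithmetic, the Python sketch below maps a logical byte offset to a (disk, offset-on-disk) pair under round-robin striping. The stripe size, disk count, and function name locate are assumptions chosen for the example.

STRIPE_SIZE = 512  # bytes per stripe; here a single sector
NUM_DISKS = 4

def locate(logical_offset):
    # Which stripe across the whole array holds this byte,
    # and where inside that stripe the byte falls.
    stripe = logical_offset // STRIPE_SIZE
    within = logical_offset % STRIPE_SIZE
    # Round-robin: consecutive stripes land on consecutive disks.
    disk = stripe % NUM_DISKS
    disk_offset = (stripe // NUM_DISKS) * STRIPE_SIZE + within
    return disk, disk_offset

print(locate(2600))  # (1, 552): stripe 5 lands on disk 1, byte 552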
The advantages of striping are performance and throughput. Interleaving data accesses in time allows the lower throughput of each individual storage device to be multiplied, in effect, by the number of devices employed. The increased throughput lets the data-processing device continue its work without interruption and finish its procedures more quickly, which shows up as improved processing performance.
Because different segments of the data are kept on different storage devices, the failure of any one device corrupts the full data sequence. In effect, the failure rate of the array is roughly the sum of the failure rates of its individual devices: four drives that each fail 2% of the time per year, for example, give the stripe set a combined failure rate of nearly 8% per year. This disadvantage of striping can be overcome by storing redundant information, such as parity, for error correction; the cost is the extra storage required.
Parity:
Parity data is used by some RAID levels to achieve redundancy. If a drive in the array fails, remaining data on the other drives can be combined with the parity data to reconstruct the missing data.
Parity Bit
A parity bit is a bit that is added to ensure that the number of bits with the value one in a set of bits is even or odd. Parity bits are used as the simplest form of error detecting code.
There are two variants of parity bits: the even parity bit and the odd parity bit. With even parity, the parity bit is set to 1 if the number of ones in the given set of bits (not including the parity bit) is odd, making the total number of ones (including the parity bit) even. With odd parity, the parity bit is set to 1 if the number of ones in the set (not including the parity bit) is even, keeping the total number of ones (including the parity bit) odd. In other words, the parity bit is chosen so that the count of 1s across the whole set, parity bit included, has the desired parity.
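A short Python sketch of both variants, with illustrative function names, may make the rule easier to see.

def even_parity_bit(value):
    # 1 when the count of 1 bits is odd, so the total
    # including the parity bit comes out even.
    return bin(value).count("1") % 2

def odd_parity_bit(value):
    # The complement: keeps the total count of 1 bits odd.
    return 1 - even_parity_bit(value)

print(even_parity_bit(0b01101101))  # 1 (five 1 bits, an odd count)
print(odd_parity_bit(0b01101101))   # 0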
Even parity is a special case of a cyclic redundancy check (CRC), where the 1-bit CRC is generated by the polynomial x+1. If the parity bit is present but not used, it may be referred to as mark parity (when the parity bit is always 1) or space parity (the bit is always 0).
In RAID, this parity data is computed, and missing data later reconstructed, using the Boolean XOR function.
For example, suppose two drives in a three-drive RAID 5 array contained the following data:
Drive 1: 01101101
Drive 2: 11010100
To calculate parity data for the two drives, an XOR is performed on their data:
01101101
XOR 11010100
-------------------
10111001
The resulting parity data, 10111001, is then stored on Drive 3.
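The same calculation can be written in a couple of lines of Python, modeling each drive's contents as an 8-bit integer (a sketch for illustration only):

drive1 = 0b01101101
drive2 = 0b11010100

parity = drive1 ^ drive2       # bitwise XOR, as worked out above
print(format(parity, "08b"))   # 10111001 -- stored on Drive 3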
Should any of the three drives fail, the contents of the failed drive can be reconstructed on a replacement drive by applying the same XOR operation to the data on the remaining drives. If Drive 2 were to fail, its data could be rebuilt by XORing the contents of the two remaining drives, Drive 1 and Drive 3:
10111001
XOR 01101101
-------------------
11010100
The result of that XOR calculation is Drive 2's contents. 11010100 is then stored on Drive 2, fully repairing the array. The same XOR concept applies to larger arrays with any number of disks. In a RAID 3 array of 12 drives, the 11 data drives participate in the XOR calculation shown above and yield a value that is then stored on the dedicated parity drive; a generalized sketch follows.
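Here is a minimal Python sketch of that generalization, XORing the contents of any number of surviving drives to rebuild the missing one; the helper name parity_of is invented for this example.

from functools import reduce

def parity_of(blocks):
    # XOR all blocks together; given the surviving data drives
    # plus the parity, the result is the failed drive's contents.
    return reduce(lambda a, b: a ^ b, blocks)

surviving = [0b01101101, 0b10111001]  # Drive 1 and the parity data
rebuilt = parity_of(surviving)
print(format(rebuilt, "08b"))         # 11010100 -- Drive 2 restored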