Redundant Arrays of Inexpensive Disks (RAID)

Review by James Ezick, February, 1998; updated by Vincent Ng, March 9, 1999

State of the World (c. 1987)

The rate of increase in the number of CPU instructions per second is far outpacing the increases in I/O transfer rates. This difference is due in some part to the mechanical nature of the physical disks. Seek time and latency are still by far the greatest inhibitors to transfer rate.
Amdahl's Law suggests that in the coming years the effective improvement from faster CPUs will be limited by an I/O bottleneck. In essence, much of the improved CPU performance will be wasted.
There exist on the market today a number of brands of physical disks covering a wide range of performance characteristics. These range from Single Large Expensive Disks (SLEDs), which serve commercial purposes, to smaller, cheaper disks that serve the growing home PC market.
For the cheaper disks, I/Os per second per actuator are within a factor of two of those of the larger disks. Further, the smaller disks, in many cases, are superior in cost per megabyte and power consumption per megabyte.
Standards such as SCSI have allowed companies to embed mainframe controller functions at low cost into the less expensive disks.
These factors suggest that inexpensive disks could be used in arrays to emulate the function of larger disks. The inhibiting factor is that inexpensive disks provide inferior reliability, and any system relying upon banks of them offers a dismal collective level of fault tolerance.
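
The reliability problem in the last point can be made concrete with a minimal sketch. It assumes disks fail independently at a constant rate, so the failure rate of an array with no redundancy is the sum of the per-disk rates. The function name and the 30,000-hour rating are illustrative assumptions, not figures taken from this review.

```python
def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of an array with no redundancy, where any one disk failure loses data.

    Assumes independent failures with a constant failure rate per disk,
    so the array's failure rate is num_disks times the per-disk rate.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


if __name__ == "__main__":
    # Illustrative numbers only: 100 disks, each rated at 30,000 hours.
    print(array_mttf_hours(30_000, 100))  # 300.0 hours, i.e. under two weeks
```

Under these assumptions, adding disks for bandwidth shrinks the time to first failure in direct proportion, which is why some form of redundancy is needed.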

The Idea

The increased failure rate of inexpensive disks could be tolerated if a system of disks could be constructed in such a way as to tolerate single disk failures. This is accomplished by supplementing a bank of disks with additional (check) disks that store redundant or checksum data. These would allow the recovery of data from a defective disk.
This redundancy, however, detracts from I/O performance as well as the total megabytes per dollar ratio.
Five distinct levels of redundancy are defined by varying the level of fault tolerance.
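
As a rough illustration of how a check disk allows recovery, the sketch below uses simple XOR parity, which is only one possible check code (level 2, for instance, uses a stronger code). It assumes the system already knows which disk failed. The function names and sample contents are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover(surviving: list[bytes], check: bytes) -> bytes:
    """Rebuild the single missing data block from the survivors and the check block."""
    return xor_blocks(surviving + [check])


if __name__ == "__main__":
    data = [b"disk", b"bank", b"raid"]  # contents of three data disks
    check = xor_blocks(data)            # contents of the check disk
    rebuilt = recover([data[0], data[2]], check)  # pretend disk 1 failed
    print(rebuilt == data[1])  # True
```

XOR works here because every byte of the check block is the parity of the corresponding data bytes, so XOR-ing the survivors with the check block cancels everything except the missing block.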

RAID Levels

Level 1

Mirrored Disks: All data is backed up completely.

50% of the storage capacity goes to redundancy, and the cost of the system is doubled. Every write occurs twice; a read needs only one of the two copies.

Level 2

Detect and correct single bit errors.

Bit-interleaving the data across all disks of a group.

Data must be read / written in entire blocks to allow the checksum calculation.

Need enough check disks to identify the disk with the error: Ten data disks require four check disks. Twenty-five data disks require five check disks.
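
The check disk count can be sketched as follows, assuming the group is protected by a single-error-correcting Hamming code: C check disks must distinguish "no error" from an error in any one of the G + C disks, so the smallest C with 2^C >= G + C + 1 is needed. The function name is hypothetical.

```python
def check_disks_needed(data_disks: int) -> int:
    """Smallest number of check disks C such that 2**C >= data_disks + C + 1.

    Assumes a single-error-correcting Hamming code over the whole group.
    """
    if data_disks < 1:
        raise ValueError("data_disks must be at least 1")
    check = 1
    while 2 ** check < data_disks + check + 1:
        check += 1
    return check


if __name__ == "__main__":
    g = 10
    c = check_disks_needed(g)
    print(c)                       # 4
    print(f"{c / g:.0%} overhead")  # 40% overhead
```

Because the count grows only logarithmically with group size, larger groups spend a smaller fraction of their disks on checking.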

Good for supercomputer applications, bad for transaction processing.

Level 3

Utilize the disk hardware to detect disk failure and rely on a single check disk to recover from the error; effective performance per disk increases.

Lowest possible reliability overhead.

Level 4

Interleave data between disks on a sector level.

Improve performance of small transfers through parallelism for reads.

A small write uses 2 disks to perform 2 reads and 2 writes; parity calculation is simpler than that in level 3.

Number of writes that can be performed simultaneously is limited to the number of groups: every write to a group must read and write the check disk.
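
The two reads and two writes of a small write can be sketched as a read-modify-write of the parity, assuming XOR parity over the sectors of a stripe. The new parity depends only on the old data, the new data, and the old parity, so the other data disks in the group are never touched. Function names and sector contents are hypothetical.

```python
def parity(sectors: list[bytes]) -> bytes:
    """Full parity of a stripe: byte-wise XOR of all data sectors."""
    out = bytes(len(sectors[0]))  # all-zero sector of the right length
    for sector in sectors:
        out = bytes(x ^ y for x, y in zip(out, sector))
    return out


def update_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity after one sector changes, computed from two reads
    (old data, old parity); the caller then performs two writes
    (new data, new parity)."""
    if not (len(old_data) == len(new_data) == len(old_parity)):
        raise ValueError("sectors must have the same length")
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
    check = parity(stripe)
    new_sector = b"\xff\x00"
    new_check = update_parity(stripe[1], new_sector, check)
    # Matches recomputing parity from every sector in the stripe.
    print(new_check == parity([stripe[0], new_sector, stripe[2]]))  # True
```

Since every small write in a group must go through the same check disk, that disk serializes writes, which is the limit described above.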

Level 5

Support multiple individual writes per group.

Small read-modify-writes perform close to the speed of the disk.

Large transfer performance per disk and high useful storage capacity percentage similar to levels three and four.
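
One way to see why spreading check information across the disks permits several writes per group is the sketch below. The round-robin placement rule (stripe s keeps its check sector on disk s mod n) is an assumption chosen for illustration and not necessarily the layout used in any particular system. All names are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Hypothetical round-robin placement of the check sector for a stripe."""
    return stripe % num_disks


def disks_touched(stripe: int, data_disk: int, num_disks: int) -> set[int]:
    """Disks involved in a small write: the data disk plus the stripe's check disk."""
    check = parity_disk(stripe, num_disks)
    if data_disk == check:
        raise ValueError("that disk holds the check sector for this stripe")
    return {data_disk, check}


def writes_can_overlap(stripe_a: int, disk_a: int,
                       stripe_b: int, disk_b: int, num_disks: int) -> bool:
    """Two small writes can proceed in parallel if they share no disk."""
    return disks_touched(stripe_a, disk_a, num_disks).isdisjoint(
        disks_touched(stripe_b, disk_b, num_disks))


if __name__ == "__main__":
    # Five disks: stripe 0 checks on disk 0, stripe 2 checks on disk 2.
    print(writes_can_overlap(0, 1, 2, 3, 5))  # True
```

With a single dedicated check disk, as in level 4, the same two writes would both need that disk and could not overlap.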

Conclusions

Level 5 RAID offers a factor of roughly ten improvement in performance, reliability and power consumption and a factor of three reduction in size over a SLED.
RAID offers the advantage of modular growth.
Lower power consumption makes battery backups feasible.

Questions

To what extent are RAID systems scalable?
Is the mean time to failure calculation accurate? Would we expect the failures to more closely approximate a normal distribution? Does this change the expected gap between failures?
Would we expect failures to still occur independently for the interconnected disks?
How do you like the "level" nomenclature?