Raiders of The Lost Disk

Engine Crew Monograph No. 12
Last updated July 20, 1997

Gus Bjorklund, Progress Software Corporation

Introduction

The RAID concept (Redundant Arrays of Inexpensive Disks) was first proposed in 1987 by researchers Patterson, Gibson, and Katz at the University of California at Berkeley. They proposed that the cost, performance, size limitations, and reliability of storage subsystems could be improved by combining several small disks into a disk "array" that is perceived as a single disk by applications (and the operating system). They described five ways of combining disks, numbered 1 through 5. Later, RAID 0 was added to the list. Each configuration has different costs, benefits, and drawbacks.

All RAID implementations are completely transparent to applications that use them. No special coding is required. An application that is accessing data on a RAID system does not know anything about its existence. Progress does not know when you are using RAID. The operating system may not know either if you have an independent RAID storage subsystem.

RAID 0

RAID level 0 is also called "striping". In this configuration, a group of disks acts as a single disk (a "stripe set") with data striped across all the disks. The striping strategy is that the total storage space is divided into equal-size sections, or "stripe blocks", that are allocated round-robin among all the disks. The size of the stripe block is configurable and is normally a megabyte or longer.
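The round-robin allocation described above can be sketched as a small address calculation. This is only an illustration, not any vendor's actual algorithm: the function name locate_block and its parameters are hypothetical, and it assumes the stripe blocks are dealt out to the disks strictly in order.

```python
def locate_block(logical_offset, stripe_block_size, num_disks):
    """Map a logical byte offset in the stripe set to (disk index, offset on that disk).

    Hypothetical helper: assumes stripe blocks are handed out to the
    disks in strict round-robin order, starting with disk 0.
    """
    stripe_block = logical_offset // stripe_block_size  # which stripe block, counting from 0
    within = logical_offset % stripe_block_size         # position inside that stripe block
    disk = stripe_block % num_disks                     # round-robin choice of disk
    row = stripe_block // num_disks                     # stripe blocks already placed on that disk
    return disk, row * stripe_block_size + within


MEGABYTE = 1024 * 1024
# With four disks and one-megabyte stripe blocks, the sixth megabyte
# of the stripe set lands on the second disk, one megabyte in.
print(locate_block(5 * MEGABYTE, MEGABYTE, 4))
```

Because consecutive stripe blocks sit on different disks, a large sequential transfer or many concurrent small requests can keep several disks busy at once.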

Benefits

RAID 1

RAID level 1 is also called "mirroring". In this configuration, a duplicate copy of each disk is stored on a second "mirror" drive. All data are written to both sides of the mirror. This provides redundancy and fault tolerance because each disk is duplicated. If one disk fails, its mirror can still be accessed and no data are lost because of the failure. One can configure RAID 1 to keep more than one duplicate copy of each disk (e.g. "triple mirroring").
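A toy model may make the mirroring behavior concrete. The class MirrorSet below is hypothetical and keeps its "disks" as in-memory lists. It assumes every write goes to all surviving copies and a read is served by the first surviving copy, which is one simple policy among several that real implementations use.

```python
class MirrorSet:
    """Toy RAID 1 model: each 'disk' is a list of blocks held in memory."""

    def __init__(self, copies=2, num_blocks=8):
        self.disks = [[None] * num_blocks for _ in range(copies)]
        self.failed = set()

    def write(self, block, data):
        # Every write goes to all surviving sides of the mirror.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block] = data

    def fail(self, i):
        # Simulate a drive failure: its contents are no longer available.
        self.failed.add(i)
        self.disks[i] = None

    def read(self, block):
        # Any surviving copy can satisfy the read.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[block]
        raise OSError("all mirrors have failed")


mirror = MirrorSet(copies=2)
mirror.write(3, b"payroll")
mirror.fail(0)
print(mirror.read(3))  # still readable from the surviving copy
```

Passing copies=3 models the "triple mirroring" case, where the data survive the loss of any two drives.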

Benefits

RAID 2

RAID level 2 combines striping and redundancy. The striping strategy is to spread the data across multiple disks at the bit level. Redundancy is provided by writing an error correcting code (a Hamming code, for you buzzword fans) across several additional disks. The error correcting code has the property that data lost from failure of one data disk can be recovered by combining what is read from the surviving data disks with the error correction disks.
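As a rough sketch of how such a code recovers a lost disk, the example below uses the classic Hamming(7,4) code. The choice of four data disks plus three check disks is an assumption made for illustration, since the text above only says "a Hamming code". Each of the seven bit positions stands for one disk, and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as 7 bits; positions 1, 2 and 4 hold the check bits."""
    c = [0] * 8                      # index 0 unused so indices match positions 1..7
    c[3], c[5], c[6], c[7] = data_bits
    c[1] = c[3] ^ c[5] ^ c[7]        # covers positions whose number has bit 1 set
    c[2] = c[3] ^ c[6] ^ c[7]        # covers positions whose number has bit 2 set
    c[4] = c[5] ^ c[6] ^ c[7]        # covers positions whose number has bit 4 set
    return c[1:]


def syndrome(code):
    """XOR of the positions holding a 1; zero for every valid code word."""
    s = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            s ^= pos
    return s


def recover_erased(code, lost_pos):
    """Rebuild the bit of the failed 'disk' at lost_pos from the six survivors."""
    trial = list(code)
    trial[lost_pos - 1] = 0          # guess that the lost bit was 0
    if syndrome(trial) != 0:         # a nonzero syndrome means the guess was wrong
        trial[lost_pos - 1] = 1
    return trial


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] = 0                       # the disk at position 5 has failed
print(recover_erased(damaged, 5) == word)
```

Several check disks are needed because a Hamming code can also locate an error whose position is unknown. When the failed disk identifies itself, a single parity disk is enough.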

Benefits

RAID 3

RAID level 3 uses the same fine-grained striping strategy as level 2, but uses only one additional disk for error correction instead of several. The error correction code is computed by taking the exclusive OR of all the data disks.
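The exclusive OR calculation, and the way it lets a failed disk be rebuilt, can be shown in a few lines. This is a minimal sketch with three made-up two-byte "disks"; the helper name xor_blocks is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise exclusive OR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three tiny "data disks" and the parity disk computed from them.
data_disks = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data_disks)

# Suppose data disk 1 fails. XORing the survivors with the parity
# cancels their contributions and leaves the missing contents.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_disks[1])
```

The same calculation works no matter which single disk is lost, including the parity disk itself, because XORing any value with itself gives zero.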

Benefits

RAID 4

RAID level 4 is similar to level 3 except that the striping strategy is like RAID 0 (relatively large stripe blocks) rather than at the bit level. This technique allows reconstruction of all data that was present on any single drive that has failed.

Benefits

RAID 5

In RAID level 5 configurations, data are striped across several disks along with "parity" data. The striping strategy is the same as for RAID 0 (relatively large stripe blocks). The parity data is distributed across the drives in such a way that a data group and its parity information are always written to different devices. This technique allows reconstruction of all data that was present on any single drive that has failed.
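One way to picture the distributed parity is to print which disk holds what in each row of stripe blocks. The rotation used below, with parity starting on the last disk and moving one disk to the left per row, is an assumption chosen for illustration. Real implementations use several different rotation schemes, and the function name raid5_row_layout is hypothetical.

```python
def raid5_row_layout(row, num_disks):
    """Return the contents of each disk for one row of stripe blocks.

    "P" marks the parity block; "Dn" marks data stripe block n.
    Assumed rotation: parity starts on the last disk and moves one
    disk to the left on each successive row.
    """
    parity_disk = (num_disks - 1 - row) % num_disks
    layout = []
    data_index = row * (num_disks - 1)   # each row holds num_disks - 1 data blocks
    for disk in range(num_disks):
        if disk == parity_disk:
            layout.append("P")
        else:
            layout.append(f"D{data_index}")
            data_index += 1
    return layout


for row in range(4):
    print(raid5_row_layout(row, 4))
```

Because the parity block lands on a different disk in each row, parity updates are spread over all the drives instead of queuing up on a single parity disk.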

Benefits

Implementation types

There are three common RAID implementations: in the operating system, in the disk controller, and in a separate subsystem.

A RAID implementation that is part of the operating system is cheap - you don't have to pay for it (or not much). It isn't worth much either and should be avoided.

A RAID implementation in a disk controller uses less processor power and fewer system resources than an operating-system-based solution. Usually, special device drivers are required for these disk controllers. These types of systems are inexpensive but provide very limited fault tolerance. Failed disks cannot be replaced while the system is operational.

A RAID implementation that is a separate subsystem from the computer is best. It can have its own power supplies, battery backup, spare controllers, hot-standby disks, and other capabilities all independent of the computer. Failure of a component in the computer should not affect the disk subsystem and vice versa. An excellent example of this type of system is Data General's Clariion disk array. Many other vendors also provide these types of systems.


Copyright 1997, Progress Software Corp., All Rights Reserved