Introduction
Customers often ask if the Progress database
supports some specific exotic storage device, especially
devices that attempt to improve the effective I/O rate of a
disk device by using write caching techniques. Among these
are various types of RAID subsystems and solid state disks.
This article provides a basic description of how such
devices use caches and makes recommendations regarding their
use with the Progress database.
Why You Need To Know This
Many modern disk storage subsystems have the
capability to store data in a volatile high-speed memory -
the cache - and then later transfer the data onto disk. The
data may be transferred to disk long after a program has
performed a write operation. Cache memories vary in size and
can be quite large - many megabytes. When power is lost or
turned off the contents of the cache will be lost unless the
disk subsystem has been specifically designed to handle such
situations. If it has not,
unrecoverable file corruption and data loss can
occur.
How the Progress Database Writes to Disk
A key requirement of the Progress database is
the ability to perform reliable disk I/O operations. This is
especially important when writing to the transaction logs
(BI or AI logs) used for crash recovery and transaction
durability. When the Progress database writes log data to
disk, it assumes that when the write operation is complete,
the data have been recorded in nonvolatile storage, that is,
that they have actually been written someplace where they
won't be lost. In addition, the database manager
performs certain write operations in a specific order to
ensure that the database on disk is in a recoverable state.
For more detailed information on this, please see the
monograph "Transaction Logging Concepts".
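The ordering requirement described above can be illustrated with
a minimal sketch. This is not the Progress implementation; the
function names, the record layout, and the file names are
hypothetical and chosen only for illustration. It assumes a
simple rule: the log record describing a change is forced to
disk before the change is applied to the data file.

```python
import os


def durable_write(fd: int, data: bytes) -> None:
    """Write all of data to fd, then ask the OS to flush it to the device."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]
    # fsync returns only after the OS has handed the data to the device.
    # Whether the device itself has made the data nonvolatile is up to
    # the hardware.
    os.fsync(fd)


def logged_update(log_path: str, db_path: str, offset: int, new_bytes: bytes) -> None:
    """Record the change in the log first, then apply it to the data file."""
    # Hypothetical record layout: 8-byte offset, 4-byte length, payload.
    record = offset.to_bytes(8, "big") + len(new_bytes).to_bytes(4, "big") + new_bytes

    log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        durable_write(log_fd, record)  # step 1: the log must be on disk first
    finally:
        os.close(log_fd)

    db_fd = os.open(db_path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.lseek(db_fd, offset, os.SEEK_SET)
        durable_write(db_fd, new_bytes)  # step 2: only now change the data
    finally:
        os.close(db_fd)
```

The sketch depends on the same assumption the database makes:
when the flush call returns, the data are safe. If the storage
device acknowledges the write from a volatile cache, step 2 can
reach disk while step 1 is lost, and the ordering guarantee no
longer holds.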
Why Caches Are Used In Disk Storage Subsystems
There are a variety of exotic storage devices
available in the marketplace that provide various benefits
in performance and reliability over ordinary single-spindle
disk drives. The most common are disk arrays, which can be
configured in a number of different ways. Disk arrays are
also called RAID devices (Redundant Arrays of Inexpensive
Disks).
One of the design goals of disk arrays is to eliminate
the single point of failure presented by a simple disk
drive. This can be accomplished by mirroring information
onto a second drive, or by using more complicated
configurations using multiple disks. For a given amount of
data, simple mirroring requires double the storage space and
roughly doubles the cost. Other solutions can require as
little as 10% additional storage space.
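The storage figures above follow from simple arithmetic. The
sketch below assumes a layout in which one redundant disk
protects a group of data disks; the article does not say which
layout produces the 10% figure, so the disk counts used here are
illustrative only.

```python
def redundancy_overhead(data_disks: int, redundant_disks: int) -> float:
    """Extra storage as a fraction of the space that holds user data."""
    if data_disks <= 0 or redundant_disks < 0:
        raise ValueError("need at least one data disk and no negative counts")
    return redundant_disks / data_disks


# Simple mirroring: every data disk has a full copy.
mirroring = redundancy_overhead(data_disks=1, redundant_disks=1)

# Assumed example: one redundant disk shared by ten data disks.
shared = redundancy_overhead(data_disks=10, redundant_disks=1)

print(f"mirroring: {mirroring:.0%} extra, shared redundancy: {shared:.0%} extra")
```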
While the more complex solutions achieve high reliability
and use less total storage space than full mirroring, they
do not have the same performance characteristics. For
example, there is a cost to calculate and write the
additional information used for error recovery. This
so-called "write penalty" problem is often solved by adding a
write cache to increase overall performance. Some disk
controllers and single-spindle disk drives also have write
caches.
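To make the write penalty concrete, here is a small sketch that
assumes a RAID-5-style scheme in which the recovery information
is the XOR of the data blocks in a stripe. The article does not
name a specific scheme, so this is one common case, not a
description of any particular product. The function names are
hypothetical.

```python
from functools import reduce
from typing import Sequence, Tuple


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_parity(blocks: Sequence[bytes]) -> bytes:
    """Parity block for a whole stripe: XOR of all data blocks."""
    return reduce(xor_blocks, blocks)


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> Tuple[bytes, int]:
    """Update one block in a stripe; return the new parity and the I/O count.

    The array must read the old data, read the old parity, write the
    new data, and write the new parity: four disk operations where a
    plain disk needs one.
    """
    new_parity = xor_blocks(xor_blocks(old_parity, old_data), new_data)
    return new_parity, 4


stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = stripe_parity(stripe)

# Rebuild a lost block from the parity and the surviving blocks.
rebuilt = xor_blocks(xor_blocks(parity, stripe[0]), stripe[2])
```

Under this assumption, a write cache helps because the array can
acknowledge the write at once and perform the extra reads and
writes later.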
Write caches are usually implemented with high-speed
semiconductor memory whose contents will be lost when power
to the memory is turned off. However, by using redundant
power supplies and battery backup devices, manufacturers can
create disk storage systems that are reliable. They will not
lose data when power is lost or when other types of failures
occur.
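The difference between a protected and an unprotected write cache
can be shown with a toy model. Everything here is hypothetical:
the class, its methods, and the single battery_backed flag stand
in for hardware behavior that is far more involved in real
devices.

```python
from typing import Dict, Optional


class CachedDisk:
    """Toy model of a disk with a write cache in front of it."""

    def __init__(self, battery_backed: bool) -> None:
        self.battery_backed = battery_backed
        self.platter: Dict[int, bytes] = {}  # nonvolatile storage
        self.cache: Dict[int, bytes] = {}    # semiconductor memory

    def write(self, block: int, data: bytes) -> None:
        """Acknowledge at once; the data are only in the cache so far."""
        self.cache[block] = data

    def flush(self) -> None:
        """Transfer cached blocks to nonvolatile storage."""
        self.platter.update(self.cache)
        self.cache.clear()

    def power_failure_and_restart(self) -> None:
        if self.battery_backed:
            # The cache contents survive and are written once power returns.
            self.flush()
        else:
            # Unprotected cache: acknowledged writes are gone.
            self.cache.clear()

    def read(self, block: int) -> Optional[bytes]:
        return self.cache.get(block, self.platter.get(block))


unprotected = CachedDisk(battery_backed=False)
unprotected.write(7, b"log record")
unprotected.power_failure_and_restart()

protected = CachedDisk(battery_backed=True)
protected.write(7, b"log record")
protected.power_failure_and_restart()
```

In the unprotected case the program was told the write had
completed, yet block 7 is missing after the restart. In the
protected case the block is present.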
Possible Consequences
Whenever devices like RAID disk arrays,
caching controllers, semiconductor disks, or disk drives
with integrated write caching are used, a system or power
failure can cause significant data loss if the cache is not
protected. In many
cases, the only means of recovery will be to restore the
database from the most recent backup. But this does
not have to happen to you.
If you are considering using one of these exotic storage
devices, especially if you are putting a recovery log on
them, you should satisfy yourself that the storage device is
reliable. The hardware or operating system vendor must
convince you of this.
The Progress database does not detect when such devices
are being used. They appear to the operating system and to
Progress as ordinary disk devices with no special
properties. Because of the large
variety of devices available and the number of vendors,
Progress does not certify its products with specific storage
devices. Progress does occasionally work with vendors to
understand how their devices operate and explain the
operational requirements of the Progress database as part of
the normal product development process.
Some Experiences
In early 1997, Progress assisted in work
performed by Data General Corporation to demonstrate the
correct crash and power failure recovery operation of a
system that used a Clariion Disk Array equipped with write
cache and battery backup options while running a
Progress-based workload. Progress considers the Clariion
technology to be a good example of a RAID storage device
available in the marketplace. That specific effort verified
that database writes that were in the write cache when
failures occurred were completed successfully once the
storage device was provided full power. The demonstration
showed that the Clariion Disk Array was able to survive the
failure of a single disk drive, and further, showed that
even when the storage device was partially disassembled and
reassembled, the array was able to complete writes as long
as the write cache remained intact. For that specific
demonstration, the Clariion Disk Array was reliable and
provided a reasonable way to offset the write penalty
associated with using advanced RAID technologies.
Another example is the storage subsystems produced by
EMC. Progress has participated in several successful
benchmarks and tests of systems that included EMC Symmetrix
storage subsystems. Several Progress customers are using EMC
systems successfully.
Recommendations
You should not place database data,
especially transaction logs, on storage devices that are not
adequately protected from inevitable software, hardware, and
power failures. Storage devices that contain write caches
are especially susceptible to data loss from power failure
unless the manufacturer has taken measures to prevent this.
Progress Software Corporation does not currently certify any
specific storage devices to be used with the Progress
database. However, some storage system vendors have
demonstrated that their products are safe and effective.
If you use exotic storage devices to increase performance
or reliability of your systems, you should work with the
device vendor to convince yourself that the device, like any
other component of the system, will provide the level of
reliability you need. You should buy only from reputable
vendors who can demonstrate that their products meet your
reliability objectives. High performance is worth nothing if
reliability is low.
Do not try to achieve reliability by using only a UPS
(uninterruptible power supply) for your entire computer
system. While this is helpful, it is insufficient. A UPS
will not prevent data loss when a disk controller must be
removed from a computer.
If you feel unable to make a correct decision, hire
someone to assist you.
Additional Information
You can find more information on RAID storage
subsystems in the article entitled "Raiders of The Lost
Disk".
You can find more information on Progress's crash
recovery mechanisms and disk I/O in the monograph entitled
"Transaction Logging Concepts".