HIGH PERFORMANCE I/O

Disk Hierarchy

  • The DCD Project: Disk Caching Disk (DCD) is a novel disk storage architecture developed at HPCL to optimize I/O performance. The main idea of DCD is to use a small log disk, referred to as the cache-disk, as a secondary disk cache to optimize write performance. While the cache-disk and the normal data disk have the same physical properties, the access speed of the former differs dramatically from that of the latter because of different data units and different ways in which data are accessed. Our objective is to exploit this speed difference by using the log disk as a cache to build a reliable and smooth disk hierarchy. The log disk caches disk writes reliably and efficiently because seek time and rotation time are eliminated for most operations on the log disk. A small RAM buffer collects small write requests to form a log, which is transferred onto the cache-disk whenever the cache-disk is idle. Because of the temporal locality present in office/engineering workload environments, the DCD system shows write performance close to that of a RAM (i.e., solid-state disk) of the same size, at the cost of a disk. Moreover, the cache-disk can also be implemented as a logical disk, in which case a small portion of the normal data disk is used as the log disk. Trace-driven simulation experiments were carried out to evaluate the performance of the proposed disk architecture. Under the office/engineering workload environment, the DCD shows superb disk performance for writes compared to existing disk systems. Performance improvements of up to two orders of magnitude are observed in terms of average response time for write operations. Furthermore, DCD is very reliable and works at the device or device-driver level. As a result, it can be applied directly to current file systems without changing the operating system.
DCD is a new hard disk drive structure that improves the write performance of current state-of-the-art disks by 200 times (yes, 200 times, not 200%!). The U.S. Patent and Trademark Office approved the patent for this technology, No. 5,754,888, in September 1997.
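The DCD write path described above can be sketched in a few lines. This is an illustrative model only, with hypothetical names (the class and its methods are not from the HPCL implementation): small writes are absorbed by a RAM buffer, flushed to the cache-disk as one sequential log segment, and later destaged to their home locations on the data disk.

```python
# Minimal sketch of the DCD write path (hypothetical names, not the HPCL code).
# Small writes accumulate in a RAM buffer; when the cache-disk is idle (or the
# buffer fills), the buffer is written as one sequential log segment, avoiding
# the per-write seek and rotation delays that dominate small random writes.

class DiskCachingDisk:
    def __init__(self, ram_buffer_slots=8):
        self.ram_buffer = []                 # pending small writes: (block, data)
        self.ram_buffer_slots = ram_buffer_slots
        self.cache_disk_log = []             # sequential log segments on the cache-disk
        self.data_disk = {}                  # block -> data, the normal data disk

    def write(self, block, data):
        """Fast path: absorb the small write into RAM."""
        self.ram_buffer.append((block, data))
        if len(self.ram_buffer) >= self.ram_buffer_slots:
            self.flush_log()                 # force a log write when RAM is full

    def flush_log(self):
        """Called when the cache-disk is idle: one large sequential log write."""
        if self.ram_buffer:
            self.cache_disk_log.append(list(self.ram_buffer))
            self.ram_buffer.clear()

    def destage(self):
        """Background: copy logged blocks to their home locations on the data disk."""
        for segment in self.cache_disk_log:
            for block, data in segment:
                self.data_disk[block] = data
        self.cache_disk_log.clear()

    def read(self, block):
        """Look up the newest copy: RAM buffer, then log, then data disk."""
        for b, d in reversed(self.ram_buffer):
            if b == block:
                return d
        for segment in reversed(self.cache_disk_log):
            for b, d in reversed(segment):
                if b == block:
                    return d
        return self.data_disk.get(block)
```

The host sees RAM-speed writes because acknowledgment happens as soon as data lands in the buffer; the cache-disk turns many small random writes into one sequential transfer.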
  • RAPID-Cache: Modern RAID systems make extensive use of non-volatile RAM (NVRAM) write caches to mask the effects of the small-write penalty. A single-copy NVRAM cache creates a single point of failure in a highly reliable RAID system, while a dual-copy NVRAM cache is very expensive because of the high cost of NVRAM. This work presents a new cache architecture for RAID systems called RAPID-Cache (Redundant, Asymmetrically Parallel, and Inexpensive Disk Cache). A typical RAPID-Cache consists of two redundant write buffers on top of a RAID. One of the buffers is a primary cache made of RAM or NVRAM, and the other is a backup cache containing a two-level hierarchy: a small NVRAM buffer on top of a log disk. The small NVRAM buffer combines small writes and transfers them to the log disk in large units. By exploiting the locality of I/O accesses and taking advantage of Log-structured File Systems, the backup cache achieves write performance nearly equivalent to that of the primary RAM cache. The read performance of the backup cache is less critical because normal reads are served through the primary RAM cache; reads from the backup cache happen only during error recovery. The RAPID-Cache is thus an asymmetric architecture, with a fast-write-fast-read RAM as the primary cache and a fast-write-slow-read NVRAM-disk hierarchy as the backup cache. Such an asymmetric cache allows cost-effective designs for very large write caches that completely mask the small-write penalty in high-end RAID systems, which would otherwise have to use costly dual-copy NVRAM caches. It also makes reliable write caching feasible for low-end RAID systems, since the RAPID-Cache uses inexpensive disks to perform reliable caching. Four different configurations of the RAPID-Cache are studied in detail by means of simulations and analytical models.
Our results show that the RAPID-Cache has significant reliability/cost advantages over conventional single NVRAM write caches and great cost advantages over dual-copy NVRAM caches. The RAPID-Cache architecture opens a new dimension for RAID system designers to exercise trade-offs among performance, reliability, and cost. RAPID-Cache is a new disk cache structure for RAID systems: it reduces system cost while increasing the throughput and reliability of RAID systems. U.S. Patent and Trademark Office, No. 6,243,795, June 5, 2001.
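The asymmetric redundancy described above can be sketched as follows. This is a hypothetical illustration (class and method names are not from the paper): every write lands in both a fast-read primary RAM cache and a backup path that batches writes through a small NVRAM buffer into a log disk; reads are served from the primary, and the backup is consulted only to rebuild state after a primary failure.

```python
# Illustrative sketch of RAPID-Cache's asymmetric dual-copy write caching
# (hypothetical names, not the authors' implementation).

class RapidCache:
    def __init__(self, nvram_slots=4):
        self.primary = {}           # RAM cache: block -> data (fast read, fast write)
        self.nvram = []             # small NVRAM buffer of (block, data) pairs
        self.nvram_slots = nvram_slots
        self.log_disk = []          # large sequential log segments (slow read)

    def write(self, block, data):
        self.primary[block] = data          # copy 1: primary cache
        self.nvram.append((block, data))    # copy 2: backup hierarchy
        if len(self.nvram) >= self.nvram_slots:
            # One large sequential log write replaces many small NVRAM writes.
            self.log_disk.append(list(self.nvram))
            self.nvram.clear()

    def read(self, block):
        return self.primary.get(block)      # normal reads never touch the backup

    def recover(self):
        """After losing the primary RAM, rebuild it from the log disk + NVRAM."""
        rebuilt = {}
        for segment in self.log_disk:       # oldest data first
            for b, d in segment:
                rebuilt[b] = d
        for b, d in self.nvram:             # NVRAM holds the newest updates
            rebuilt[b] = d
        self.primary = rebuilt
        return rebuilt
```

The design point this illustrates: the backup copy only ever needs to be written fast and read rarely, so a cheap NVRAM-plus-disk hierarchy can stand in for a second full-size NVRAM cache.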
  • STICS (SCSI-To-IP Cache Storage): STICS is a novel storage architecture that couples reliable, high-speed data caching with low-overhead conversion between the SCSI and IP protocols. A STICS block consists of one or several storage devices, such as disks or a RAID, and an intelligent processing unit with a CPU and RAM. The storage devices cache and store data, while the intelligent processing unit carries out the caching algorithm, protocol conversion, and self-management functions. Through its efficient caching algorithm and the localization of certain unnecessary protocol overheads, STICS can significantly improve performance, reliability, and scalability over current iSCSI systems. Furthermore, STICS can be used as a basic plug-and-play building block for data storage over IP. Analogous to the “cache memory” invented several decades ago to bridge the speed gap between CPU and memory, STICS is the first-ever “cache storage” for bridging the gap between SCSI and IP, making it possible to build an efficient SAN over IP. We have implemented a software STICS prototype on the Linux operating system. Numerical results using the popular PostMark benchmark and an EMC trace show a dramatic performance gain over the current iSCSI implementation.
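The core idea of localizing protocol overhead can be sketched as below. This is a simplified, hypothetical model (names and structure are illustrative, not the prototype's code): the STICS block acknowledges SCSI writes from its local cache immediately and ships batched data over IP later, so per-request network round trips are amortized across many host requests.

```python
# Hypothetical sketch of the STICS idea (not the Linux prototype's code):
# a local cache absorbs SCSI requests and batches transfers over IP, hiding
# per-request protocol overhead from the host.

class SticsBlock:
    def __init__(self, batch_size=3):
        self.cache = {}            # local disk/RAM cache: lba -> data
        self.dirty = []            # writes not yet shipped over IP
        self.batch_size = batch_size
        self.remote = {}           # stands in for the iSCSI target across the network
        self.ip_round_trips = 0    # network transfers: the cost being amortized

    def scsi_write(self, lba, data):
        """Acknowledge locally; no IP round trip on the critical path."""
        self.cache[lba] = data
        self.dirty.append((lba, data))
        if len(self.dirty) >= self.batch_size:
            self._ship_batch()

    def _ship_batch(self):
        """One IP transfer carries many SCSI writes."""
        self.ip_round_trips += 1
        for lba, data in self.dirty:
            self.remote[lba] = data
        self.dirty.clear()

    def scsi_read(self, lba):
        """Serve hits from the local cache; a miss falls through to the network."""
        if lba in self.cache:
            return self.cache[lba]
        self.ip_round_trips += 1
        return self.remote.get(lba)
```

With locality in the host's access stream, most reads hit the local cache and most writes share a round trip, which is the sense in which STICS acts as a "cache storage" between SCSI and IP.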