Wednesday, May 6, 2009

Not all MAID (Massive Array of Idle Disks) is created equal

Over the past few months I have been involved in a number of conversations regarding MAID and spin down technologies. I continue to be surprised at the lack of understanding of the difference between the two, so I decided to update this post, originally published last August, to provide some insight and hopefully some wisdom on what is and what is not MAID.

The concept of MAID technology is the brainchild of a research team from the University of Colorado, who hypothesized that MAID (massive array of idle disks) would be a storage structure that delivers the density of tape with performance similar to that of disk, within a very small power envelope. [1]

SNIA definition: “MAID, a storage system comprising of a massive array of (idle) disk drives that are powered down individually or in groups when not required.” [2]

The motivation for such an architecture was to exploit relatively inexpensive SATA disk technology to create a commercially viable, enterprise class, mass storage solution that exhibits much of the access performance and data integrity of a disk array but with the economics of a tape library. The sweet spot for this technology, and where it will deliver the most benefit, is the storage and management of persistent data: data that is infrequently accessed (low IOPS) and will rarely if ever be changed, but that serves applications needing faster access to individual files than magnetic tape can deliver. High energy efficiency adds to the sweet spot.

But not all “MAID” labeled solutions are created equal. Do not confuse MAID with spin down features that are added to conventional disk array architectures.

So what are the key characteristics of a MAID solution that drive user benefit, and what are the unique design characteristics that are not found in traditional array designs?

  • Very high drive packaging density. Because a MAID architecture limits the number of drives that can spin at any one time, it allows extremely dense packaging that is not possible in conventional architectures. For example, a single COPAN frame can support up to 896 drives, while the EMC Infiniflex 10000 (aka Hulk) tops out at 300 drives and the DataDirect Networks S2A Storage Scalar at 600 drives per rack. Greater storage density delivers floor space savings, the retirement of aging technology through consolidation, and the opportunity to reallocate more expensive storage to more appropriate uses, potentially delaying an expensive purchase.
  • The number of drives that can spin at any one time is limited and will not exceed 50% of the total number of drives installed. This is the original definition as presented by the University of Colorado researchers. COPAN currently limits this number to 25%. The key is that not all drives spin at any one time, which reduces maximum power requirements, reduces heat generation (and with it the necessary cooling infrastructure), and aids in the elimination of rotational vibration issues.
  • The power available in the cabinet will not support all drives spinning at any one time. A limited power budget drives power efficiency and prevents any misguided attempt to power up all drives.
  • The component count (power supplies, power converters, fans, etc.) will be significantly lower than in traditional architectures. A reduced component count should equate to lower cost and improved system reliability. However, only actual reliability data will confirm how well theory performs when reduced to practice.
  • Access to data on drives that are powered down will take 15 seconds or more. This is longer than expected from a traditional disk subsystem but significantly less than off-board tape. Applications must be MAID aware: a request for data on a powered down LUN will experience a delayed response, and if the application is not MAID aware the delay may trigger a timeout or recovery action.
  • To meet data center, enterprise class expectations, the solution should have embedded data and device integrity checking and self healing capabilities. The value and usefulness of data live on long after its creation and initial period of activity. Corporate governance and government compliance regulations are causing data to be stored for increasingly longer periods of time, and it is this accretive process that is fueling explosive data growth. Any solution that targets the storage of long term data must be architected to ensure the integrity and availability of the data when requested. The more automated the processes, the better.
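
To make the spin limit and the access delay concrete, here is a minimal sketch of a spin budget. It is only an illustration of the characteristics listed above, not how COPAN or any other vendor implements it. The class name SpinBudget, its methods, the default 25% cap and the flat 15 second spin-up delay are all assumptions chosen to mirror the figures quoted in this post.

```python
from dataclasses import dataclass, field


@dataclass
class SpinBudget:
    """Hypothetical model: caps how many drives in a frame may spin at once."""

    total_drives: int
    max_spinning_fraction: float = 0.25  # assumption: the 25% cap quoted for COPAN
    spin_up_seconds: float = 15.0        # assumption: lower bound quoted in the post
    spinning: set = field(default_factory=set)

    @property
    def max_spinning(self) -> int:
        # Largest whole number of drives allowed to spin at the same time.
        return int(self.total_drives * self.max_spinning_fraction)

    def access(self, drive_id: int) -> float:
        """Return the extra delay (seconds) before data on drive_id is readable."""
        if not 0 <= drive_id < self.total_drives:
            raise ValueError(f"no such drive: {drive_id}")
        if drive_id in self.spinning:
            return 0.0  # already spinning, no added delay
        if len(self.spinning) >= self.max_spinning:
            # The power budget cannot support another spinning drive.
            raise RuntimeError("spin budget exhausted; idle a drive first")
        self.spinning.add(drive_id)
        return self.spin_up_seconds

    def idle(self, drive_id: int) -> None:
        """Spin a drive down, freeing one slot in the budget."""
        self.spinning.discard(drive_id)


frame = SpinBudget(total_drives=896)
print(frame.max_spinning)  # 224 drives may spin at once under a 25% cap
print(frame.access(0))     # first touch pays the spin-up delay
print(frame.access(0))     # second touch does not
```

A MAID aware application would treat the returned delay as normal behaviour and size its I/O timeout accordingly, rather than treating the wait as a failure.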

What is not a characteristic of a MAID solution is the increasingly common feature of drive spin down. This approach does not completely remove power from the drives but still delivers significant energy savings. Although drive power down has been in the SCSI command set for some time, the practical implementation of drive spin down to manage power efficiency was the result of additional innovation by the drive manufacturers, not the array vendors.

Drive manufacturers such as Hitachi introduced multiple powered down states. These were first introduced for laptops and have now been extended to the enterprise SATA drives used in enterprise storage arrays. For example, Hitachi allows a drive to be in one of four power states:

  • Level 0:
    Normal operation at 7,200 rpm with heads loaded (un-parked)
  • Level 1:
    Heads unloaded (parked, reduces wind resistance on heads)
    15% to 20% power savings
    Sub-second recovery time
  • Level 2:
    Heads unloaded
    Slows to 4,000 rpm
    35% to 45% power savings
    15 second recovery time
  • Level 3:
    Stops spinning (sleep mode; powered on)
    60% to 70% power savings
    30 to 45 second recovery time
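
The trade-off in that list (deeper states save more power but take longer to recover) can be sketched as a small lookup. This is an illustration only: the PowerState type and the deepest_state function are hypothetical names, "sub-second" is treated as a 1 second upper bound, and the level 3 recovery time uses the worst case of 45 seconds. The savings ranges are the ones quoted above.

```python
from typing import NamedTuple, Tuple


class PowerState(NamedTuple):
    level: int
    description: str
    savings_pct: Tuple[int, int]  # (low, high) power savings from the list above
    recovery_s: float             # worst-case recovery time in seconds


# Figures from the list above; 1.0 s for "sub-second" is an assumed upper bound.
STATES = [
    PowerState(0, "7,200 rpm, heads loaded", (0, 0), 0.0),
    PowerState(1, "heads unloaded", (15, 20), 1.0),
    PowerState(2, "heads unloaded, 4,000 rpm", (35, 45), 15.0),
    PowerState(3, "stopped spinning, powered on", (60, 70), 45.0),
]


def deepest_state(max_wait_s: float) -> PowerState:
    """Pick the deepest power state whose recovery time fits the tolerated wait."""
    if max_wait_s < 0:
        raise ValueError("max_wait_s must be non-negative")
    candidates = [s for s in STATES if s.recovery_s <= max_wait_s]
    # Level 0 always qualifies, so candidates is never empty.
    return max(candidates, key=lambda s: s.level)


for wait in (0.5, 20, 60):
    state = deepest_state(wait)
    print(wait, "->", "level", state.level, state.savings_pct)
```

An application that can tolerate a 20 second wait would land on level 2 in this sketch, while one that cannot wait even a second stays at level 0 and saves nothing.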

Seagate has a SATA drive that simply allows the drive to be powered off (level 3), and Western Digital has a SATA drive, the so-called “Green Drive”, that spins slower (5,400 rpm) and can also park its heads (level 1).

To the purist, spin down is a compromise approach that enables vendors to add this feature to existing array architectures. It is better classified as a power efficiency feature that will deliver 25% to 60% of the power savings possible from a MAID implementation. Companies that offer a spin down option include Fujitsu, HDS, NEC, DataDirect Networks, Xiotech, Xyratex, Nexsan, Greenbytes and EMC, with NetApp, Pillar Data and others promising a future deliverable.

A broader discussion of MAID and its relevance in the data center is presented in the white paper “Defining MAID - (Massive Array of Idle disk)”. This paper can be accessed on my web site.

[1] The Case for Massive Array of Idle Disks (MAID): Dennis Colarelli, Dirk Grunwald and Michael Neufeld, Dept of Computer Science, University of Colorado, Boulder. January 7th, 2002.
[2] The Dictionary of Storage Networking Technology, Storage Networking Industry Association (SNIA), 2005/2006.
[3] Persistent Data Storage Architecture: COPAN Systems, September, 2006.