Monday, April 21, 2014

Storage: Spares and Parity in large disk collections

What approaches are available to deal with spare drives and RAID parity for 300-1,000 drives in a single box?
Will existing models scale well?
Do other technologies fill any gaps?
There are three main variants for holding RAID parity and four variants for spare drives in a Protected Data solution. JBOD solutions with no Data Protection are outside the scope of this piece.

The meta-solution of dual- or triple-stores needs no spares and no provision for failed drives, just per-drive error correction.

There are three general architectures used here as the context for organising large sets of disks:
  • single controller, one or more RAID groups (a 'backblaze' capacity-optimised configuration)
  • mid-scale, single main controller, multiple (single-ported) embedded RAID controllers with internal access fabric, non-switching
  • high-end, dual main controllers, multiple dual-ported embedded RAID controllers, switching access fabric.
Spare drives & RAID parity can be avoided by using RAID 1, either at the drive or RAID-group level.
RAID 1 allows multiple replicas. For some applications, higher streaming throughput and IO/sec can be supported by using large counts of replicas, up to the entire drive set. For these arrangements, the mapping of logical to physical blocks can be varied per drive, placing a different set of blocks in the outer ring of each drive, where access times are better ('short-stroking'). The block scheduler can then preferentially direct reads of those blocks to those drives.
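The read-steering idea can be sketched in a few lines. This is only an illustration under assumptions: the `Replica` class, its `fast_ranges` field and the `pick_replica` function are hypothetical names, and a real scheduler would also weigh head position and request merging.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """One RAID 1 copy. fast_ranges lists the logical block ranges
    (start inclusive, end exclusive) this drive keeps in its outer zone."""
    drive_id: int
    fast_ranges: list

    def is_fast(self, block):
        return any(start <= block < end for start, end in self.fast_ranges)


def pick_replica(replicas, block, queue_depth):
    """Choose the replica to read `block` from.

    Preference order (an assumption, not a measured policy):
    1. a drive that holds the block in its outer zone,
    2. the drive with the shortest outstanding queue,
    3. the lowest drive id, to make the choice deterministic.
    """
    return min(
        replicas,
        key=lambda r: (not r.is_fast(block),
                       queue_depth.get(r.drive_id, 0),
                       r.drive_id),
    )


# Two replicas, each short-stroking a different half of the address space.
replicas = [Replica(0, [(0, 100)]), Replica(1, [(100, 200)])]
chosen = pick_replica(replicas, block=150, queue_depth={0: 0, 1: 2})
print(chosen.drive_id)
```

With more replicas, each drive can short-stroke a smaller slice of the address space, which is where the gain from large replica counts would come from.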

Drive to embedded RAID controller mapping.
Between 4 and 12 2.5" drives are mounted along a carrier, or 'sled'. Carriers may be single or double sided.
  • There may be a 1:1 or 1:M connection of carriers to an embedded RAID controller, allowing the RAID controller chip to be mounted on the carrier with the drives and share a common power supply. This is the simplest arrangement electrically, with fewest removable connectors, or
  • orthogonal mounting of drives and RAID controllers, allowing larger RAID groups and reducing count of secondary controllers:
    • the 'Nth' drive of each carrier in a set is connected to the same RAID controller
    • A single carrier can be removed, or suffer a common-mode failure, and the RAID controller will compensate by reconstructing the missing data from parity.
    • This requires many removable connectors, increasing sources of failure & errors.
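The difference between the two mappings can be shown with a small sketch. The function names and the 8-carrier, 6-drive layout are illustrative assumptions; drives are identified by a hypothetical (carrier, slot) pair.

```python
def direct_mapping(n_carriers, drives_per_carrier):
    """1:1 mapping: each carrier's drives form one RAID group."""
    return {c: [(c, s) for s in range(drives_per_carrier)]
            for c in range(n_carriers)}


def orthogonal_mapping(n_carriers, drives_per_carrier):
    """Orthogonal mapping: slot N of every carrier forms RAID group N."""
    return {s: [(c, s) for c in range(n_carriers)]
            for s in range(drives_per_carrier)}


def drives_lost_per_group(mapping, failed_carrier):
    """Count how many drives each RAID group loses when one carrier is pulled."""
    return {group: sum(1 for carrier, _ in drives if carrier == failed_carrier)
            for group, drives in mapping.items()}


# Pull carrier 3 out of an 8-carrier box with 6 drives per carrier.
print(drives_lost_per_group(direct_mapping(8, 6), failed_carrier=3))
print(drives_lost_per_group(orthogonal_mapping(8, 6), failed_carrier=3))
```

In the direct case one group loses all six of its drives and the others lose none. In the orthogonal case every group loses exactly one drive, which a single-parity group can tolerate.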
Spare drives can be:
  • No hot spares. "Break/Fix" replacement. Increased chance of Data Loss events. [need to quantify]
  • 1 or more global hot-spares managed by main controller.
  • 1 hot-spare per embedded RAID controller
  • complete RAID-set spares.
    • When a single drive fails in a RAID group, all data is streamed to a new, unused RAID set.
    • The old RAID-group can be rebuilt as a single smaller RAID-group without the failed drive.
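One rough way to start comparing the spare policies is to estimate the chance of a further drive failure in a RAID group while it is running degraded. The sketch below assumes independent failures at a constant rate (an exponential model), which real drive populations do not strictly follow. The function name and all parameter values are illustrative assumptions, not measurements.

```python
import math


def p_further_failure(surviving_drives, window_hours, mtbf_hours):
    """Probability that at least one of the surviving drives in a group
    fails during the exposure window, under an exponential failure model."""
    rate_per_hour = surviving_drives / mtbf_hours
    return 1.0 - math.exp(-rate_per_hour * window_hours)


# Hypothetical comparison for a 12-drive group that has lost one drive:
# a hot spare limits the window to the rebuild time, while break/fix
# adds the wait for a replacement drive to arrive and be fitted.
MTBF_HOURS = 1_000_000        # assumed figure, for illustration only
hot_spare = p_further_failure(11, window_hours=12, mtbf_hours=MTBF_HOURS)
break_fix = p_further_failure(11, window_hours=12 + 72, mtbf_hours=MTBF_HOURS)
print(f"hot spare: {hot_spare:.2e}  break/fix: {break_fix:.2e}")
```

For a single-parity group, a further failure inside the window is a Data Loss event, so the ratio between the two results gives a first-order feel for what "Break/Fix" costs. Correlated failures and unrecoverable read errors during rebuild are ignored here and would raise both figures.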
RAID Parity:
  • RAID parity can be stored entirely within a RAID-group, managed by a single controller.
    • With an orthogonal drive/controller mapping, larger RAID-groups are possible.
  • RAID parity can be stored across controllers in large RAID groups, requiring inter-controller traffic for all operations, or even
    • One subset of this is for RAID 3/4, collecting all parity devices onto specially allocated
  • RAID parity can be stored on ancillary devices, SSDs or even in battery-backed DRAM.
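Wherever the parity is held, single-parity schemes such as RAID 3/4/5 compute it the same way: as the XOR of the data blocks in a stripe. A minimal sketch follows, with hypothetical function names and tiny two-byte blocks standing in for real stripe units.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the one missing block of a stripe from the survivors and parity."""
    return xor_parity(list(surviving_blocks) + [parity])


stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(stripe)            # could live on a drive, an SSD or in DRAM

# Lose the middle block, then rebuild it from the other two plus parity.
rebuilt = reconstruct([stripe[0], stripe[2]], parity)
assert rebuilt == stripe[1]
```

Because the calculation only needs the surviving blocks and the parity block, the choice of where parity lives mainly changes how much traffic crosses controllers on each write and rebuild, not the arithmetic.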
