
Wednesday, June 18, 2014

RAID-1: Errors and Erasures calculations

RAID-1 Overheads (treating RAID-1 and RAID-10 as identical)

N = number of drives mirrored. N=2 for duplicated
G = number of drive-sets in a Volume Group.
N×G is the total number of drives in Volume Group.
An array may be composed of many Volume Groups.

Per-Disk:
  • Effective Capacity
    • N=2. 1÷2=50% [duplicated]
    • N=3. 1÷3=33.3% [triplicated]
  • I/O Overheads & scaling
    • Capacity Scaling: linear to max disks.
    • Random Read: N×G of raw disk = N×G × single drive = RAID-0
    • Random Write: 1×G of raw disk = 100% of single drive per set
    • Streaming Read: N×G of raw disk = N×G × single drive = RAID-0
    • Streaming Write: 1×G of raw disk = 100% of single drive per set
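The scaling rules above can be captured in a small helper. This is a sketch of my own (the function name and its "units of a single drive's throughput" convention are my assumptions, not from any RAID tooling):

```python
# Sketch of the RAID-1 scaling rules above.
# N = mirrors per drive-set, G = drive-sets in the Volume Group.

def raid1_scaling(N, G, drive_iops=1.0):
    """Effective capacity fraction and I/O scaling,
    in units of a single drive's throughput."""
    return {
        "capacity_fraction": 1.0 / N,       # only 1/N of raw capacity is usable
        "random_read": N * G * drive_iops,  # all N*G spindles serve reads (RAID-0-like)
        "random_write": G * drive_iops,     # each write hits all N mirrors in a set
        "streaming_read": N * G * drive_iops,
        "streaming_write": G * drive_iops,
    }

s = raid1_scaling(N=2, G=12)
print(s["capacity_fraction"])  # 0.5  -> 50% usable, duplicated
print(s["random_read"])        # 24.0 -> 24x single-drive read throughput
print(s["random_write"])       # 12.0 -> 12x single-drive write throughput
```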

RAID Array Overheads:
  • Read: Nil. 100% of available bandwidth of N×G drives, same as RAID-0
  • Write: single-drive performance per replica (50%, for N=2)
    • Total RAID bandwidth increases linearly with scale.
  • CPU & RAM: Low: buffering, scheduling, block address calculation and error handling.
    • Zero Parity calculation.
  • Cache needed: zero or small
  • Impact of Caching: 
    • Random I/O:
      • Nil for low locality/readback of blocks
      • High impact for high locality/readback of writes
      • High when coalescing scattered Random I/O into streams
    • Streaming write: 50% total available bandwidth (N=2)
Tolerance to failures and errors
  • For N=2
    • Error recovery:
      • concurrently, 'read duplicate': 1/2 revolution + seek time to block on alt. drive
      • re-read same drive: 1 revolution, no seek, for a "soft error"
        • the number of re-reads needed distinguishes a "soft" from a "hard" error
      • On "hard error", mark block 'bad' and map to a spare block,
        • write block copy 
    • failure of second drive in a matched pair = Data Loss Event
    • Up to (N×G)÷2 drive failures (one per mirrored pair) possible without Data Loss
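As a sanity check on that tolerance claim, here is a quick combinatorial sketch of my own (the helper name is hypothetical): the chance that two simultaneous failures in a Volume Group of G mirrored pairs (N=2) land in the same pair, which is exactly the Data Loss case.

```python
from math import comb

def p_both_failures_same_pair(G):
    """Probability two simultaneous failures among 2G drives hit the
    SAME mirrored pair. Fatal outcomes: G pairs; all outcomes: C(2G, 2).
    Simplifies to 1 / (2G - 1)."""
    return G / comb(2 * G, 2)

print(p_both_failures_same_pair(12))  # ~0.043: about 4.3% for G = 12
```

For any other placement of the two failures, each pair still holds one good copy and the data survives.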

Read Error correction cost
TBC

RAID-1 Rebuild

Performance Impact of Disk Failure
  • N=2,  nominal write IO/sec unaffected
    • read IO/sec: for the failed pair's 1/G'th of the RAID address space, 50% of nominal throughput
      • For G = 12, total read bandwidth reduces from N×G = 2×12 = 24 × single-drive throughput to (N×G)−1 = 23, a ≈4.2% reduction.
  • N=3, nominal write IO/sec unaffected
    • read IO/sec: for that drive-set's 1/G'th of the RAID address space, 66.6% of nominal throughput.
      • For G = 12, total read bandwidth reduces from 3×12 = 36 to 35, or ≈2.78%.
Performance Impact of RAID Rebuild
  • N=2,
    • streaming copy of primary drive to spare,
      • streaming copy interrupted, on average, by host I/O every Nth access
      • time to rebuild affected by Array Utilisation
      • For G = 12, 8.3% reduction in read throughput, write throughput same.
    • or for distributed spares, streaming reads spread across (N-1) drives, limited by streaming throughput of destination drive
  • N=3,
    • streaming copy of 2nd drive to spare
      • rebuild time consistent and minimum possible
      • reads for that drive-set's 1/G'th of the RAID address space run at ~33% of nominal (one of three drives left serving reads)
        • For G = 12, a 5.5% reduction in total read throughput; write throughput unchanged.
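The degraded-read and rebuild percentages above all fall out of one fraction. A small sketch of my own, assuming reads are spread evenly over all N×G spindles and each failed or rebuild-busy drive simply removes its share:

```python
def degraded_read_fraction(N, G, drives_lost=1):
    """Fraction of total read bandwidth lost when `drives_lost`
    spindles (failed or busy rebuilding) stop serving reads."""
    return drives_lost / (N * G)

# Single failed drive: N=2, G=12 -> 24 spindles down to 23
print(round(degraded_read_fraction(2, 12) * 100, 1))                 # 4.2
# Single failed drive: N=3, G=12 -> 36 down to 35
print(round(degraded_read_fraction(3, 12) * 100, 2))                 # 2.78
# Rebuild, N=2: failed drive plus its mirror busy streaming -> 2 of 24
print(round(degraded_read_fraction(2, 12, drives_lost=2) * 100, 1))  # 8.3
# Rebuild, N=3: failed drive plus copy source -> 2 of 36 (~5.5% above)
print(round(degraded_read_fraction(3, 12, drives_lost=2) * 100, 2))  # 5.56
```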


RAID-1 Failures (Erasures)

A 3% AFR (250,000hr MTBF) for 5,000 drives gives 150 failed drives per year, or 3 per week, approx. 1 every 58 hours.

A 2TB drive (16 Tbit, or 1.6×10¹³ bits) at a sustained 1Gbps will take a minimum of ~4.5 hours to scan.
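That scan-time figure is easy to reproduce (assuming 2 TB = 2×10¹² bytes and a flat 1 Gbit/s sustained rate):

```python
# Minimum time to scan one 2 TB drive at a sustained 1 Gbit/s.
drive_bits = 2e12 * 8          # 2 TB = 1.6e13 bits
rate_bps = 1e9                 # 1 Gbit/s
scan_hours = drive_bits / rate_bps / 3600
print(round(scan_hours, 2))    # 4.44 -> "a minimum of ~4.5 hours"
```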

For the benchmark configuration, RAID-1 with two drives, we'll have a whole-drive Data Loss event if the source drive fails during the rebuild (via copy) of a failed drive to a spare.

What is the probability of a second drive failure within that time?
P_fail2nd = 4.5 hrs ÷ 250,000 hrs = 0.000018 = 1.8×10⁻⁵
Alternatively, how often will a drive rebuild in a single array fail due to a second drive failure?
N_fail2nd = 1 ÷ P_fail2nd = 250,000 hrs ÷ 4.5 hrs ≈ 1 in 55,555 events
At 150 events/year, this translates to:
Y_fail2nd = N_fail2nd ÷ 150 = 55,555 events ÷ 150 events/year ≈ once in 370 years
If you're an individual owner, that risk is probably acceptable. If you're a vendor with 100,000 units in the field, is that an acceptable rate?
YAgg_fail2nd = Units ÷ Y_fail2nd = 100,000 ÷ 370 ≈ 270 dual-failures per year

Array Vendors may not be happy with that level of customer data-loss. They can engineer their product to default to using triplicated RAID-1. The Storage Industry has a problem as no standard naming scheme exists to differentiate dual, triple or more mirrors.
P_fail3rd = P_fail2nd × 4.5 hrs ÷ 250,000 hrs = (1.8×10⁻⁵)² = 3.24×10⁻¹⁰
N_fail3rd = 1 ÷ P_fail3rd = 1 ÷ 3.24×10⁻¹⁰ ≈ 1 in 3.09×10⁹ events
Y_fail3rd = N_fail3rd ÷ 150 = 3.09×10⁹ events ÷ 150 events/year ≈ once in 20.6 million (2.06×10⁷) years
YAgg_fail3rd = Units ÷ Y_fail3rd = 100,000 ÷ 20,600,000 ≈ 0.00486 triple-failures per year ≈ one triple-failure per ~206 years
Or, per 100,000 units of 5,000 drives each, one triple-failure in ~206 years with triplicated RAID-1.
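The dual- and triple-failure arithmetic above, run end to end as a back-of-envelope sketch using the post's own figures:

```python
# Inputs from the benchmark configuration above.
MTBF_HRS = 250_000     # ~3% AFR
REBUILD_HR = 4.5       # window during which a 2nd failure loses data
EVENTS_YR = 150        # failed drives/year in a 5,000-drive population
UNITS = 100_000        # fielded arrays

p2 = REBUILD_HR / MTBF_HRS     # 1.8e-5: 2nd failure during a rebuild
y2 = (1 / p2) / EVENTS_YR      # ~370 years between dual-failure losses, per unit
print(round(UNITS / y2))       # ~270 dual-failure data losses/year fleet-wide

p3 = p2 * p2                   # 3.24e-10: a 3rd failure too (triplicated RAID-1)
y3 = (1 / p3) / EVENTS_YR      # ~20.6 million years per unit
print(round(UNITS / y3, 4))    # ~0.0049 triple-failures/year fleet-wide
```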




RAID-1 Errors

TBC
