**RAID-1 Overheads** (treating RAID-1 and RAID-10 as identical)

N = number of drives mirrored (replicas per drive-set). N=2 for duplicated.

G = number of drive-sets in a Volume Group.

\(N \times G\) is the total number of drives in the Volume Group.

An array may be composed of many Volume Groups.

*Per-Disk:*

- Effective Capacity
  - N=2: \(1 \div 2 = 50\%\) [duplicated]
  - N=3: \(1 \div 3 = 33.3\%\) [triplicated]
- I/O Overheads & scaling
  - Capacity Scaling: linear, up to *max disks*.
  - Random *Read*: \(N \times G\) of raw disk = \(N \times G\) times single-drive = RAID-0
  - Random *Write*: \(1 \times G\) of raw disk = 100% of single-drive per drive-set
  - Streaming *Read*: \(N \times G\) of raw disk = \(N \times G\) times single-drive = RAID-0
  - Streaming *Write*: \(1 \times G\) of raw disk = 100% of single-drive per drive-set
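These scaling rules are easy to check numerically. A minimal Python sketch (the function names are mine, not from any RAID tooling):

```python
# Illustrative sketch of the RAID-1/10 scaling rules above.
# N = drives per mirror set, G = drive-sets per Volume Group.

def effective_capacity(n: int) -> float:
    """Usable fraction of raw capacity: 1/N, since every block is stored N times."""
    return 1 / n

def read_scaling(n: int, g: int) -> int:
    """Reads can be served by any replica: N x G times single-drive throughput."""
    return n * g

def write_scaling(n: int, g: int) -> int:
    """Every write must hit all N replicas, so only G single-drive write streams."""
    return g

print(effective_capacity(2))   # 0.5  (duplicated: 50%)
print(read_scaling(2, 12))     # 24   (RAID-0-like read scaling)
print(write_scaling(2, 12))    # 12   (one single-drive write rate per set)
```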

*RAID Array Overheads*:

- *Read*: Nil. 100% of available bandwidth of \(N \times G\) drives, same as RAID-0.
- *Write*: single-drive performance per replicant (50% for N=2).
- Total RAID bandwidth increases linearly with scale.
- CPU & RAM: Low: buffering, scheduling, block address calculation and error handling.
- Zero Parity calculation.
- Cache needed: zero or small
- Impact of Caching:
- Random I/O:
- Nil for low-locality workloads with little re-reading of blocks
- High for high-locality workloads that re-read recent writes
- High when coalescing scattered random I/O into streams
- Streaming write: 50% of total available bandwidth (N=2)

*Tolerance to failures and errors*

- For N=2
- Error recovery:
  - concurrent 'read duplicate': 1/2 revolution + seek time to the block on the alternate drive
  - reread the drive: 1 revolution, no seek, for a "soft error"
  - the number of rereads distinguishes a "soft" error from a "hard" error
  - on a "hard error", mark the block 'bad', map it to a spare block, and write a copy of the block
- Failure of the second drive in a matched pair = Data Loss Event
- Up to \(N \div 2\) drive failures *possible* without Data Loss

*Read Error correction cost*

TBC

**RAID-1 Rebuild**

*Performance Impact of Disk Failure*

- N=2, nominal write IO/sec unaffected
- read IO/sec: for the 1/G'th of the RAID address space on the affected drive-set, 50% of nominal throughput
- For G = 12, total read bandwidth reduces from \(N \times G = 2 \times 12 = 24\) times single-drive throughput to \((N \times G) - 1 = (2 \times 12) - 1 = 23\) times, a ~4.2% reduction in read bandwidth.
- N=3, nominal write IO/sec unaffected
- read IO/sec: for the 1/G'th of the RAID address space on the affected drive-set, 66.6% of nominal throughput.
- For G = 12, total read bandwidth reduces from \(3 \times 12 = 36\) to 35, or 2.78%.
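The degraded-mode arithmetic above reduces to a one-line formula: a single failed drive removes one drive's worth of read bandwidth from the \(N \times G\) total. A minimal Python sketch (the function name is mine, for illustration only):

```python
def degraded_read_loss(n: int, g: int) -> float:
    """Fraction of total read bandwidth lost when one drive in the array fails.

    A healthy array serves reads from all N*G drives; a single failure
    removes exactly one drive's worth of read throughput.
    """
    return 1 / (n * g)

print(round(degraded_read_loss(2, 12) * 100, 2))  # 4.17 (% for N=2, G=12)
print(round(degraded_read_loss(3, 12) * 100, 2))  # 2.78 (% for N=3, G=12)
```

Higher N or G dilutes the impact of any single failure, which is why the N=3 figure is smaller.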

*Performance Impact of RAID Rebuild*

- N=2,
- streaming copy of primary drive to spare,
- the copy stream is interrupted, on average, by every Nth host access
- time to rebuild affected by Array Utilisation
- For G = 12, 8.3% reduction in read throughput, write throughput same.
- or, for distributed spares, streaming reads are spread across (N-1) drives, limited by the streaming throughput of the destination drive
- N=3,
- streaming copy of 2nd drive to spare
- rebuild time consistent and minimum possible
- impact: read performance falls to 33% for the 1/G'th of the RAID address space on the affected drive-set
- For G = 12, 5.5% reduction in read throughput, write throughput same.
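During a simple-copy rebuild, two drives' worth of read bandwidth is effectively lost in the affected set: the failed drive itself and the surviving replica streaming the copy. A hedged sketch of that arithmetic (the function name is mine):

```python
def rebuild_read_loss(n: int, g: int) -> float:
    """Read-bandwidth loss during a simple-copy rebuild: the failed drive
    and the surviving replica streaming the copy are both unavailable
    for host reads."""
    return 2 / (n * g)

print(round(rebuild_read_loss(2, 12) * 100, 1))  # 8.3 (% for N=2, G=12)
print(round(rebuild_read_loss(3, 12) * 100, 1))  # 5.6 (%, ~5.5 in round figures, for N=3, G=12)
```

The distributed-spare case above is different: the rebuild load spreads across drives and the loss is smaller, bounded by the destination drive's streaming rate.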

**RAID-1 Failures (Erasures)**

A 3% AFR (250,000hr MTBF) for 5,000 drives gives 150 failed drives per year, or roughly 3 per week: approx. one every 58 hours.

A 2TB drive (16 Tbit, or \(1.6\times10^{13}\) bits) at a sustained 1 Gbps will take a minimum of 4.5 hours to scan.
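That scan time follows directly from capacity and sustained rate; a quick Python check (the function name is mine):

```python
def scan_hours(capacity_bytes: float, rate_bits_per_sec: float) -> float:
    """Minimum time to read an entire drive at a sustained bit rate."""
    return capacity_bytes * 8 / rate_bits_per_sec / 3600

# 2 TB = 2e12 bytes = 1.6e13 bits, at 1 Gbps:
print(round(scan_hours(2e12, 1e9), 2))  # 4.44 -> ~4.5 hours, as used above
```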

For the benchmark configuration, RAID-1 with two drives, we'll have a whole-drive Data Loss event if the source drive fails during the rebuild (via copy) of a failed drive to a spare.

What is the probability of a second drive failure within that time?

\begin{equation}

\begin{split}

P_{fail2nd}& = 4.5 hrs \div 250,000hrs\\

& = 0.000018\\

& = 1.8\times10^{-5}

\end{split}

\end{equation}

Alternatively, how often will a drive rebuild in a single array fail due to a second drive failure?

\begin{equation}

\begin{split}

N_{fail2nd}& = \frac{1}{P_{fail2nd}}\\

& = 250,000hrs \div 4.5 hrs\\

& = \rm 1\ in\ 55,555\rm\ events

\end{split}

\end{equation}

At 150 events/year, this translates to:

\begin{equation}

\begin{split}

Y_{fail2nd}& = \frac{N_{fail2nd}}{150}\\

& =\rm 55,555\ events \div 150\ events/year\\

& = \rm\ once\ in\ 370\ years

\end{split}

\end{equation}
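The chain of calculations above can be reproduced in a few lines of Python, using the same assumed MTBF, rebuild window, and fleet failure rate:

```python
MTBF_HOURS = 250_000      # assumed per-drive MTBF, from the text
REBUILD_HOURS = 4.5       # minimum rebuild (full-drive copy) window
EVENTS_PER_YEAR = 150     # failures/year across 5,000 drives at 3% AFR

p_fail2nd = REBUILD_HOURS / MTBF_HOURS   # chance the mirror partner dies mid-rebuild
n_fail2nd = 1 / p_fail2nd                # rebuilds per data-loss event
y_fail2nd = n_fail2nd / EVENTS_PER_YEAR  # years between data-loss events

print(p_fail2nd)             # 1.8e-05
print(round(n_fail2nd))      # 55556 -> ~1 in 55,555 rebuilds
print(round(y_fail2nd))      # 370   -> once in ~370 years
```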

If you're an individual owner, that risk is probably acceptable. If you're a vendor with 100,000 units in the field, is that an acceptable rate?

\begin{equation}

\begin{split}

YAgg_{fail2nd}& = \frac{Units}{Y_{fail2nd}}\\

& = 100,000 \div 370\\

& = \rm\ 270\ dual{-}failures\ per\ year

\end{split}

\end{equation}

Array vendors may not be happy with that level of customer data loss. They can engineer their products to default to triplicated RAID-1. The storage industry has a problem here: no standard naming scheme exists to differentiate dual, triple or higher-order mirrors.

\begin{equation}

\begin{split}

P_{fail3rd}& = P_{fail2nd} \times 4.5 hrs \div 250,000hrs\\

& = (0.000018)^2 =(1.8\times10^{-5})^2\\

& = 3.24\times10^{-10}

\end{split}

\end{equation}

\begin{equation}

\begin{split}

N_{fail3rd}& = \frac{1}{P_{fail3rd}}\\

& = 1 \div 3.24\times10^{-10}\\

& = \rm 1\ in\ 3,080,000,000\ (3.08\times 10^9)\ events

\end{split}

\end{equation}

\begin{equation}

\begin{split}

Y_{fail3rd}& = \frac{N_{fail3rd}}{150}\\

& = \rm 3.08\times 10^9\ events \div 150\ events/year\\

& = \rm\ once\ in\ 20,576,066\ (2.0576066\times 10^7)\ years

\end{split}

\end{equation}

\begin{equation}

\begin{split}

YAgg_{fail3rd}& = \frac{Units}{Y_{fail3rd}}\\

& = 100,000 \div 20,576,066\\

& = \rm\ 0.004860\ triple{-}failures\ per\ year\\

& = \rm\ one\ triple{-}failure\ per\ 205.76\ years

\end{split}

\end{equation}

Or, per 100,000 units of 5,000 drives, one triple-failure in 205 years with triplicated RAID-1.
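The triple-mirror arithmetic follows the same pattern, squaring the per-rebuild failure probability. A Python sketch using the text's assumptions:

```python
MTBF_HOURS = 250_000      # assumed per-drive MTBF
REBUILD_HOURS = 4.5       # full-drive copy window
EVENTS_PER_YEAR = 150     # failures/year across 5,000 drives
UNITS = 100_000           # fleet size

p2 = REBUILD_HOURS / MTBF_HOURS   # second drive fails during the rebuild
p3 = p2 ** 2                      # ...and the third, in the same window
n3 = 1 / p3                       # rebuilds per triple-failure
y3 = n3 / EVENTS_PER_YEAR         # years per triple-failure, per array
agg = UNITS / y3                  # fleet-wide triple-failures per year

print(round(y3 / 1e6, 1))  # 20.6 -> once in ~20.6 million years per array
print(round(1 / agg))      # 206  -> ~one per 206 years across the fleet
```

This assumes the second and third failures are independent and both confined to the 4.5-hour rebuild window, matching the derivation above.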

**RAID-1 Errors**

TBC
