What would a current attempt at MAID look like with 2.5" drives?
"MAID", Massive Array of Idle Disks, was an attempt by Copan Systems (bought by SGI in 2009) at near-line Bulk Storage. It had a novel design innovation, mounting drives vertically back-to-back in slide-out canisters (patented), and was based on an interesting design principle: off-line storage can mostly be powered down.
It was a credible attempt, coming out of The Internet Archive and their "Petabox". At 24 x 3.5" drives per 4RU, the Petabox holds around half the 45 drives of the Backblaze 4.0 Storage Pod. The Petabox has 10Gbps uplinks, much beefier CPUs and more DRAM.
The Xyratex ClusterStor (now Seagate) offers another benchmark: their Scalable Storage Unit (SSU) stores 3 rows of 14 drives in 2.5RU x 450mm slide-out drawers, allowing hot-plug access to all drives. Two SSUs comprise a single 5RU unit of 84 drives, with up to 14 SSUs per rack for 1176 drives per rack, an average of 28 x 3.5" drives per Rack Unit.
My "Laboratory Note Book" on a Miscellanea of Topics.
If I believe I.T. isn't a "professional discipline" and two of the missing elements are "Lab Note Books" and "Robust Critique" (as in the Academic sense of Robust Defence) - then I've got to do as I say...
Tuesday, May 27, 2014
Sunday, May 04, 2014
RAID-0 and RAID-3/4 Spares
This piece is not based on an exhaustive search of the literature. It addresses a problem that doesn't seem to have been addressed: spares for RAID-0 and the related RAID-3/4 with its single parity drive.
Single parity drives seem to have been deemed impractical early on because they apparently constitute a deliberate system bottleneck. RAID-3/4 has no bottleneck for streaming reads or writes: for streaming writes, performance equals, rather than merely approaches, the raw write performance of the array, identical to RAID-0 (striping). For random writes, the 100-150 times speed differential between sequential and random access on modern drives can be leveraged with a suitable buffer to remove the bottleneck. The larger the buffer, the more likely it is that the pre-read of old data, needed to calculate the new parity, can be skipped. This triples the array throughput by avoiding the full revolution forced by the read/write-back cycle.
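To make the pre-read point concrete, here is a minimal sketch in Python of the two ways a single XOR parity block can be updated. It is an illustration only, not code from any RAID implementation: the function names, the 4-drive stripe and the 4-byte blocks are all assumptions chosen to keep the example small.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Read-modify-write path: old data and old parity must be pre-read from disk."""
    return xor_blocks(old_parity, old_data, new_data)


def full_stripe_parity(stripe: Sequence[bytes]) -> bytes:
    """Buffered path: parity is computed from the new data alone, no pre-read."""
    return xor_blocks(*stripe)


# Toy stripe across 4 data drives (hypothetical sizes).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
          b"\xaa\xbb\xcc\xdd", b"\x00\xff\x00\xff"]
parity = full_stripe_parity(stripe)

# Overwrite the block on drive 2 via the read-modify-write path.
new_block = b"\x11\x22\x33\x44"
rmw_parity = small_write_parity(stripe[2], new_block, parity)

# The same update when the whole stripe is already in the buffer.
stripe[2] = new_block
buffered_parity = full_stripe_parity(stripe)

assert rmw_parity == buffered_parity
```

Both paths give the same parity. The difference is only in the disk traffic: the first needs two reads before it can write, the second needs none when the buffer already holds the rest of the stripe.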
Multiple copies of the parity drive (RAID-1) can be kept to mitigate the very costly failure of a parity drive: all blocks on every drive must be reread to recreate a failed parity drive. Given large RAID groups and the very low price of small drives, this is not expensive.
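A rough sketch of why a mirrored parity drive pays off, again in Python. It only counts the volume of data that has to be read to recreate one parity drive. The 1TB drive size and the group sizes are hypothetical values picked for illustration, and the function name is mine.

```python
def parity_rebuild_reads_tb(data_drives: int, drive_tb: float,
                            surviving_parity_copies: int) -> float:
    """Volume (TB) that must be read to recreate one failed parity drive."""
    if surviving_parity_copies > 0:
        # A mirror copy survives: a straight drive-to-drive copy.
        return drive_tb
    # No copy survives: every block on every data drive is reread and XORed.
    return data_drives * drive_tb


# Hypothetical group sizes with assumed 1TB drives.
for n in (64, 512):
    without_mirror = parity_rebuild_reads_tb(n, 1.0, 0)
    with_mirror = parity_rebuild_reads_tb(n, 1.0, 1)
    print(f"N={n}: {without_mirror:.0f}TB read without a mirror, "
          f"{with_mirror:.0f}TB with one")
```

The read volume without a mirror grows linearly with the group size, while the cost of the mirror stays at one extra drive.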
With the availability of affordable, large SSDs, naive management of a single parity drive also removes the bottleneck for quite large RAID groups. The SSD can be backed by a log-structured recovery drive, trading on-line random IO performance for rebuild time.
Designing Local and/or Global Spares for large (N=64..512) RAID sets is necessary to reduce overhead, improve reconstruction times and avoid unnecessary partitioning, which limits recovery options and causes avoidable data loss events.
Saturday, May 03, 2014
Comparing consumer drives in small-systems RAID
This stems from an email conversation with a friend: why would he be interested in using 2.5" drives in RAID, not 3.5"?
There are two key questions for Admins at this scale, and my friend was exceedingly sceptical of my suggestion:
- Cost/GB
- 'performance' of 2.5" 5400RPM drives vs 7200RPM drives.
Retail Disk Prices, as printed
Table of current retail prices for various types of disk with cost-per-GB.
Disclaimer: This table is for my own point-in-time reference, does not carry any implicit or explicit recommendations or endorsement for the retailer, vendor or technologies.
Retail disk prices, sorted.
Table of current retail prices for various types of disk, sorted on cost-per-GB.
Disclaimer: This table is for my own point-in-time reference, does not carry any implicit or explicit recommendations or endorsement for the retailer, vendor or technologies.
3.5" Internal drives are the cheapest $/GB, ranging from 4.3 cents/GB to 10-11 cents/GB. Generally, larger drives have cheaper $/GB. Higher spec drives, suitable for high duty-cycle applications, are more expensive. This retailer doesn't sell 10K or SAS drives.
It's not possible to track 3.5" drives from Internal to External to arrive at a cost of packaging.
2.5" Internal drives range 8 to 16.5 cents/GB, generally higher than 3.5" drive costs. There seems to be little extra cost of packaging for external drives. There is a small premium in consumer drives for 7200RPM. This retailer only sells 2TB drives (15mm vs 9.5mm?) as external drives.
There was no information in the retailer's rather compact format on the thickness (5mm, 7mm, 9.5mm, 12.5mm, 15mm) of 2.5" drives.
Solid State Disks are 5+ times more expensive than Hard Disk Drives, at 59 cents/GB to $1.37/GB.
The smaller mSATA drives start at 72.8 cents/GB.
No supplier information on SSD specs is included: SLC/MLC, transfer rates, IO/sec and number of write cycles. SSDs are very sensitive to wear and device selection requires very careful reading of device specifications.
Prices as at 01-May-2014, from http://www.msy.com.au/Parts/PARTS.pdf

F-Fac | Typ | Dsk | Conn | RPM | Brand/Model | Cap | Cost | $/GB | GB |
---|---|---|---|---|---|---|---|---|---|
3.5" | Int | HDD | SATA3 | 7200? | WD Green EZRX | 3TB | 129 | 0.0430 | 3000GB |
3.5" | Int | HDD | SATA3 | 7200? | Seagate | 3TB | 129 | 0.0430 | 3000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Green EZRX | 4TB | 185 | 0.0462 | 4000GB |
3.5" | Int | HDD | SATA3 | 7200? | Seagate | 4TB | 189 | 0.0473 | 4000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Green EZRX | 2TB | 95 | 0.0475 | 2000GB |
3.5" | Int | HDD | SATA3 | 7200? | Seagate | 2TB | 95 | 0.0475 | 2000GB |
3.5" | Int | HDD | SATA3 | 7200? | Seagate NAS | 3TB | 160 | 0.0533 | 3000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Red NAS EFRX | 3TB | 165 | 0.0550 | 3000GB |
3.5" | Int | HDD | SATA3 | 7200? | Seagate NAS | 4TB | 229 | 0.0573 | 4000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Red NAS EFRX | 4TB | 235 | 0.0587 | 4000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Purple PURX Surveillance | 3TB | 179 | 0.0597 | 3000GB |
3.5" | Int | HDD | SATA? | 7200? | Hitachi HGST NAS | 3TB | 179 | 0.0597 | 3000GB |
3.5" | Int | HDD | SATA? | 7200? | Hitachi HGST NAS | 4TB | 249 | 0.0622 | 4000GB |
3.5" | Int | HDD | SATA3 | 7200? | Seagate NAS | 2TB | 125 | 0.0625 | 2000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Green EZRX | 1TB | 64 | 0.0640 | 1000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Red NAS EFRX | 2TB | 129 | 0.0645 | 2000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Purple PURX Surveillance | 4TB | 259 | 0.0648 | 4000GB |
3.5" | Int | HDD | SATA3 | 7200? | Seagate | 1TB | 65 | 0.0650 | 1000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Purple PURX Surveillance | 2TB | 135 | 0.0675 | 2000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Red NAS EFRX | 1TB | 89 | 0.0890 | 1000GB |
3.5" | Int | HDD | SATA2 | 7200? | Hitachi HGST UltraStar | 1TB | 89 | 0.0890 | 1000GB |
3.5" | Int | HDD | SATA3 | 7200? | Hitachi HGST | 3TB | 270 | 0.0900 | 3000GB |
3.5" | Int | HDD | SATA3 | 7200? | Hitachi HGST | 4TB | 365 | 0.0912 | 4000GB |
3.5" | Int | HDD | SATA3 | 7200? | Hitachi HGST | 2TB | 185 | 0.0925 | 2000GB |
3.5" | Int | HDD | SATA3 | 7200? | WD Purple PURX Surveillance | 1TB | 95 | 0.0950 | 1000GB |
3.5" | Int | HDD | SATA3 | 7200? | Seagate | 500G | 55 | 0.1100 | 500GB |
3.5" | Ext | HDD | USB3.0 | 7200? | WD Element | 3TB | 129 | 0.0430 | 3000GB |
3.5" | Ext | HDD | USB3.0 | 7200? | Seagate Expansion | 3TB | 139 | 0.0463 | 3000GB |
3.5" | Ext | HDD | USB3.0 | 7200? | Seagate Expansion | 2TB | 95 | 0.0475 | 2000GB |
3.5" | Ext | HDD | USB3.0 | 7200? | WD Mybook Essential | 3TB | 149 | 0.0497 | 3000GB |
3.5" | Ext | HDD | USB3.0 | 7200? | WD Mybook Essential | 4TB | 209 | 0.0522 | 4000GB |
3.5" | Ext | HDD | USB3.0 | 7200? | Seagate BackUp Plus | 3TB | 159 | 0.0530 | 3000GB |
3.5" | Ext | HDD | USB3.0 | 7200? | Seagate BackUp Plus | 2TB | 115 | 0.0575 | 2000GB |
3.5" | Ext | HDD | USB3.0 | 7200? | WD Mybook Essential | 2TB | 139 | 0.0695 | 2000GB |
2.5" | Int | HDD | SATA? | 5400 | Hitachi HGST | 1TB | 80 | 0.0800 | 1000GB |
2.5" | Int | HDD | SATA? | 5400 | WD JPVX | 1TB | 83 | 0.0830 | 1000GB |
2.5" | Int | HDD | SATA? | 5400 | WD BPVX | 750G | 64 | 0.0853 | 750GB |
2.5" | Int | HDD | SATA? | 5400 | Hitachi HGST | 750G | 66 | 0.0880 | 750GB |
2.5" | Int | HDD | SATA? | 5400 | Hitachi HGST | 1.5TB | 139 | 0.0927 | 1500GB |
2.5" | Int | HDD | SATA? | 7200 | Hitachi HGST | 1TB | 93 | 0.0930 | 1000GB |
2.5" | Int | HDD | SATA? | 7200 | WD BPKX | 750G | 78 | 0.1040 | 750GB |
2.5" | Int | HDD | SATA? | 5400 | Hitachi HGST | 500G | 55 | 0.1100 | 500GB |
2.5" | Int | HDD | SATA? | 5400 | Seagate | 500G | 56 | 0.1120 | 500GB |
2.5" | Int | HDD | SATA? | 7200 | Hitachi HGST | 750G | 85 | 0.1133 | 750GB |
2.5" | Int | HDD | SATA? | 5400 | WD LPVX | 500G | 57 | 0.1140 | 500GB |
2.5" | Int | HDD | SATA? | 7200 | Hitachi HGST | 500G | 64 | 0.1280 | 500GB |
2.5" | Int | HDD | SATA? | 7200 | Seagate | 500G | 64 | 0.1280 | 500GB |
2.5" | Int | HDD | SATA? | 7200 | WD BPKX | 500G | 67 | 0.1340 | 500GB |
2.5" | Int | HDD | SATA? | 5400 | Hitachi HGST | 320G | 53 | 0.1656 | 320GB |
2.5" | Int | HDD | SATA? | 5400 | WD LPVX | 320G | 53 | 0.1656 | 320GB |
2.5" | Int | HDD | SATA? | 5400 | Seagate | 320G | 53 | 0.1656 | 320GB |
2.5" | Ext | HDD | USB3.0 | 5400? | WD Element | 2TB | 149 | 0.0745 | 2000GB |
2.5" | Ext | HDD | USB3.0 | 5400? | Samsung | 2TB | 149 | 0.0745 | 2000GB |
2.5" | Ext | HDD | USB3.0 | 5400? | Samsung | 1.5TB | 115 | 0.0767 | 1500GB |
2.5" | Ext | HDD | USB3.0 | 5400? | WD Passport | 2TB | 159 | 0.0795 | 2000GB |
2.5" | Ext | HDD | USB?.0 | 5400? | Hitachi HGST Touro Mobile | 1TB | 80 | 0.0800 | 1000GB |
2.5" | Ext | HDD | USB3.0 | 5400? | WD Passport Ultra | 2TB | 165 | 0.0825 | 2000GB |
2.5" | Ext | HDD | USB3.0 | 5400? | WD Passport | 1.5TB | 129 | 0.0860 | 1500GB |
2.5" | Ext | HDD | USB3.0 | 5400? | Samsung | 1TB | 86 | 0.0860 | 1000GB |
2.5" | Ext | HDD | USB3.0 | 5400? | WD Element | 1TB | 89 | 0.0890 | 1000GB |
2.5" | Ext | HDD | USB3.0 | 5400? | WD Passport | 1TB | 89 | 0.0890 | 1000GB |
2.5" | Ext | HDD | USB?.0 | 5400? | Hitachi HGST Touro Pro | 1TB | 92 | 0.0920 | 1000GB |
2.5" | Ext | HDD | USB3.0 | 5400? | Seagate BackUp Plus | 1TB | 99 | 0.0990 | 1000GB |
2.5" | Ext | HDD | USB3.0 | 5400? | WD Passport Ultra | 1TB | 104 | 0.1040 | 1000GB |
2.5" | Ext | HDD | USB?.0 | 5400? | Hitachi HGST Touro Mobile | 500G | 56 | 0.1120 | 500GB |
2.5" | Ext | HDD | USB3.0 | 5400? | Samsung | 500G | 62 | 0.1240 | 500GB |
2.5" | Ext | HDD | USB3.0 | 5400? | Seagate Expansion | 500G | 69 | 0.1380 | 500GB |
2.5" | Ext | HDD | USB3.0 | 5400? | WD Passport Ultra | 500G | 74 | 0.1480 | 500GB |
2.5" | Ext | HDD | USB3.0 | 5400? | Seagate BackUp Plus | 500G | 88 | 0.1760 | 500GB |
2.5" | Int | SSD | SATA3 | - | Samsung 840 EVO | 1TB | 589 | 0.5890 | 1000GB |
2.5" | Int | SSD | SATA? | - | SanDisk Ultra Plus | 256G | 157 | 0.6133 | 256GB |
2.5" | Int | SSD | SATA? | - | Seagate 600 | 480G | 299 | 0.6229 | 480GB |
2.5" | Int | SSD | SATA? | - | Plextor M5-PRO | 512G | 329 | 0.6426 | 512GB |
2.5" | Int | SSD | SATA? | - | Plextor M5S | 256G | 168 | 0.6562 | 256GB |
2.5" | Int | SSD | SATA3 | - | Samsung 840 EVO | 500G | 329 | 0.6580 | 500GB |
2.5" | Int | SSD | SATA? | - | Kingston V300 | 240G | 159 | 0.6625 | 240GB |
2.5" | Int | SSD | SATA? | - | Seagate 600 | 240G | 159 | 0.6625 | 240GB |
2.5" | Int | SSD | SATA? | - | Fujitsu | 256G | 170 | 0.6641 | 256GB |
2.5" | Int | SSD | SATA3 | - | Samsung 840 EVO | 250G | 170 | 0.6800 | 250GB |
2.5" | Int | SSD | SATA? | - | Kingston V300 | 480G | 329 | 0.6854 | 480GB |
2.5" | Int | SSD | SATA? | - | SanDisk Ultra Plus | 128G | 89 | 0.6953 | 128GB |
2.5" | Int | SSD | SATA? | - | Plextor M5-PRO | 256G | 179 | 0.6992 | 256GB |
2.5" | Int | SSD | mSATA3 | - | Samsung 840 EVO | 250G | 182 | 0.7280 | 250GB |
2.5" | Int | SSD | SATA? | - | Kingston V300 | 120G | 88 | 0.7333 | 120GB |
2.5" | Int | SSD | SATA? | - | Fujitsu | 512G | 383 | 0.7480 | 512GB |
2.5" | Int | SSD | SATA? | - | OCZ Vertex 450 | 128G | 97 | 0.7578 | 128GB |
2.5" | Int | SSD | SATA? | - | SanDisk Extreme | 240G | 185 | 0.7708 | 240GB |
2.5" | Int | SSD | SATA? | - | Fujitsu | 128G | 99 | 0.7734 | 128GB |
2.5" | Int | SSD | SATA? | - | SanDisk Extreme II | 480G | 379 | 0.7896 | 480GB |
2.5" | Int | SSD | SATA3 | - | Samsung 840 EVO | 120G | 95 | 0.7917 | 120GB |
2.5" | Int | SSD | SATA? | - | SanDisk Extreme II | 240G | 195 | 0.8125 | 240GB |
2.5" | Int | SSD | SATA? | - | Seagate 600 | 120G | 99 | 0.8250 | 120GB |
2.5" | Int | SSD | SATA? | - | Kingston HyperX | 240G | 199 | 0.8292 | 240GB |
2.5" | Int | SSD | SATA3 | - | Samsung 840 PRO | 512G | 439 | 0.8574 | 512GB |
2.5" | Int | SSD | SATA? | - | Intel 520 | 120G | 104 | 0.8667 | 120GB |
2.5" | Int | SSD | SATA? | - | Intel 530 | 240G | 209 | 0.8708 | 240GB |
2.5" | Int | SSD | SATA? | - | Kingston HyperX | 120G | 105 | 0.8750 | 120GB |
2.5" | Int | SSD | mSATA3 | - | Samsung 840 EVO | 120G | 105 | 0.8750 | 120GB |
2.5" | Int | SSD | SATA3 | - | Samsung 840 PRO | 256G | 232 | 0.9062 | 256GB |
2.5" | Int | SSD | SATA? | - | Intel 530 | 120G | 115 | 0.9583 | 120GB |
2.5" | Int | SSD | SATA? | - | SanDisk Extreme II | 120G | 118 | 0.9833 | 120GB |
2.5" | Int | SSD | SATA? | - | Kingston SMS200s3 | 120G | 119 | 0.9917 | 120GB |
2.5" | Int | SSD | mSATA3 | - | Intel 530 | 240G | 242 | 1.0083 | 240GB |
2.5" | Int | SSD | SATA? | - | Intel 530 | 180G | 184 | 1.0222 | 180GB |
2.5" | Int | SSD | SATA? | - | Plextor M5-PRO | 128G | 135 | 1.0547 | 128GB |
2.5" | Int | SSD | SATA? | - | Fujitsu | 64G | 69 | 1.0781 | 64GB |
2.5" | Int | SSD | SATA3 | - | Samsung 840 PRO | 128G | 138 | 1.0781 | 128GB |
2.5" | Int | SSD | SATA? | - | Kingston V300 | 60G | 65 | 1.0833 | 60GB |
2.5" | Int | SSD | mSATA3 | - | Intel 530 | 120G | 130 | 1.0833 | 120GB |
2.5" | Int | SSD | mSATA3 | - | Intel 525 | 120G | 139 | 1.1583 | 120GB |
2.5" | Int | SSD | SATA? | - | SanDisk Ultra Plus | 64G | 75 | 1.1719 | 64GB |
2.5" | Int | SSD | SATA? | - | Intel 730 | 240G | 285 | 1.1875 | 240GB |
2.5" | Int | SSD | SATA? | - | Intel S3500 | 240G | 288 | 1.2000 | 240GB |
2.5" | Int | SSD | SATA? | - | Kingston SMS200s3 | 60G | 76 | 1.2667 | 60GB |
2.5" | Int | SSD | SATA? | - | Intel S3500 | 160G | 209 | 1.3062 | 160GB |
2.5" | Int | SSD | SATA? | - | Intel S3500 | 120G | 164 | 1.3667 | 120GB |
2.5" | Int | SSHD | SATA? | 5400? | Seagate | 1TB | 129 | 0.1290 | 1000GB |
2.5" | Int | SSHD | SATA? | 5400? | Seagate | 500G | 85 | 0.1700 | 500GB |
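
The $/GB column can be recomputed from the Cap and Cost columns. The Python sketch below shows one way to do it, assuming decimal units (1TB = 1000GB) as the GB column does. The helper names are mine and the four rows are copied from the table purely as sample input.

```python
def parse_capacity_gb(cap: str) -> float:
    """Convert a label such as '3TB', '1.5TB' or '750G' to decimal GB."""
    cap = cap.strip().upper()
    if cap.endswith("TB"):
        return float(cap[:-2]) * 1000
    if cap.endswith("GB"):
        return float(cap[:-2])
    if cap.endswith("G"):
        return float(cap[:-1])
    raise ValueError(f"unrecognised capacity: {cap!r}")


def cost_per_gb(cost: float, cap: str) -> float:
    """Retail cost divided by capacity in GB."""
    return cost / parse_capacity_gb(cap)


# (Brand/Model, Cap, Cost) taken from four rows of the table.
rows = [
    ("Samsung 840 EVO", "1TB", 589),
    ("Hitachi HGST 2.5in 5400RPM", "1TB", 80),
    ("WD Green EZRX", "3TB", 129),
    ("WD BPVX", "750G", 64),
]

# Sort cheapest-per-GB first, as the table does within each group.
ranked = sorted(rows, key=lambda r: cost_per_gb(r[2], r[1]))
for name, cap, cost in ranked:
    print(f"{name:28s} {cap:>6s} {cost:5d} {cost_per_gb(cost, cap):.4f}")
```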