Saturday, February 04, 2012

Intra-disk Error Correction: RAID-4 in shingled-write drives

High-density shingled-write drives cannot succeed without special attention being paid to Error Correction, not just error detection.
Sony/Philips realised this when developing the Compact Digital Audio Disk (CD) around 1980 and then again in 1985 with the "Yellow Book" CD-ROM standard for data-on-CD. The intrinsic bit error rate of ~1 in 10^5 becomes "infinitesimal", to quote one tutorial, with burst errors of ~4,000 bits corrected by the two lower layers.

Error rates and sensitivity to defects increase considerably as feature sizes approach their limits. 256Kbit DRAM chips took years to come into production after 64Kbit chips because manufacturing yields were low: almost every chip mostly worked, but had some defect that caused it to fail testing. The solution was to overbuild the chips and swap defective columns with spares during testing.

Shingled-write disks, with their "replace whole region, never update-in-place" discipline, allow for a different class of Error Protection. RAID techniques with a fixed parity disk are a suitable candidate when individual sectors are never updated; Network Appliance very successfully leveraged this with their WAFL file system.

That shingled-write disks require good Error Correction should be without dispute.
What type of ECC (Error Correcting Code) to choose is an engineering problem based on the expected types of errors and the level of Data Protection required. I've previously written that for backup and archival purposes, the probable main uses of shingled-write disks, bit error rates of 1 in 10^60 should be a minimum.
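To put such targets in perspective, here's a quick back-of-the-envelope calculation (a Python sketch; the drive capacity and BER values are illustrative assumptions, not vendor numbers) showing how fast a raw, unprotected error rate becomes intolerable at whole-drive scale:

    # Chance of at least one unrecoverable bit when reading a whole
    # drive end-to-end, for a few raw bit error rates.
    # The 2 TB capacity and the BER values are illustrative assumptions.
    drive_bits = 2e12 * 8                  # hypothetical 2 TB drive, in bits

    for ber in (1e-14, 1e-15, 1e-18):
        p_fail = drive_bits * ber          # first-order: n * p for small p
        print(f"BER {ber:.0e}: P(>=1 bad bit per full read) ~ {p_fail:.1e}")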

One of the advantages of shingled-write disks is that each shingled-write region can be laid down in one go from a Flash memory buffer.
It can then be re-read and rewritten, catering for the disk characteristics found:
  • excessive track cross-talk,
  • writes affected by excessive head movement (external vibration),
  • individual media defects or moving contamination,
  • areas of poor media, and
  • low signal or poor signal-to-noise ratio due to age, wear or production variations.
Depending on the application, multiple rewrites may be attempted.
It would even be possible, given spare write-regions, for drives to periodically read and rewrite all data to new areas. This is fraught: the extra "duty cycle" will decrease drive life, and if the drive finds uncorrectable errors while the attached host(s) aren't addressing it, what should be done?
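As a sketch of what such a drive-internal scrub cycle might look like (all names and behaviour here are hypothetical firmware internals, stubbed out so the control flow stands alone; this is not any real drive's API):

    def read_region(region):
        # Stub: returns (data, ecc_ok). Real firmware would stream the
        # whole shingled region and run its ECC decoder over it.
        return (b"\x00" * 16, True)

    def write_region(region, data):
        pass  # stub: lay the whole region down again in one go

    def scrub(regions, spares, remap):
        for region in regions:
            data, ecc_ok = read_region(region)
            if ecc_ok:
                continue                  # region still reads cleanly
            if not spares:
                break                     # no spare region left to move to
            spare = spares.pop()
            write_region(spare, data)     # rewrite whatever was recovered
            remap[region] = spare         # redirect the logical region
            # The open question from above: if the data was uncorrectable
            # and no host is addressing the drive, what should be done?

    mapping = {}
    scrub(list(range(10)), [10, 11], mapping)   # regions 0-9, two spares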

Reed-Solomon encoding is well proven in Optical Disks: CD, CD-ROM and DVD, and is probably in use now for 2KB-sector disks.
Reed-Solomon codes can be "tuned" to the application: the amount of parity overhead can be varied, and they can be combined with other techniques like scrambling, or composed into Product Codes.
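To make that tunability concrete: an RS(n, k) code over GF(2^8) carries n-k parity symbols per codeword, corrects (n-k)/2 symbol errors (or n-k erasures whose locations are known), and costs (n-k)/n in overhead. A small Python sketch, with (n, k) choices that are purely illustrative:

    # Overhead vs. correction power for a few RS(n, k) codes over
    # GF(2^8), i.e. 255-symbol codewords of 8-bit symbols.
    # These (n, k) pairs are illustrative, not taken from any drive.
    for n, k in ((255, 251), (255, 239), (255, 223)):
        parity = n - k
        print(f"RS({n},{k}): {parity/n:5.1%} overhead, corrects "
              f"{parity // 2} errors or {parity} erasures per codeword")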

R-S codes have a downside: the complexity of their encoders and decoders.
[This can mean speed and throughput as well: some decoding algorithms require multiple passes to correct all errors.]

For a single-platter shingled-write drive, Error Correcting codes (e.g. Reed-Solomon) are the only option for addressing long burst errors caused by recording drop-outs.

For multi-platter shingled-write disks, another option is possible:
 RAID-4, or block-wise parity (XOR) on a dedicated drive (in this case, 'surface').
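The mechanism is simple enough to show in a few lines. A minimal sketch (toy 8-byte "sectors"; real firmware would work on full sectors across physically corresponding positions on each surface):

    # RAID-4 across surfaces: the parity sector is the byte-wise XOR of
    # the corresponding sector on each data surface. Any ONE missing
    # sector of the set (parity included) can be rebuilt; two cannot -
    # the limitation discussed below.

    def xor_sectors(sectors):
        out = bytearray(len(sectors[0]))
        for s in sectors:
            for i, b in enumerate(s):
                out[i] ^= b
        return bytes(out)

    # Three data surfaces plus one parity surface (a 2-platter drive).
    data = [bytes([v]) * 8 for v in (0x11, 0x22, 0x33)]   # toy sectors
    parity = xor_sectors(data)

    # Surface 1's sector is lost but flagged as an erasure by the layer
    # below; rebuild it from the two survivors plus parity.
    rebuilt = xor_sectors([data[0], data[2], parity])
    assert rebuilt == data[1]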

2.5-inch drives can have 2 or 3 platters, i.e. 4 or 6 surfaces.
Dedicating one surface to parity gives 25% and 16.7% overhead respectively, higher than the ~12.5% Reed-Solomon overhead in the top layer of CD-ROMs.
With 4 platters, or 8 surfaces, the overhead is 12.5%, matching that of CD-ROM layer 3.

XOR parity generation and checking is fast, efficient and well understood; this is its attraction.
But despite a large overhead, it:
  • can at best correct a single sector in error, and fails on two dead sectors in the sector set,
  • relies on the underlying layer to flag drop-outs/erasures, and
  • relies on the per-sector CRC check being perfect.
If the raw bit error rate is 1 in 10^14 with 2KB (16,384-bit) sectors, the probability of any given sector having an uncorrected error is 16,384 x 10^-14 ~ 1.6 x 10^-10.
The probability of two given sectors in a set both being in error is:
(1.6 x 10^-10)^2 ~ 2.7 x 10^-20
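The same arithmetic, spelled out (a Python sketch; the BER and sector size are the assumptions used above):

    # Double-failure arithmetic for the XOR parity set.
    ber = 1e-14                    # assumed raw bit error rate: 1 in 10^14
    sector_bits = 2048 * 8         # 2 KB sector = 16,384 bits

    p_sector = sector_bits * ber   # ~1.6e-10: P(a given sector is bad)
    p_double = p_sector ** 2       # ~2.7e-20: two given sectors both bad
    print(f"P(sector error):         {p_sector:.2e}")
    print(f"P(two sectors in a set): {p_double:.2e}")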

This is well below what CD-ROM achieves.
But, to give intra-disk RAID-4 its due:
  • corrects a burst error of ~16,000 bits (a whole sector), four times the CD limit,
  • will correct up to every fourth sector on each surface, and
  • is deterministic in speed; Reed-Solomon decoding algorithms can require multiple passes to fully correct all data.
I'm thinking the two schemes could be used together and would complement each other.
Just how, I'm not yet sure. A start would be to group together 5-6 sectors with a shared ECC, in an attempt to limit the number of ganged failed-sector reads in a RAID'd sector set.
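One possible shape for that grouping, purely as an illustration of the idea (my sketch, not a settled design; it assumes the third-party Python 'reedsolo' package for the shared ECC):

    # Group 5 sectors on a surface under one shared Reed-Solomon code,
    # then run the cross-surface XOR parity over the groups as before.
    # A long drop-out confined to one surface then shows up as RS
    # corrections or flagged erasures before it can put two sectors of
    # the same XOR set in error. Parameters here are illustrative only.
    from reedsolo import RSCodec   # pip install reedsolo

    SECTOR = 2048
    rs = RSCodec(32)               # 32 parity symbols per 255-byte codeword

    group = b"".join(bytes([i]) * SECTOR for i in range(5))  # 5 toy sectors
    protected = rs.encode(group)   # the group plus its shared RS parity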
