Each subsystem has its own error rate: RAM is around one per 10^12 for ECC; cables have their own rates too (about the same for SATA/SAS/SCSI, which use a parity/checksum scheme on the link); and then there are the drives themselves. You can't really test one without testing the full system (every component has an error rate associated with it, and data goes through those components multiple times). What I do is an MD5/SHA1 check of data files on the systems here and compare them to the initial values. If there is a mismatch I compare again as a tie-breaker (i.e., to see whether the error occurred not with the data on the media itself but somewhere along the path).
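A minimal sketch of that check-and-tie-break routine (the function names and return labels are mine, not from any particular tool):

```python
import hashlib

def file_digest(path, algo="md5"):
    """Hash a file in 1 MiB chunks so large files don't blow out RAM."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path, expected, algo="md5"):
    """Compare against a stored digest; on mismatch, re-read once as a
    tie-breaker to separate path errors from on-media corruption."""
    if file_digest(path, algo) == expected:
        return "ok"
    if file_digest(path, algo) == expected:
        return "transient"   # second read matched: error was along the path
    return "corrupt"         # both reads mismatched: likely bad on the media
```

You'd store the digests at write time (a flat file or database of path-to-hash pairs) and re-run `verify` on a schedule.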
Data errors are frequent. I see real-world bit errors (corruptions in files) at a rate of about 1 file per month per ~20TiB of storage at home (which I monitor much more closely than at work). At work the rate looks comparable once you scale for storage size, though I'm only able to monitor a small fraction of the ~6PiB of DASD we have deployed there. From what I've been able to gather from other sites (CERN/Fermilab), both have also found the same kinds of errors, plus substantial multi-bit errors in ECC memory (ECC can correct a single-bit error, but multi-bit errors it can only report, not correct). Which is disconcerting.
There is no real easy way to eliminate it, and the bigger problem is that hardly anyone is actively monitoring this on production systems (I could probably count the people who do on one hand).
At this point to mitigate the issues with current hardware you can do several things:
- Construct the array with enough redundancy (and the right drives) to get a lower probability of read errors. I posted a spreadsheet in the main sticky here that can help with that. Right now I would say ignore the 1.5TB drives as their BER ratings suck.
- At the hardware subsystem level, use RAID-6. RAID-10 is faster for IOPS, but it has no parity-block checking, i.e. no means to determine which copy is right or wrong, and the single-parity RAIDs (3/4/5) have the same problem. RAID-6 has 3 checks (the raw data plus two parities), which allows a tie-break to correct data. Run full checks often (weekly).
- Create and maintain a bit hash (MD5/SHA1/et al.) of your data files and compare them against what's on the drive often. This narrows the window for /when/ a corruption occurred (to somewhere between check intervals).
- Have an up-to-date backup strategy so you can restore files that get corrupted. Remember to check tapes as well, since the same corruption occurs there, and keep multiple copies.
- If available, try ZFS with read-checking enabled (though beware of its limitations; it's not good for all cases).
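The RAID-6 tie-break from the second point above boils down to a three-way vote. A greatly simplified sketch (real controllers work at the syndrome level, but the decision logic is the same idea):

```python
def tie_break(raw: bytes, from_p: bytes, from_q: bytes):
    """Three candidates for one block: the raw data as read, plus the block
    reconstructed from each of the two parities. Majority wins; if all three
    disagree, the block is unrecoverable by this check alone."""
    if raw in (from_p, from_q):
        return raw           # at least one reconstruction agrees with the data
    if from_p == from_q:
        return from_p        # the raw copy was the bad one; rewrite it
    return None              # no two agree: flag for restore from backup
```

With only a single parity (RAID-3/4/5) you can detect a mismatch but have no third vote, which is exactly why those levels can't tell you which copy is wrong.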
All in all, I think this is the next big 'iceberg' we're going to face in computing, mainly because so many people don't even think about it. With the amount of data present today and its growth rate, everyone is assured of having some corrupted data on their systems, and with no one looking, no one knows what's corrupted until it's too late.
I haven't been over at AVS for a long while; I should probably go over there. I just did a fast search there, though, and if it's the same thread of him making 12-drive-wide RAID-6s with 1TB drives (10+2 RAID-6 sets bunched into a RAID 6+0 striping model), that would give him ~72TiB usable space (assuming no hot spares) and a probability of not reading all sectors in the resultant array of ~9%. I don't think he has much storage knowledge yet, though. For arrays of that size/complexity you want to use a volume manager (LVM/Veritas/et al.), and getting more than say 16 drives off a single RAID controller today will max you out in performance. So if he's after performance (he mentions iSCSI, which would imply performance requirements) he should have several cards, ideally one RAID card for each external array of 16 drives max. There's also no mention of any clearly defined target goals; when dropping that kind of cash (not even mentioning the ~15K for a tape backup solution for it) you should have something defined as to what you're looking to accomplish. But that's a whole architecture discussion.
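For the curious, the ~9% figure is consistent with a simple BER model: the chance of at least one unrecoverable bit error when reading every sector once. A sketch (the function name is mine, and I'm assuming the common 1e-15 vendor BER rating for enterprise drives):

```python
import math

def p_read_error(bytes_read, ber=1e-15):
    """Probability of hitting at least one unrecoverable bit error when
    reading `bytes_read` bytes from media with the given bit error rate.
    Uses log1p for accuracy since ber is tiny: 1 - (1 - ber)**bits."""
    bits = bytes_read * 8
    return 1.0 - math.exp(bits * math.log1p(-ber))

# A full read of 12 x 1 TB drives at a BER of 1e-15 comes out near 9%:
p = p_read_error(12e12)
```

Swap in 1e-14 (typical consumer-drive rating) and the same read is far more likely to hit an error, which is the spreadsheet's whole point.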


