
Thread: Data corruption during data transfer between PATA & SATA drives??

  1. #1
    Registered User
    Join Date
    Jan 2003
    Location
    Buffalo NY
    Posts
    89

    Data corruption during data transfer between PATA & SATA drives??

    I had a report that someone had problems transferring data from one PATA HDD to a SATA HDD. No details, but I don't believe any type of RAID was used.
    Other than a bad cable, bad controller, or bad motherboard or PSU, has anyone heard of problems?
    Copyright protection & Intellectual property my ass. All the studios want is more money & control. Enough is enough!

  2. #2
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    A driver issue or a bad drive would be the other two. But no, there's no issue with the idea itself; it's just one block device copying to another.

    |.Server/Storage System.............|.Gaming/Work System..............................|.Sundry...................|
    |.Supermicro X8DTH-6f...............|.Asus Z9PE-D8 WS.................................|.HP LP3065 30"LCD Monitor.|
    |.(2) Xeon X5690....................|.2xE5-2643 v2....................................|.Minolta magicolor 7450...|
    |.(192GB) Samsung PC10600 ECC.......|.2xEVGA nVidia GTX670 4GB........................|.Nikon coolscan 9000......|
    |.800W Redundant PSU................|.(8x8GB) Kingston DDR3-1600 ECC..................|.Quantum LTO-4HH..........|
    |.NEC Slimline DVD RW DL............|.Corsair AX1200..................................|..........................|
    |.(..6) LSI 9200-8e HBAs............|.Lite-On iHBS112.................................|.Dell D820 Laptop.........|
    |.(..8) ST9300653SS (300GB) (RAID0).|.PA120.3, Apogee, MCW N&S bridge.................|...2.33Ghz; 8GB Ram;......|
    |.(112) ST2000DL003 (2TB) (RAIDZ2)..|.(1) Areca ARC1880ix-8 512MiB Cache..............|...DVDRW; 128GB SSD.......|
    |.(..2) ST9146803SS (146GB) (RAID-1)|.(8) Intel SSD 520 240GB (RAID6).................|...Ubuntu 12.04 64bit.....|
    |.Ubuntu 12.04 64bit Server.........|.Windows 7 x64 Pro...............................|..........................|

  3. #3
    I am Xtreme zanzabar's Avatar
    Join Date
    Jul 2007
    Location
    SF bay area, CA
    Posts
    15,871
    If you copy large batches of files with non-ECC RAM and no other checking mechanism, you will get corrupt data over large copies; it's just how it works.
    5930k, R5E, samsung 8GBx4 d-die, vega 56, wd gold 8TB, wd 4TB red, 2TB raid1 wd blue 5400
    samsung 840 evo 500GB, HP EX 1TB NVME , CM690II, swiftech h220, corsair 750hxi

  4. #4
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    Well, I'll grant zanzabar a semi-nod: yes, it's possible, even expected, but the numbers involved are generally larger than what most people here currently run into. I do see errors occasionally myself, even with ECC and checks, but that's over 20+ TiB. Not that one data point proves anything.

    Basically, RAM natively gets an error on the order of 1 bit in 1x10^12 bits, plus background radiation adds about 1 bit per GB per month. SATA and SAS cables have a similar error rate of about 1 bit in 1x10^12 bits per meter of length. HDs are around 1 bit in 1x10^14 bits; tape backup is around 1 in 1x10^17 bits.

    Suffice it to say, you should be able to copy any current single HD to any other current single HD without an error.
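    To put those rates in perspective, here's a back-of-envelope sketch; the 500GB drive size is an assumed example, and it uses only the ~1-in-1x10^14 HDD rate quoted above, ignoring the RAM and cable contributions:

```shell
#!/bin/sh
# Expected unrecoverable bit errors from reading a whole drive once,
# using the ~1 bit in 1x10^14 HDD rate quoted above (an illustration,
# not a vendor spec). DRIVE_GB is an assumed example size.
DRIVE_GB=500

awk -v gb="$DRIVE_GB" 'BEGIN {
    bits = gb * 8e9                # decimal GB -> bits
    e    = bits * 1e-14            # expected bit errors for one full read
    p    = 1 - exp(-e)             # Poisson: P(at least one error)
    printf "bits=%.3g expected_errors=%.4f p_any_error=%.4f\n", bits, e, p
}'
```

    For a single 500GB read that works out to roughly a 4% chance of at least one media-level bit error, which is why a one-off PATA-to-SATA copy will almost always come through clean.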


  5. #5
    I am Xtreme
    Join Date
    Jan 2006
    Location
    Australia! :)
    Posts
    6,096
    Yeah, I've done large data transfers before, across all types of interfaces. I don't get huge amounts of stuff corrupted, but corruption definitely happens... I wish there were a fix (anyone?)
    DNA = Design Not Accident
    DNA = Darwin Not Accurate

    heatware / ebay
    HARDWARE I only own Xeons, Extreme Editions & Lian Li's
    https://prism-break.org/

  6. #6
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    I think it's already been covered: check cables, interfaces, controllers, drivers, motherboard, et al. If you have the option, try it at a lower interface speed; if that helps, it's probably a badly shielded cable. You can also try the copy in safe mode, which would point to a software/OS/driver issue if the problem doesn't show up there.


  7. #7
    Registered User
    Join Date
    Jan 2003
    Location
    Buffalo NY
    Posts
    89
    It was just a generic question, as someone I know had a problem. I have no details. I was just wondering if I should be looking out for this, since I have a lot of data to transfer from PATA to SATA drives.

    Sorry if this is a silly question, but what does memory have to do with a transfer of data from one internal drive to another?

  8. #8
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    Generally, unless you have both drives on the same controller and an intelligent controller card that can do a 1:1 block map (and a program designed to do that), you go through main memory.

    First, the application, OS, and driver are involved (assuming you're doing this from an OS, not from the BIOS on the controller card), and those functions live in main memory. All transfers are initiated as app/OS/{filesystem, if not block level}/driver -> disk: 'read X blocks at location Y'. That data is then (usually) DMAed into main memory, and the OS then tells drive B to 'write X blocks at location Y' from the DMAed memory space. So in a copy, every block (every bit, actually) is read from and written to main memory and passes through several layers of code.

    Technically, even if you do it from a controller's BIOS, you have a mini version of the above, but you're mainly running on the firmware and the small amount of cache memory on the controller. You don't have a 'filesystem' at that point, as it's block level, but the same bit error rates come into play.

    To give you an idea: I copy/read/write about 1000 TiB/month on the array here at home. That covers backups/restores, block-level verifications, file-level verifications, MD5 hash compares, et al., so I can find any single bit error in the data. I find a single bit error every couple of months somewhere in the chain. There's no real way today to correct bit errors like this besides restoring from a known-good backup. (One reason I'm trying to get RAID vendors to do parity checks/corrects on READS, which would really help, but so far I'm a lone voice.)
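    A minimal sketch of that copy path with an end-to-end check bolted on (SRC and DST are placeholder paths, not from this thread): after the copy, hash source and destination independently, so both trips through main memory get verified.

```shell
#!/bin/sh
# Copy-then-verify sketch: the copy stages every block through main
# memory, so re-hash both sides afterwards to catch corruption anywhere
# in the app/OS/driver/cable/drive chain. Paths are hypothetical.
SRC=/mnt/pata/data.bin
DST=/mnt/sata/data.bin

cp -- "$SRC" "$DST"

src_sum=$(md5sum < "$SRC" | awk '{print $1}')
dst_sum=$(md5sum < "$DST" | awk '{print $1}')

if [ "$src_sum" = "$dst_sum" ]; then
    echo "OK: $src_sum"
else
    echo "MISMATCH: src=$src_sum dst=$dst_sum" >&2
    exit 1
fi
```

    Note this only proves the two files hash the same now; if the source was already corrupted in RAM before the write, both sides will agree on bad data.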


  9. #9
    Xtreme Infrastructure Eng
    Join Date
    Feb 2004
    Posts
    1,184
    Quote Originally Posted by stevecs View Post
    This is both for backups/restores, block level verifications, file level verifications, md5 hash compares et al. so I can find any single bit-error in the data. I find a single bit error every couple months somewhere in the chain.
    If you don't mind me asking, what software do you use for these types of verification and comparison? From the looks of your sig I'd guess it's Linux-based.
    Less is more.

  10. #10
    Xtreme Addict
    Join Date
    Jul 2005
    Location
    Melrose, MA USA
    Posts
    1,047
    With ECC RAM there is no corruption?
    Quote Originally Posted by Call me Ugly
    This lead to the unfortunate incident regarding the family dog spilling my beer.. rest in peace Lassie.. rest in peace.

  11. #11
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    @Gogeta: Mainly I just threw together some simple scripts that I haven't had the chance to rewrite in C. For volume checks I run the Areca disk driver under Linux for complete volume checks. I then take an MD5 (128-bit) hash of every file inode in the system and compare it over time; if anything changes in a static file it shows up as a mis-compare (I'm looking at moving to SHA-1 (160-bit), since some of the file sizes are fairly large). Then, to test the unused space (or a portion of it, which also exercises the cables/controller/disk subsystem), a small program writes out a large file every hour with certain bit patterns and reads it back to compare. This works 'ok', but the problems are:

    - If a particular disk has a bad sector or returns a bad value, since it's in a RAID setup there's no real way I've found yet to determine _which_ disk the issue came from (just that I got a bad answer and need to replace the file; I'd like to flag the disk so I can replace it or chart the issues).

    - This is all reactive testing, nothing proactive. SMART is very dumb in these instances; I can only find out about issues after they've happened.

    - Everything has to be double-checked, as there's no proof the disk subsystem caused the problem; it could be transitory (cable, memory, random background radiation, firmware, et al.).

    - It all dramatically increases disk load (more than quadruples the I/O to the subsystem, plus the CPU time, just to do the checks).
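    The hourly bit-pattern write/read test described above can be sketched roughly like this (the directory and size are assumptions for illustration):

```shell
#!/bin/sh
# Write a file of a known bit pattern into free space, read it back,
# and count bytes that no longer match. TESTDIR and SIZE_MB are
# hypothetical; a real run should use direct I/O or drop caches so the
# read-back actually touches the disk rather than the page cache.
TESTDIR=/mnt/array/.bitcheck
SIZE_MB=64

mkdir -p "$TESTDIR"
# 0xAA = 10101010, an alternating bit pattern (octal 252)
dd if=/dev/zero bs=1M count="$SIZE_MB" 2>/dev/null \
    | tr '\0' '\252' > "$TESTDIR/pattern.out"

# Re-read: deleting every 0xAA byte should leave nothing behind
bad=$(tr -d '\252' < "$TESTDIR/pattern.out" | wc -c)
if [ "$bad" -eq 0 ]; then
    echo "pattern OK (${SIZE_MB} MiB)"
else
    echo "pattern FAILED: $bad corrupt bytes" >&2
fi
rm -f "$TESTDIR/pattern.out"
```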

    @mnewxcv - No. ECC helps mitigate single-bit errors in memory (it can correct them, since it uses a Hamming code), but it can only detect a 2-bit error, not correct it. There are memory scrubbers, and on server systems you can also set up mirrored memory, which helps further, but there's no 100% protection against anything; you can only push the odds out. Data integrity is a field that is lacking in many areas (kind of like an iceberg at sea: people see the 10% floating above but not the 90% that will kill you).


  12. #12
    Xtreme Infrastructure Eng
    Join Date
    Feb 2004
    Posts
    1,184
    Thanks for the reply, steve. After I read your post about performing these tests it set off major paranoia alarms inside my head as I realized I'd never even considered a data verification process even though I know the risk is real. Given my extremely limited experience with scripting I think my best option is to rely on the Areca driver functionality for volume checking. Are you aware of any alternative solutions for these types of data verification?
    Less is more.

  13. #13
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    Remember that the Areca volume check is a RAID-array verification, i.e. it DOES NOT VERIFY DATA. It verifies that the parity is the correct calculation over the bits in the data stripe width. It has no idea whether the file itself is correct (it doesn't even know a file, or a filesystem, is there). It just checks whether the bits match, and if they don't (as with RAID 3/4/5) it's like flipping a coin: it will (usually a user choice at check time) either 'assume' the data is good _or_ that the parity is good, and 'fix' the other, which could actually be destroying it. Remember that RAID is for availability, _not_ data integrity.

    For a real simple MD5 check under Linux you can use this early version of a script I have (the current one is very convoluted and not generic enough to be of real use to you; like I said, no time to really clean things up). Anyway, it can get you started:

    Code:
    #!/bin/bash
    
    ##
    # Filesystem MD5 Check (low-level drive & file checking)
    ##
    ##
    # Requirements:
    #
    # md5sum
    # find
    ##
    
    # Set argument variables
    MNTPNT=$1
    CHECK=$2
    IGNORE=$3
    
    usage ()
    {
            echo -e "usage: `basename $0` MOUNTPOINT [create|check] {IGNOREDIR}\n"
    }
    
    # Check arguments
    if [ -z "${MNTPNT}" ] || [ -z "${CHECK}" ]; then
            usage
            exit 1
    fi
    
    # Set static variables
    CURDATE=`date +%Y%m%d`
    OLDDATE=`date +%Y%m`"01"
    OUTDIR="/var/log/md5sum"
    
    mkdir -p "${OUTDIR}"
    
    # CHECK/Verify MD5SUM function: re-hash everything against the sums
    # taken at the start of the month, log any file that no longer matches
    if [ "${CHECK}" = "check" ]; then
       if [ -f "${OUTDIR}/md5sum.${OLDDATE}" ]; then
          md5sum -c "${OUTDIR}/md5sum.${OLDDATE}" 2>/dev/null \
             | grep -v ": OK" >> "${OUTDIR}/md5check.${CURDATE}"
       else
          echo "ERR: old md5sum file does not exist" >&2
          exit 1
       fi
    fi
    
    # Create MD5SUM function: hash every regular file under the mount point.
    # find -print0 with read -d '' survives spaces/newlines in filenames,
    # which a backtick-and-xargs loop silently breaks on.
    if [ "${CHECK}" = "create" ]; then
       find "${MNTPNT}" -type f -print0 |
       while IFS= read -r -d '' FILE; do
          if [ -n "${IGNORE}" ] && [[ "${FILE}" == *"${IGNORE}"* ]]; then
             continue
          fi
          md5sum -b "${FILE}" >> "${OUTDIR}/md5sum.${CURDATE}"
       done
    fi
    As for other tools, there's nothing really besides what you write yourself (at least nothing I've found anywhere, and that includes home PCs, workstations, midrange (Unix/AS400), and mainframes). The industry really seems to have blinders on, and I have no idea why.
    Last edited by stevecs; 02-10-2008 at 01:04 AM.


  14. #14
    Xtreme Infrastructure Eng
    Join Date
    Feb 2004
    Posts
    1,184
    Thank you very much for sharing your code. It will definitely give me at least some peace of mind and hopefully a better understanding of the concepts.

    When you mentioned the volume check earlier I assumed it was referring to an actual partition and not the RAID volume. Microsoft refers to partitions as volumes...MCSE training getting in the way again...

    OT question: If I'm using 3 500GB disks on an ARC-1210, can I create two distinct arrays? For example, use 750GB as RAID-5 and 250GB as RAID-1? If it's possible, what's the impact on performance? It seems like it would have to involve the OS a lot more, as it would deal directly with a file system or partition instead of just the block device as a whole.
    Less is more.

  15. #15
    V3 Xeons coming soon!
    Join Date
    Nov 2005
    Location
    New Hampshire
    Posts
    36,363
    Quote Originally Posted by zanzabar View Post
    if u copy large batch files with non ecc ram or any other check means then u will get corrupt data over large copies its just how it works
    Very good point.
    I've transferred VERY large files for years, from Raptors to old IDE storage drives, but always on systems that used ECC memory, and never had an issue.
    Crunch with us, the XS WCG team
    The XS WCG team needs your support.
    A good project with good goals.
    Come join us,get that warm fuzzy feeling that you've done something good for mankind.

    Quote Originally Posted by Frisch View Post
    If you have lost faith in humanity, then hold a newborn in your hands.

  16. #16
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    @gogeta, no problem; like I said, it's bloody simple code, but if it helps, have at it. As for the MS terminologies, they do seem to cause a lot of confusion; I've noticed it even here at work. Basically, anything from the partition/VTOC level on up has no bearing on the lower-level block device; a VTOC/partition map is just data, nothing more than an abstraction.

    As for multiple RAIDs, yes, it's possible and common; we just had a discussion on it over here: http://www.xtremesystems.org/forums/...=171561&page=2 As for the performance impact, I can't give real numbers without running models, but basically two finite resources get shared when you do that: drive IOPS and the RAID controller's CPU.


  17. #17
    Xtreme Member
    Join Date
    Jan 2007
    Location
    Dorset, UK
    Posts
    439
    I guess I should mention QuickPAR here. For critical static data that isn't changing, one option to ensure long-term accuracy of the data is to run QuickPAR over the folder and create a few repair blocks. That way if you ever find an odd hash error you can easily repair the files. The catch with this is the last version released doesn't support subfolders, so you can only PAR2 files in a single folder, not an entire folder tree.

    I've been involved with QuickPAR for a long while and was part-way through writing an alternative, hopefully improved, version with support for subfolders when development time dried up. I'm hoping to restart that work. Maybe some of my code could be retargeted at a more automatic test-and-repair regime than a simple manual PAR2 test...

  18. #18
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    Yes, but that does not solve the problem. It's no different from doing a backup and a compare of the data. It has the same problem of no verification on reads (the data can be fine on the disk, but what reaches your application could be corrupt). Disk data is not checked continuously, so you have no real way to know when something was corrupted (i.e., which backup do you want to restore from?). And with the error correction you're assuming your hash is valid, when THAT data could be wrong and your real data correct, so by 'correcting' the mistake you're actually corrupting the data.

    ZFS is closer, but it still has a long way to go as well. Just getting the issues out there is important, though, as for the past 60 years of hard drives, data integrity has taken a distant second place in people's minds.

    Even with RAID-6, which has the option of three sets of information (P, Q, and raw data), I doubt that when an error occurs all three are compared to decide how to correct it (i.e., if raw+Q agree, then P is in error; if P+Q agree, then the raw data is; etc.). And schemes like this still don't solve multi-bit errors without different algorithms (or new applications of current ones). Outside of smaller home users, we're now at a point where, due to the sheer size of data and storage, availability is not as big an issue as integrity, when BER rates statistically mean an error something like every 10 full reads of a 1TB HD (which for some environments is every couple of days). For laptop drives (used more in 1U appliances) it's much worse. Who cares if it's available if it's wrong?


  19. #19
    Xtreme Member
    Join Date
    Jan 2007
    Location
    Dorset, UK
    Posts
    439
    Quote Originally Posted by stevecs View Post
    Yes, but that does not solve the problem. It's no different than doing a backup and a compare of the data. It has the same problems with no verification on reads (data can be ok on the disk but what you get to your application could be corrupt). Disk data is not checked continuously so you don't have a means to really know when something was corrupted (ie, what backup do you want to restore from?) And with the error correction you are making the assumption that your hash is valid (which THAT data could be wrong and your real data correct so by 'correcting' the mistake you're actually messing up the data).
    Well, yes and no... After QP has processed the fileset to create the PAR2 file (containing the block data hashes), it rechecks the fileset using those hashes, so in theory a single bit error caused by a memory or transfer glitch should show up, since a second read of the data would not have the glitch. The PAR2 is discarded if an error is found and the computation restarted. So at that point you have an accurate set of hashes of the files at that moment in time, and you can be confident of restoring only bad blocks using the repair data created then. The work to create a repair block can also be checked, I'm sure.

    But then we come to the grittier problem. I can easily write a software routine to do two reads of a data block and compare them for an odd bit error caused by memory corruption or a transfer glitch. The problem is that both will come from one physical read of the disk, cached by the OS in some block of memory, and will therefore be identical. So the problem isn't fixable at the application level; it would have to be built into the OS, checking the data just read against a subsequent physical disk read, or handled at the drive level. Does the transfer protocol (SCSI/SATA) have any error correction in the stream to catch errors in-time?
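    On Linux at least, an application can sidestep the page cache for the second read with O_DIRECT; GNU dd exposes it as iflag=direct. A sketch (FILE is a placeholder, and direct I/O support depends on the kernel and filesystem):

```shell
#!/bin/sh
# Double-read compare where the second read bypasses the page cache,
# so it comes from the drive rather than the cached copy in memory.
# FILE is a hypothetical path.
FILE=/mnt/array/important.bin

cached=$(md5sum < "$FILE" | awk '{print $1}')
direct=$(dd if="$FILE" iflag=direct bs=1M 2>/dev/null | md5sum | awk '{print $1}')

if [ "$cached" = "$direct" ]; then
    echo "reads agree: $cached"
else
    echo "READ MISMATCH: cached=$cached direct=$direct" >&2
fi
```

    It's still not bulletproof (the drive's own cache can satisfy the direct read too), but it does rule out a corrupt copy sitting in OS memory.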

  20. #20
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    Well, yes, a read, hash, re-read, compare would help. How many passes (or how many bits of error correction) are needed to reach X integrity depends on the size of the data. The thing is that this type of data integrity needs to be done from the ground up: sector storage on disk, interface, transfer cable, controller, the computer's internal data path, driver, filesystem, OS, and then the application layer.

    Actually, the data wouldn't be cached if you did this at the filesystem layer (NOT the application layer); i.e., replace NTFS with a new filesystem, something akin to ZFS (it already has a lot of the concepts, just not really turned on/used). That way you bypass (or rather, control) any caching yourself and can always do direct reads. Granted, you may still have an issue in the hardware controller or the drive itself, but those generally have some means to turn such things off (at least the higher-end ones, since features like read-ahead are bad in some cases).

    The transfer method itself doesn't really have error correction, just shielding. SAS has, I believe, ECC or Hamming-code correction from the controller to the drive (i.e., in the SAS protocol over the cable). For SCSI I can't remember; I thought it did something like a very basic checksum, but it's been a while since I dug into that. In either case, the error ratio I've seen quoted by IBM/EMC/STK (Sun) et al. is 1 bit in 1x10^12 bits transferred per 1-meter cable, and that's for a single-bit error. If two bits flip, it can't do anything, and depending on how many bits get flipped (2+), there's a chance the error wouldn't even be detected, since it's just a simple checksum.

