
HDD read/write issues stop some BOINC threads



Red Maw
11-30-2011, 11:17 PM
3/4 of my WDFASS drives start to have issues reading and writing after a few months of use, particularly when you try to copy data to or from them. If you try to use the standard Windows file copy, it will fail with some error about not being able to access the drive. I haven't seen it fail using robocopy yet, but the transfer speed will frequently drop below 30KB/s for entire files. When these access errors are encountered, any programs accessing the drive lock up and will normally throw CRC errors (when applicable) once they time out (it usually takes a minute or two to cancel the processes).

I was testing a WDFASS drive to see if it was time to RMA it by copying data from a WDEARS to the WDFASS (using robocopy) when I noticed that every time the system hit one of the longer "access errors", the CPU time of 2 of my BOINC threads (50%, with 4 threads running atm) would drop to 0. This is odd, and not only because just two of them went down: BOINC is on neither of the drives in the test, and only the FASS drive goes unresponsive during these errors. All other drives are fully functional.

Yeah, I know this is an issue that's not worth solving, but I'm curious whether anyone has an idea as to what's going on, as I would certainly like to know.

Thanks,

redmaw
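The slowdown described in the post above (whole files copying at under 30KB/s) could be logged per file with a short script. This is only a minimal sketch in Python, not what the poster used (the test in the post was done with robocopy). The function names `timed_copy` and `is_slow` are made up for illustration, and the threshold is simply the 30KB/s figure from the post.

```python
import os
import time

# Assumption: "slow" means below the 30KB/s figure mentioned in the post.
SLOW_BYTES_PER_SEC = 30 * 1024


def timed_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks and return the average rate in bytes per second."""
    copied = 0
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
        # Flush to the drive so the timing includes the actual write.
        fout.flush()
        os.fsync(fout.fileno())
    # Guard against a zero-length interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return copied / elapsed


def is_slow(rate):
    """Return True when the average rate is under the assumed threshold."""
    return rate < SLOW_BYTES_PER_SEC
```

Calling `timed_copy` for each file in a test set and printing the names where `is_slow` returns True would give a rough record of when the drive stalls, which could then be lined up against the times the BOINC threads drop to 0.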

Johnmark
12-01-2011, 03:29 PM
What does S.M.A.R.T. have to say about the Black drives that are failing? Are they all on the same machine? Have you changed cables lately?

And why do you say it's not worth solving? That would drive me nuts until I found the answer.

Red Maw
12-01-2011, 04:51 PM
It does drive me nuts, especially because I can't find an answer.

S.M.A.R.T. data is below (sorry, it doesn't want to format any nicer). 2 of the 4 drives have been tested on a different machine with the same results. I've swapped SATA cables with properly working drives, and the non-FASS drives continue to function properly while the FASS drives continue to be problematic.

I've been at a complete loss as to what the problem with the drives could be for over a year now (when the problems appear I just RMA the drive :rofl:). I only mentioned it because I've never seen it take out BOINC threads before. The only reason I said it wasn't worth solving is that, as far as BOINC is concerned, it only costs a few minutes of run time on 2-4 threads a couple of times a year.



ID  Attribute Description               Threshold  Value  Worst  Data   Status
01  Raw Read Error Rate                 51         188    188    3582   OK: Value is normal
03  Spinup Time                         21         142    142    14883  OK: Value is normal
04  Start/Stop Count                    0          100    100    65     OK: Always passes
05  Reallocated Sector Count            140        200    200    0      OK: Value is normal
07  Seek Error Rate                     0          200    200    0      OK: Always passes
09  Power-On Time Count                 0          92     92     6529   OK: Always passes
0A  Spinup Retry Count                  0          100    253    0      OK: Always passes
0B  Calibration Retry Count             0          100    253    0      OK: Always passes
0C  Power Cycle Count                   0          100    100    64     OK: Always passes
C0  Power-Off Retract Count             0          200    200    27     OK: Always passes
C1  Load/Unload Cycle Count             0          193    193    21067  OK: Always passes
C2  Temperature                         0          119    105    33     OK: Always passes
C4  Reallocation Event Count            0          200    200    0      OK: Always passes
C5  Current Pending Sector Count        0          200    196    0      OK: Always passes
C6  Offline Uncorrectable Sector Count  0          197    197    1136   OK: Always passes
C7  Ultra ATA CRC Error Rate            0          200    200    0      OK: Always passes
C8  Write Error Rate                    0          190    190    2115   OK: Always passes
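
A dump like the one above is easier to scan with a small script. The following is a rough sketch in Python, under two assumptions that are not from the thread: that the sector and CRC counters in `WATCHED_IDS` are the ones worth a second look when their raw data is nonzero, and that a normalized value at or below a nonzero threshold counts as failing. Raw values are vendor-specific, so a flag here is a hint, not a diagnosis.

```python
import re

# Rows copied from the S.M.A.R.T. dump above, without the status column:
# ID, description, threshold, value, worst, raw data.
SMART_DUMP = """\
01 Raw Read Error Rate 51 188 188 3582
03 Spinup Time 21 142 142 14883
04 Start/Stop Count 0 100 100 65
05 Reallocated Sector Count 140 200 200 0
07 Seek Error Rate 0 200 200 0
09 Power-On Time Count 0 92 92 6529
0A Spinup Retry Count 0 100 253 0
0B Calibration Retry Count 0 100 253 0
0C Power Cycle Count 0 100 100 64
C0 Power-Off Retract Count 0 200 200 27
C1 Load/Unload Cycle Count 0 193 193 21067
C2 Temperature 0 119 105 33
C4 Reallocation Event Count 0 200 200 0
C5 Current Pending Sector Count 0 200 196 0
C6 Offline Uncorrectable Sector Count 0 197 197 1136
C7 Ultra ATA CRC Error Rate 0 200 200 0
C8 Write Error Rate 0 190 190 2115
"""

# Two-digit hex ID, a free-text name, then four integer columns.
ROW = re.compile(r"^([0-9A-F]{2}) (.+?) (\d+) (\d+) (\d+) (\d+)$")

# Assumption: counters whose raw value is normally expected to stay at 0.
WATCHED_IDS = {"05", "C4", "C5", "C6", "C7"}


def parse_rows(dump):
    """Turn each matching line of the dump into a dict of typed fields."""
    rows = []
    for line in dump.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue
        attr_id, name, threshold, value, worst, raw = match.groups()
        rows.append({
            "id": attr_id, "name": name, "threshold": int(threshold),
            "value": int(value), "worst": int(worst), "raw": int(raw),
        })
    return rows


def suspicious(rows):
    """Return rows at or below a nonzero threshold, or watched rows with raw > 0."""
    flagged = []
    for row in rows:
        below_threshold = row["threshold"] > 0 and row["value"] <= row["threshold"]
        nonzero_counter = row["id"] in WATCHED_IDS and row["raw"] > 0
        if below_threshold or nonzero_counter:
            flagged.append(row)
    return flagged


for row in suspicious(parse_rows(SMART_DUMP)):
    print(row["id"], row["name"], "raw =", row["raw"])
```

With the rows as posted, the only attribute this flags is C6 (Offline Uncorrectable Sector Count, raw data 1136), even though the tool's own status column reports every attribute as OK.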

D_A
12-01-2011, 04:58 PM
I'd suggest retiring the drives to machines that are only used for dedicated crunchers and using drives you can rely on for your desktop machines.

Red Maw
12-01-2011, 08:28 PM
As long as they're under warranty I'll be RMA'ing them. These are 2TB 7200rpm HDDs, too expensive and overkill for a dedicated cruncher (plus I can't afford to replace them right now) :p:

Ironically though, as the write speed doesn't want to go any higher than 1MB/s anymore, it might not even be fit for a dedicated cruncher lol

D_A
12-01-2011, 08:50 PM
RMA sounds like a good option then.