183.23TB Host writes
MWI 1
Reallocated sectors : 6
MD5 OK.
C300 Update
98.07 TiB, 67 MWI, 1656 raw wear, MD5 OK, reallocated events still at 1 / 2048 sectors. Switched to an updated version of Anvil's app; average speed over the past 8 hours is down to 60.5MiB/sec from ~61.85.
First update:
OCZ Agility 3 AGT3-25SAT3-120G - A couple of minor sector failures on one of the drives.
Corsair Force CSSD-F120GB2-BRKT - Nothing yet to report
OCZ Vertex 2 OCZSSD2-2VTXE60G - Nothing yet to report
Corsair Performance 3 Series CSSD-P3128GB2-BRKT - Both drives are throwing failures but are still functional
Crucial RealSSD C300 CTFDDAC064MAG-1G1 - Nothing yet to report
SAMSUNG 470 Series MZ-5PA128/US - Nothing yet to report
Intel 510 Series (Elm Crest) SSDSC2MH120A2K5 - One drive failed completely July 24 @ 9:14am and second drive is throwing failures but has yet to fail.
Intel X25-M SSDSA2MH160G2K5 - Nothing yet to report
Kingston SSDNow V+ Series SNVP325-S2B/128GB - A couple of minor sector failures on one of the drives.
-------- End of line ----------
Any further questions?
Today's update:
147.1145 TiB
494 hours
Avg speed 89.77 MiB/s.
AD gone from 17 to 15.
P/E 2564.
MD5 OK.
Attachment 118134
184.88TB Host writes
MWI 1 (stopped moving)
Reallocated sectors : 6
MD5 OK, 35.62MiB/s avg (~17 hours)
http://www.xtremesystems.org/forums/...=1#post4854810
The short version is that I am using a kernel-level module to continuously write sectors to SSDs and read them back to check for errors. Throwing failures means that a write/read failed (i.e. the data read does not match the data written). All sectors receive an equal number of writes. Once 90% of sectors fail, the drive is considered failed.
All drives are receiving an exactly equal distribution of writes at a constant 50MB/s.
SMART errors are not even looked at.
A failed sector is one on which a successful write and read cannot be completed after 200 attempts to write to the sector.
A failure means that the data read from the sector is not the same as the data written to it.
The Intel drive was classified as failed the instant that 90% of the drive's sectors had failed.
Closer analysis shows that less than 1% of the data read back from failed sectors matches what was actually written, and that errors tended to start collecting rapidly near the end of the drive's life. It performed quite well until the first sector failure; after that, the drive died within a couple more days of testing. The second drive is currently at 42% failed and is expected to die by tomorrow night.
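A rough user-space sketch of the write/read/verify rules described above (4KB sectors, 200 attempts per sector, 90% failure threshold). nn_step's actual test is a kernel module, so the function names, the O_SYNC open and the buffer handling below are placeholders, not his code.

[CODE]
/* Sketch of the per-sector write/read/verify logic; assumptions, not nn_step's module. */
#include <sys/types.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SECTOR_SIZE   4096      /* "sector" = a continuous 4KB block of flash, per the post */
#define MAX_ATTEMPTS  200       /* a sector is failed after 200 unsuccessful write/read attempts */

/* Write a pattern to one 4KB sector and read it back; return 1 if the data matches. */
static int write_read_verify(int fd, off_t offset, const uint8_t *pattern, uint8_t *readback)
{
    if (pwrite(fd, pattern, SECTOR_SIZE, offset) != SECTOR_SIZE)
        return 0;
    if (pread(fd, readback, SECTOR_SIZE, offset) != SECTOR_SIZE)
        return 0;
    return memcmp(pattern, readback, SECTOR_SIZE) == 0;
}

/* A sector is considered failed when no attempt out of MAX_ATTEMPTS verifies. */
static int sector_failed(int fd, off_t offset, const uint8_t *pattern, uint8_t *readback)
{
    for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++)
        if (write_read_verify(fd, offset, pattern, readback))
            return 0;
    return 1;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <device> <sector_count>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    long nsectors = atol(argv[2]);
    long failed = 0;

    uint8_t *pattern  = malloc(SECTOR_SIZE);
    uint8_t *readback = malloc(SECTOR_SIZE);
    for (int i = 0; i < SECTOR_SIZE; i++)
        pattern[i] = rand() & 0xff;  /* placeholder for the high-entropy data */

    for (long s = 0; s < nsectors; s++)
        failed += sector_failed(fd, (off_t)s * SECTOR_SIZE, pattern, readback);

    /* The drive is declared dead once 90% of its sectors have failed. */
    printf("failed sectors: %ld / %ld (%.1f%%)\n",
           failed, nsectors, 100.0 * failed / nsectors);
    if (failed * 10 >= nsectors * 9)
        printf("drive classified as FAILED\n");

    close(fd);
    free(pattern);
    free(readback);
    return 0;
}
[/CODE]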
Could you also post a screenshot with SMART parameters for the failed drive and the drive which is expected to fail in the next few hours? By sector I guess you are referring to a 512-byte LBA, right? Also, have you tried letting the failed drive sit idle and then trying again? In some document posted somewhere at the beginning of the thread I saw a wear model that predicted much higher endurance if some idle time is taken between consecutive writes.
The experiment is being done on a Linux box that does not have Xorg installed.
I classify a sector as a continuous 4KB block of flash.
I will retest the drive in a few moments to check whether the 90% sector failure figure still holds.
However, the statement "much higher endurance if some idle time is taken between consecutive writes" would just mean endurance is better if you don't write to the drive much. (Duh, and your car will not run out of gas as quickly if you don't drive it much.)
@nn_step, at what write size did your first Intel drive fail (the one in the quote below)?
Quote:
Intel 510 Series (Elm Crest) SSDSC2MH120A2K5 - One drive failed completely July 24 @ 9:14am and second drive is throwing failures but has yet to fail.
@nm... your data is very vague. A lack of data would be a better way to classify it.
Can you give us some specifics? Amount of data written, time elapsed, etc.?
It's hardly helping us come to some sort of understanding when all you say is: there was a failure.
Where, when, how, under what conditions? After what duration?
Assuming you wrote data continuously since 24 May 2011 at 50MiB/s to the Intel 510, that translates to 61*86400*50MiB = ~251.3TiB. If WA is around 1.1, like on other models in the endurance test, then this means around 2150 cycles.
Also, according to the endurance model posted by Ao1: http://www.xtremesystems.org/forums/...=1#post4861258 , if the theory about recovery time proves to be right, then we should see many more cycles (at 50MiB/s it would take around 2000-2500 seconds between consecutive writes to the same page).
Could there be other factors that are killing the Intel drives so early (like a faulty power supply or SATA issues)? It's hard for me to believe that both drives are failing so fast and so close to each other.
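Just to make the arithmetic above explicit, here is the same estimate as a few lines of C. The 61 days, 50MiB/s, 1.1x WA and 128GiB of raw NAND are the assumptions from this post, not measured numbers; with them the figure lands around 2200 cycles, the same ballpark as the ~2150 above (the exact number shifts a bit depending on how you mix TiB/TB and GiB/GB).

[CODE]
/* Back-of-the-envelope cycle estimate; all inputs are assumptions from the post above. */
#include <stdio.h>

int main(void)
{
    double days        = 61.0;    /* 24 May -> 24 July */
    double rate_mib_s  = 50.0;    /* constant write rate */
    double wa          = 1.1;     /* assumed write amplification */
    double nand_gib    = 128.0;   /* assumed raw NAND in the 120GB Intel 510 */

    double host_tib = days * 86400.0 * rate_mib_s / (1024.0 * 1024.0);
    double cycles   = host_tib * 1024.0 * wa / nand_gib;

    printf("host writes: %.1f TiB\n", host_tib);    /* ~251.3 TiB */
    printf("estimated P/E cycles: %.0f\n", cycles); /* ~2200 */
    return 0;
}
[/CODE]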
Will have a Vertex 2 60GB with LTT removed entering the testing within a week or two :) It's a V2 with 32nm Hynix NAND though :eh:
C300 update and updated charts later today :)
WOW nice to see that this thread really is starting to pick up some speed guys.
Thanks for all the hard work everyone !
241TiB. 31 reallocated sectors. MD5 OK. I THINK I found the hidden drive wear variable using Anvil's app (his special build for me). It is 120 right now and it is going up linearly.
C300 Update, charts next post
103.64TiB, 1750 raw wear, 65 MWI, reallocated still at 1 event / 2048 sectors, speed back up to 61.75MiB/sec, MD5 OK.
Updated charts :)
Host Writes So Far
Attachment 118204
Attachment 118205
(bars with a border = testing stopped/completed)
Raw data graphs
Writes vs. Wear:
Attachment 118206
MWI Exhaustion:
Attachment 118207
Writes vs. NAND Cycles:
Attachment 118208
Attachment 118209
Normalized data graphs
The SSDs are not all the same size; these charts normalize for available NAND capacity.
Writes vs. Wear:
Attachment 118210
MWI Exhaustion:
Attachment 118211
Write-days data graphs
Not all SSDs write at the same speed; these charts factor out write speed and look at endurance as a function of time.
Writes vs. Wear:
Attachment 118212
MWI Exhaustion:
Attachment 118213
Approximate Write Amplification
Based on reported or calculated NAND cycles from wear SMART values divided by total writes.
Attachment 118214
If the current value is 120 and it's the same thing as 100 - MWI (and MWI has gone negative), then your WA has dipped below its normal ~1.015x... and even well below 1.00x.
Reported NAND Cycles / Calculated NAND Cycles via manual writing:
( 120 / 100 * 5000 ) / ( 241 / 40 * 1024) = .9725x WA
:(
Reallocated sectors seem to have been moving linearly for the 320 recently, maybe it's related to that? Or maybe it is wear, but not comparable to MWI? :shrug:
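The calculation two lines up, written out as a tiny helper so the inputs are explicit. The 5000 rated P/E cycles, the 40GiB of NAND and the reading of the raw 120 value as 100 - MWI are taken from the posts above; treat them as the posters' working assumptions rather than anything confirmed by Intel.

[CODE]
/* WA estimate from the SMART wear value vs. host writes; inputs are the posters' assumptions. */
#include <stdio.h>

/* reported cycles:   (wear_value / 100) * rated P/E
 * calculated cycles: host TiB written * 1024 / NAND GiB */
static double write_amplification(double wear_value, double rated_pe,
                                  double host_tib, double nand_gib)
{
    double reported   = wear_value / 100.0 * rated_pe;
    double calculated = host_tib * 1024.0 / nand_gib;
    return reported / calculated;
}

int main(void)
{
    /* numbers from the post: wear value 120, 5000 rated cycles,
       241 TiB of host writes on a 40GiB drive -> prints 0.9725 */
    printf("WA = %.4f\n", write_amplification(120.0, 5000.0, 241.0, 40.0));
    return 0;
}
[/CODE]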
What exactly do you mean by write size?
The amount of data written is easy to calculate given the posted times and fixed data rates.
My first post in regards to this test lists exactly the starting time of the test.
Where : 70 degree F basement (mine)
When : see above
How : Kernel module that I wrote
Conditions : All drives are written the exact same data at the exact same time; the data is random with high entropy.
Duration : see posted times above
One that I wrote myself; its only assumptions are the RAID cards being used and the timing chip that I am using.
Yes, there certainly are other factors that could have caused earlier failure:
1) Intel drives are closer to ventilation than other drives.
2) Intel drives received 3% less sunlight than the other drives
3) Intel drives are connected to the leftmost power connector of the power supplies
4) The failed Intel drives have sequential serial numbers and could have been part of a bad batch
But I am continuing to check for other additional reasons for the failures.
The data is random with high entropy, and the RAID cards have no problems sustaining the write/read rates.
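For what it's worth, one straightforward way to get a high-entropy (and therefore effectively incompressible) write buffer on a Linux box is to pull it from /dev/urandom. Whether nn_step's module actually does it this way isn't stated, so the sketch below is only an illustration.

[CODE]
/* Illustration only: fill a 4KB write buffer with high-entropy bytes from /dev/urandom. */
#include <stdio.h>

#define SECTOR_SIZE 4096

static int fill_random_sector(unsigned char *buf)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (!f)
        return -1;
    size_t got = fread(buf, 1, SECTOR_SIZE, f);
    fclose(f);
    return got == SECTOR_SIZE ? 0 : -1;
}

int main(void)
{
    unsigned char buf[SECTOR_SIZE];
    if (fill_random_sector(buf) != 0) {
        fprintf(stderr, "could not read /dev/urandom\n");
        return 1;
    }
    /* A buffer like this is essentially incompressible, so compressing
       controllers cannot reduce the amount of NAND actually written. */
    printf("first bytes: %02x %02x %02x %02x\n", buf[0], buf[1], buf[2], buf[3]);
    return 0;
}
[/CODE]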
After 12 hours of off time, the Intel drive still has in excess of 90% of sectors failed.