183.23TB Host writes
MWI 1
Reallocated sectors : 6
MD5 OK.
C300 Update
98.07 TiB, 67 MWI, 1656 raw wear, MD5 OK, reallocated events still at 1 / 2048 sectors. Switched to an updated version of Anvil's app; average speed over the past 8 hours is down to 60.5MiB/sec from ~61.85.
First update:
OCZ Agility 3 AGT3-25SAT3-120G - A couple of minor sector failures on one of the drives.
Corsair Force CSSD-F120GB2-BRKT - Nothing yet to report
OCZ Vertex 2 OCZSSD2-2VTXE60G - Nothing yet to report
Corsair Performance 3 Series CSSD-P3128GB2-BRKT - Both drives are throwing failures but are still functional
Crucial RealSSD C300 CTFDDAC064MAG-1G1 - Nothing yet to report
SAMSUNG 470 Series MZ-5PA128/US - Nothing yet to report
Intel 510 Series (Elm Crest) SSDSC2MH120A2K5 - One drive failed completely July 24 @ 9:14am and second drive is throwing failures but has yet to fail.
Intel X25-M SSDSA2MH160G2K5 - Nothing yet to report
Kingston SSDNow V+ Series SNVP325-S2B/128GB - A couple of minor sector failures on one of the drives.
-------- End of line ----------
Any further questions?
Today's update:
147.1145 TiB
494 hours
Avg speed 89.77 MiB/s.
AD gone from 17 to 15.
P/E 2564.
MD5 OK.
Attachment 118134
184.88TB Host writes
MWI 1 (stopped moving)
Reallocated sectors : 6
MD5 OK, 35.62MiB/s avg (~17 hours)
http://www.xtremesystems.org/forums/...=1#post4854810
The short version is that I am using a kernel-level module to continuously write sectors to SSDs and read them back to check for errors. Throwing failures means that a write/read failed (i.e. the data read does not match the data written). All sectors receive an equal number of writes. Once 90% of sectors fail, the drive is considered failed.
All drives are receiving an exactly equal distribution of writes at a constant 50MB/s.
SMART errors are not even looked at.
A failed sector is one on which a successful write and read cannot be completed after 200 attempts to write to the sector.
A failure means that the data read from the sector is not the same as the data written to it.
The Intel drive was classified as failed the instant that 90% of the drive's sectors had failed.
Closer analysis shows that less than 1% of the data read back from failed sectors matches what was actually written, and that errors tended to start collecting rapidly near the end of the drive's life. It performed quite well until the first sector failure; after that, the drive died within a couple more days of testing. The second drive is currently at 42% failed and is expected to die by tomorrow night.
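A rough user-space sketch of the write/read/verify rules described above (4KB sectors, 200 attempts per sector, 90% failure threshold). nn_step's actual test is a kernel module, so the function names, the O_SYNC open and the buffer handling below are placeholders, not his code.

[CODE]
/* Sketch of the per-sector write/read/verify logic; assumptions, not nn_step's module. */
#include <sys/types.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SECTOR_SIZE   4096      /* "sector" = a continuous 4KB block of flash, per the post */
#define MAX_ATTEMPTS  200       /* a sector is failed after 200 unsuccessful write/read attempts */

/* Write a pattern to one 4KB sector and read it back; return 1 if the data matches. */
static int write_read_verify(int fd, off_t offset, const uint8_t *pattern, uint8_t *readback)
{
    if (pwrite(fd, pattern, SECTOR_SIZE, offset) != SECTOR_SIZE)
        return 0;
    if (pread(fd, readback, SECTOR_SIZE, offset) != SECTOR_SIZE)
        return 0;
    return memcmp(pattern, readback, SECTOR_SIZE) == 0;
}

/* A sector is considered failed when no attempt out of MAX_ATTEMPTS verifies. */
static int sector_failed(int fd, off_t offset, const uint8_t *pattern, uint8_t *readback)
{
    for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++)
        if (write_read_verify(fd, offset, pattern, readback))
            return 0;
    return 1;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <device> <sector_count>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    long nsectors = atol(argv[2]);
    long failed = 0;

    uint8_t *pattern  = malloc(SECTOR_SIZE);
    uint8_t *readback = malloc(SECTOR_SIZE);
    for (int i = 0; i < SECTOR_SIZE; i++)
        pattern[i] = rand() & 0xff;  /* placeholder for the high-entropy data */

    for (long s = 0; s < nsectors; s++)
        failed += sector_failed(fd, (off_t)s * SECTOR_SIZE, pattern, readback);

    /* The drive is declared dead once 90% of its sectors have failed. */
    printf("failed sectors: %ld / %ld (%.1f%%)\n",
           failed, nsectors, 100.0 * failed / nsectors);
    if (failed * 10 >= nsectors * 9)
        printf("drive classified as FAILED\n");

    close(fd);
    free(pattern);
    free(readback);
    return 0;
}
[/CODE]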
Could you also post a screenshot with SMART parameters for the failed drive and the drive which is expected to fail in the next few hours? By sector I guess you are referring to a 512-byte LBA, right? Also, have you tried letting the failed drive sit idle and then trying again? In some document posted somewhere at the beginning of the thread I saw a wear model that predicted much higher endurance if some idle time is taken between consecutive writes.
The experiment is being done on a Linux box that does not have Xorg installed.
I classify a sector as a continuous 4KB block of flash.
I will retest the drive in a few moments to check whether the 90% sector failure figure still holds.
However, the statement "much higher endurance if some idle time is taken between consecutive writes" would just mean endurance is better if you don't write to the drive much. (Duh, and your car will not run out of gas as quickly if you don't drive it much.)
@nn_step, at what write size did your first Intel drive fail (the one in the quote below)?
Quote:
Intel 510 Series (Elm Crest) SSDSC2MH120A2K5 - One drive failed completely July 24 @ 9:14am and second drive is throwing failures but has yet to fail.
@nm... your data is very vague. A lack of data would be a better way to classify it.
Can you give us some specifics? Amount of data written, time elapsed, etc.?
It's hardly helping us come to some sort of understanding when all you say is: there was a failure.
Where, when, how, under what conditions? After what duration?
Assuming you wrote data continuously since 24 May 2011 at 50MiB/s to the Intel 510, that translates to 61*86400*50MiB = ~251.3TiB. If WA is around 1.1, like on other models in the endurance test, then this means around 2150 cycles.
Also, according to the endurance model posted by Ao1: http://www.xtremesystems.org/forums/...=1#post4861258 , if the theory about recovery time proves to be right, then we should see many more cycles (at 50MiB/s it would take around 2000-2500 seconds between consecutive writes to the same page).
Could there be other factors that are killing the Intel drives so early (like a faulty power supply or SATA issues)? It's hard for me to believe that both drives are failing so fast and so close to each other.
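Just to make the arithmetic above explicit, here is the same estimate as a few lines of C. The 61 days, 50MiB/s, 1.1x WA and 128GiB of raw NAND are the assumptions from this post, not measured numbers; with them the figure lands around 2200 cycles, the same ballpark as the ~2150 above (the exact number shifts a bit depending on how you mix TiB/TB and GiB/GB).

[CODE]
/* Back-of-the-envelope cycle estimate; all inputs are assumptions from the post above. */
#include <stdio.h>

int main(void)
{
    double days        = 61.0;    /* 24 May -> 24 July */
    double rate_mib_s  = 50.0;    /* constant write rate */
    double wa          = 1.1;     /* assumed write amplification */
    double nand_gib    = 128.0;   /* assumed raw NAND in the 120GB Intel 510 */

    double host_tib = days * 86400.0 * rate_mib_s / (1024.0 * 1024.0);
    double cycles   = host_tib * 1024.0 * wa / nand_gib;

    printf("host writes: %.1f TiB\n", host_tib);    /* ~251.3 TiB */
    printf("estimated P/E cycles: %.0f\n", cycles); /* ~2200 */
    return 0;
}
[/CODE]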
Will have a Vertex 2 60GB with LTT removed entering the testing within a week or two :) It's a V2 with 32nm Hynix NAND though :eh:
C300 update and updated charts later today :)
WOW nice to see that this thread really is starting to pick up some speed guys.
Thanks for all the hard work everyone !
241TiB. 31 reallocated sectors. MD5 OK. I THINK I found the hidden drive wear variable using Anvil's app (his special build for me). It is 120 right now and it is going up linearly.
C300 Update, charts next post
103.64TiB, 1750 raw wear, 65 MWI, reallocated still at 1 event / 2048 sectors, speed back up to 61.75MiB/sec, MD5 OK.
Updated charts :)
Host Writes So Far
Attachment 118204
Attachment 118205
(bars with a border = testing stopped/completed)
Raw data graphs
Writes vs. Wear:
Attachment 118206
MWI Exhaustion:
Attachment 118207
Writes vs. NAND Cycles:
Attachment 118208
Attachment 118209
Normalized data graphs
The SSDs are not all the same size; these charts normalize for available NAND capacity.
Writes vs. Wear:
Attachment 118210
MWI Exhaustion:
Attachment 118211
Write-days data graphs
Not all SSDs write at the same speed; these charts factor out write speed and look at endurance as a function of time.
Writes vs. Wear:
Attachment 118212
MWI Exhaustion:
Attachment 118213
Approximate Write Amplification
Based on reported or calculated NAND cycles from wear SMART values divided by total writes.
Attachment 118214
If the current value is 120 and it's the same thing as 100 - MWI (and MWI has gone negative), then your WA has dipped below its normal ~1.015x... and even well below 1.00x.
Reported NAND Cycles / Calculated NAND Cycles via manual writing:
( 120 / 100 * 5000 ) / ( 241 / 40 * 1024) = .9725x WA
:(
Reallocated sectors seem to have been moving linearly for the 320 recently, maybe it's related to that? Or maybe it is wear, but not comparable to MWI? :shrug:
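The calculation two lines up, written out as a tiny helper so the inputs are explicit. The 5000 rated P/E cycles, the 40GiB of NAND and the reading of the raw 120 value as 100 - MWI are taken from the posts above; treat them as the posters' working assumptions rather than anything confirmed by Intel.

[CODE]
/* WA estimate from the SMART wear value vs. host writes; inputs are the posters' assumptions. */
#include <stdio.h>

/* reported cycles:   (wear_value / 100) * rated P/E
 * calculated cycles: host TiB written * 1024 / NAND GiB */
static double write_amplification(double wear_value, double rated_pe,
                                  double host_tib, double nand_gib)
{
    double reported   = wear_value / 100.0 * rated_pe;
    double calculated = host_tib * 1024.0 / nand_gib;
    return reported / calculated;
}

int main(void)
{
    /* numbers from the post: wear value 120, 5000 rated cycles,
       241 TiB of host writes on a 40GiB drive -> prints 0.9725 */
    printf("WA = %.4f\n", write_amplification(120.0, 5000.0, 241.0, 40.0));
    return 0;
}
[/CODE]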
What exactly do you mean by write size?
The amount of data written is easy to calculate given the posted times and fixed data rates.
My first post in regards to this test lists exactly the starting time of the test.
Where : 70 degree F basement (mine)
When : see above
How : Kernel module that I wrote
Conditions : All drives are written the exact same data at the exact same time; the data is random with high entropy.
Duration : see posted times above
One that I wrote myself; its only assumptions are the RAID cards being used and the timing chip that I am using.
Yes, there certainly are other factors that could have caused earlier failure:
1) Intel drives are closer to ventilation than other drives.
2) Intel drives received 3% less sunlight than the other drives
3) Intel drives are connected to the leftmost power connector of the power supplies
4) The failed Intel drives have sequential serial numbers and could have been part of a bad batch
But I am continuing to check for other additional reasons for the failures.
The data is random with high entropy, and the RAID cards have no problems sustaining the write/read rates.
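For what it's worth, one straightforward way to get a high-entropy (and therefore effectively incompressible) write buffer on a Linux box is to pull it from /dev/urandom. Whether nn_step's module actually does it this way isn't stated, so the sketch below is only an illustration.

[CODE]
/* Illustration only: fill a 4KB write buffer with high-entropy bytes from /dev/urandom. */
#include <stdio.h>

#define SECTOR_SIZE 4096

static int fill_random_sector(unsigned char *buf)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (!f)
        return -1;
    size_t got = fread(buf, 1, SECTOR_SIZE, f);
    fclose(f);
    return got == SECTOR_SIZE ? 0 : -1;
}

int main(void)
{
    unsigned char buf[SECTOR_SIZE];
    if (fill_random_sector(buf) != 0) {
        fprintf(stderr, "could not read /dev/urandom\n");
        return 1;
    }
    /* A buffer like this is essentially incompressible, so compressing
       controllers cannot reduce the amount of NAND actually written. */
    printf("first bytes: %02x %02x %02x %02x\n", buf[0], buf[1], buf[2], buf[3]);
    return 0;
}
[/CODE]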
After 12 hours of off time, the Intel drive still has in excess of 90% of sectors failed.