Results 26 to 45 of 45

Thread: The life expectancy of SSD's

  1. #26
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
SiliconSystems provides a tool (SiSmart) that monitors the write/erase cycles of each block and the spare blocks. It can be downloaded if you register as a partner on their portal.

    http://www.siliconsystems.com/index.aspx

It looks like it only works on their drives, as it crashes as soon as I try to open it... but maybe that's because I'm running Windows 7 x64.

It's only a small download (97 KB). I don't want to attach it here for copyright reasons. That said, it only takes a couple of seconds to register to get your hands on it.

    Anyone else fancy giving it a shot?

  2. #27
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Quote Originally Posted by m^2 View Post
    1. Home users don't write 100 or even 20 GB every day. I doubt that more than 0.01% write more than 1 GB on average.
2. 10K writes per cell means that 98% of cells will last at least that long. Some longer, even much longer. In the case of Texas Memory Systems' SLC, up to 5 times the declared life, and overall more than what's stated.
    3. this and this.
    4. I wouldn't trust Intel marketing claims referring to competing products.
    Here is a quote from Intel:

    "there are tradeoffs when you design with MLC, one of which is reliability. An SLC device has an order-of-magnitude greater cycling capability. Winslow noted that the standard number of cycles for SLC NAND flash device can attain is 100,000 cycles (a program and erase is one cycle), whereas in MLC device only achieves ~ 5000-10000 cycles. "So when you put an MLC device into a computing solution, and if you kept everything constant, the SLC SSD gives you 10 × greater write cycling before the device wears out,"

Winslow told SST that Intel was able to achieve an MLC SSD with high performance and long life. "We can provide at least a five-year useful life in a consumer notebook and that specification far exceeds the user model that we're benchmarking against," he said. "We've never seen a user write more than 20GB/day every day for five years. Our MLC SSD can allow a user to write 100GB/day every day for five years without wearing out the drive, so we've been able to establish that our SSDs - whether SLC or MLC - far exceed the reliability required by the usage models of the end applications."

Key to achieving performance commensurate with the needs of mobile and PC applications is that the architecture of Intel's SATA SSDs uses a full 10 channels of flash through the SATA bus, vs. four or six channels used by other SSD manufacturers, Winslow told SST. "So we've just expanded the 'freeway' of data travel to a full 10 lanes." But that creates another challenge, he said: how to manage that massive throughput and handle the firmware to respond to a host that is continually asking for a wide variety of reads/writes - small, large, sustained, random, different block sizes, etc.?

The software and firmware used in the Intel SSDs moves the blocks around such that no block gets more worn out than any other - "very similar to rotating the tires on your car," Winslow explained. "We move the blocks to ensure that all of them are evenly worn to within a 4% delta. It's a very complicated mechanism in the actual SSD controller to do this." The other piece of the solution that Intel uses is the principle of write amplification. "We've figured out a way to program and erase just what is necessary (in small bytes rather than large blocks) so our drive has tremendous efficiency for the life of the computing environment."
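To illustrate the "rotating the tires" idea, here is a toy Python sketch (this is not Intel's actual controller logic; the block count and erase totals are made up for illustration):

```python
# Toy wear-levelling sketch of the "rotate the tires" idea: always
# erase/program the least-worn block, then check how tightly the
# erase counts cluster. Illustrative only - not Intel's algorithm.

def run_wear_levelling(num_blocks=50, total_erases=20_000):
    erase_counts = [0] * num_blocks
    for _ in range(total_erases):
        victim = erase_counts.index(min(erase_counts))  # least-worn block
        erase_counts[victim] += 1
    return erase_counts

counts = run_wear_levelling()
spread = (max(counts) - min(counts)) / max(counts)
print(f"wear delta: {spread:.1%}")  # 0.0% here - well inside a 4% delta
```

A real controller can't pick the least-worn block for free (it has to remap logical addresses and relocate static data), but the goal is the same: keep the erase-count spread tiny so no block wears out early.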


    This is what the Intel MLC data sheet says:

    3.5.4 Minimum Useful Life

    A typical client usage of 20 GB writes per day is assumed. Should the host system attempt to exceed 20 GB writes per day by a large margin for an extended period, the drive will enable the endurance management feature to adjust write performance. By efficiently managing performance, this feature enables the device to have, at a minimum, a five year useful life. Under normal operation conditions, the drive will not invoke this feature.



It would seem that Intel have introduced a mechanism that is going to make it very hard to exceed 20 GB writes per day. Compare this to the Intel SLC data sheet, which simply states:

    3.5.4 Write Endurance
    The drive supports 1 petabyte of lifetime random writes.
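The arithmetic behind the claims quoted above is easy to check (a quick sketch; figures taken from the quotes, decimal gigabytes assumed):

```python
# Back-of-envelope check of the endurance figures quoted above.
GB = 10**9

# Intel's consumer usage model: 20 GB/day for five years.
usage_model = 20 * GB * 365 * 5        # 36.5 TB
# Winslow's claim: the MLC drive survives 100 GB/day for five years.
mlc_claim = 100 * GB * 365 * 5         # 182.5 TB
# SLC data sheet: 1 petabyte of lifetime random writes.
slc_rating = 10**15

print(usage_model / 10**12)            # 36.5 (TB)
print(mlc_claim / 10**12)              # 182.5 (TB)
print(slc_rating / mlc_claim)          # ~5.5x headroom over even the MLC claim
```

So even the aggressive 100 GB/day figure is well under the SLC drive's 1 PB rating, which fits the "far exceed the usage models" claim.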

  3. #28
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Quote Originally Posted by Halk View Post
    If Intel is suggesting their write amplification is 1.1 compared to everybody else at 10, then I simply don't believe them. The minimum possible is 1. Suggesting that Intel have managed to make theirs 100 times better than the competition sounds a little bit optimistic on Intel's part.
Can't remember where I picked this up, but it's interesting.

All drives have a worst-case limit when the drive is nearly full. In that case, the write of a single 512-byte logical block will result in at least one NAND page being written. With a page size of 4K, the write amplification must, out of necessity, be 4 Kbytes / 512 bytes = 8. However, most SSD vendors report something much closer to one, which would be the case for an empty drive and larger data transfer sizes.
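The worst case described above can be sketched as (the 4K page size is the one assumed in the post):

```python
import math

# Worst-case write amplification for a small logical write: a full
# NAND page must be programmed regardless of how few bytes the host
# actually sent.
def worst_case_wa(io_bytes, page_bytes=4096):
    pages = math.ceil(io_bytes / page_bytes)
    return pages * page_bytes / io_bytes

print(worst_case_wa(512))    # 8.0 -> one 512-byte sector into a 4K page
print(worst_case_wa(4096))   # 1.0 -> page-aligned writes amplify nothing
```

Which is why vendor figures near 1 usually imply large, aligned transfers rather than the small random writes a nearly full drive sees.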

  4. #29
    SLC
    Join Date
    Oct 2004
    Location
    Ottawa, Canada
    Posts
    2,795
    Quote Originally Posted by audienceofone View Post
    It would seem that Intel have introduced a mechanism that is going to make it very hard to exceed 20 GB writes per day. Compare this to the Intel SLC data sheet it simple states:

    3.5.4 Write Endurance
    The drive supports 1 petabyte of lifetime random writes.
    That's interesting because 1 petabyte would mean about 3Xk writes on average to every cell on the 32GB drive (assuming perfect wear leveling). This is also the official Intel stat so it is very likely heavily exaggerated and it is still far from the 100k value that everyone seems to be buzzing about. This is on an SLC drive that is highly optimized for server work/longevity...

    If you project these stats onto a 30GB consumer MLC drive, it is not very pretty. 1 petabyte divided by 10 due to MLC/SLC differences, divided by some intel exaggeration factor, divided by a higher amount of write amplification and poorer wear leveling algorithms of the consumer drive...
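The per-cell arithmetic works out as follows (a sketch assuming perfect wear levelling; the ×10 MLC divisor is the one suggested above, and the further "exaggeration" and write-amplification divisors are left out because there are no numbers for them):

```python
# Average program/erase cycles per cell implied by the data-sheet
# figures quoted above, assuming perfect wear levelling.
PB = 10**15
capacity = 32 * 10**9          # 32 GB SLC drive

slc_cycles = PB / capacity
print(round(slc_cycles))       # 31250 -> the "3Xk" figure, far below 100k

# Speculative projection onto a consumer MLC drive using the post's
# own 10:1 SLC-to-MLC endurance ratio (a guess, not a measurement).
mlc_cycles = slc_cycles / 10
print(round(mlc_cycles))       # 3125 cycles per cell, before the other divisors
```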

  5. #30
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
Just like the deceptive way that SSDs are sold on sequential read/write speeds, you can guarantee that write amplification specs are based on empty drives and large data transfers.

It will be interesting to see if the Vertex EX states what its write endurance specs will be, especially if it is based on the Vertex MLC controller.

  6. #31
    Xtreme Addict
    Join Date
    Jun 2004
    Location
    near Boston, MA, USA
    Posts
    1,955
Since these drives are small in size, it'll be interesting to see the lifetime of moderately full drives in end users' hands over time. Love the speed of SSDs, but I'm still leery of the durability claims.

  7. #32
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Just re-read an old Tom's review of the X25-M. I hadn't picked up before that the DRAM was only for write amplification control. Same for the E.

    This is the first flash SSD to implement NCQ, as it typically doesn’t make too much sense on a drive that allows direct access anyway. However, the DRAM buffer is not there to increase performance or to service NCQ for performance reasons. Rather, it is necessary to support write amplification control, which essentially is Intel’s attempt to improve performance while increasing life expectancy.

  8. #33
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
Picked this up from a German web site. It's readable once translated.

    "MLC is significantly more vulnerable to voltage fluctuations, and in spite of ECC errors are not excludable. The cause lies in the different structure of MLC and SLC memory and particularly in the materials used. The probability of an MLC error is three times higher than for SLC."

It goes on to imply that voltage fluctuations detrimentally affect the controller, so that is a double whammy for longevity.

    http://209.85.229.132/translate_c?hl...WQ7A6Q8ZB7Azsw

  9. #34
    Xtreme Addict
    Join Date
    Mar 2008
    Posts
    1,163
    Quote Originally Posted by audienceofone View Post
    The probability of an MLC error is three times higher than for SLC."
    Really, it's not that simple. You can find good info about NAND error rates here:
    http://www.developersolutions.org/fo...hite_Paper.pdf

  10. #35
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Quote Originally Posted by m^2 View Post
    Really, it's not that simple. You can find good info about NAND error rates here:
    http://www.developersolutions.org/fo...hite_Paper.pdf
Thank you for the link; I enjoyed reading it. I also found out a bit more on the issue here: http://www.storagesearch.com/ssd-slc-mlc-notes.html

    I did not want to sound alarmist in what I wrote as I agree there is a lot of advanced technology in SSD to combat these issues.

  11. #36
    Xtreme Member
    Join Date
    Dec 2008
    Posts
    290
I'm judging this information based on my Acronis True Image recovery partition. I have it set to run nightly incremental backups of my OS. It backs up everything that has changed (been written) since the last image was created. Since April 18th, I only have 13 GB of backups. Extrapolating, that's roughly 20GB a month, generously. Factor in everything that might have been written but didn't change (benchmarks, or whatever it may be), maybe 100GB in a month. That's still 3-4 GB a day. Lifespan shouldn't be a concern for most people.
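Putting numbers on that extrapolation (a sketch; the 17-day window is inferred from the thread's timestamps, and the ×5 fudge factor is the post's own allowance for writes that never made it into an incremental image):

```python
from datetime import date

# 13 GB of incremental backups accumulated since April 18th,
# measured around the time of the post (early May 2009).
backed_up_gb = 13
days = (date(2009, 5, 5) - date(2009, 4, 18)).days   # 17 days

per_month = backed_up_gb / days * 30
print(round(per_month, 1))        # ~22.9 GB/month: "roughly 20GB, generously"

with_fudge = per_month * 5        # allow 5x for overwritten/temp data
print(round(with_fudge / 30, 1))  # ~3.8 GB/day, i.e. the post's 3-4 GB figure
```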

  12. #37
    Registered User
    Join Date
    Oct 2007
    Location
    CT, USA
    Posts
    29
    Quote Originally Posted by Telperion View Post
    I'm judging this information based on my Acronis True Image recovery partition. I have it set to run nightly incremental backups of my OS. It backs up everything that has changed (Been written) since the last image was created. Since April 18th, I only have 13 GB of backups. Extrapolating, that's roughly 20GB a month, generously. Factor in everything that might have been written but didn't change (Benchmarks, or whatever it may be), maybe 100GB in a month. That's still 3-4 GB a day. Lifespan shouldn't be a concern for most people.
That's a good point, but don't forget that Acronis applies compression during the creation of each image. Depending on the types of files that are changing, and therefore being backed up in each incremental image, you will see different degrees of compression. But your point is good and gave me at least a ballpark for my company's (12 employees) small business server, which hosts a few SQL databases, is a file server and hosts a lightly used website. My daily incremental images usually amount to 600MB. I think the X25-M SSD that does all that will have long been replaced before it hits the MTBF.

Besides, I'd have all those backup images anyway, right Telperion?

    As a side note, thanks to those who post in the Storage forum as it helped me ultimately decide on a single X25M for our server instead of upgrading the old hard drives and purchasing a hardware controller for our then software based RAID 10. In the end, it seemed a single SSD would be as or more dependable than the software RAID array and leave less to deal with when it came to performance, noise, heat, electricity, drive and card cost, drives failing and my favorite: accidentally pulling out the wrong power cord causing the raid to freak.
    Last edited by bigretard21; 05-05-2009 at 04:52 PM.

  13. #38
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    597
    Quote Originally Posted by bigretard21 View Post
    Thanks to those who post in the Storage forum as it helped me ultimately decide on a single X25M for our server instead of upgrading the old hard drives and purchasing a hardware controller for our then software based RAID 10. In the end, it seemed a single SSD would be as or more dependable than the software RAID array and leave less to deal with when it came to performance, noise, heat, electricity, drive and card cost, drives failing and my favorite: accidentally pulling out the wrong power cord causing the raid to freak.
    Lol!

Telperion, did you use compression for your images?
Even if you did, it proves the average user won't even have to think about the endurance of his flash; no need to move page files, temp files, IE files and the like.

People seem more concerned about write endurance than controller failure, which is the main, if not the only, cause of actual SSD failures to date.

    What we really need is the MTBF of SSD controllers!
That's where I think the biggest likelihood of catastrophic data loss is.

    But again this issue does not concern me anymore than HDD failure.

  14. #39
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Quote Originally Posted by Telperion View Post
    I'm judging this information based on my Acronis True Image recovery partition. I have it set to run nightly incremental backups of my OS. It backs up everything that has changed (Been written) since the last image was created. Since April 18th, I only have 13 GB of backups. Extrapolating, that's roughly 20GB a month, generously. Factor in everything that might have been written but didn't change (Benchmarks, or whatever it may be), maybe 100GB in a month. That's still 3-4 GB a day. Lifespan shouldn't be a concern for most people.
Not really relevant if you are only at 20GB writes per month... but out of interest, does this not take into account writes that occur between backups? For example, if you work with Photoshop, does it take into account the writes/saves that occur during the day, or is it just taking a snapshot of the end changes before backup?

    If I use the Win 7 resource monitor it tells me that the system generates 1.5GB of writes per week with no process running in the background if I left the pc on 24/7...which I don't.

Either way, I guess it's kind of academic compared to what a good SSD can accommodate. As others have stated, the controllers seem to be the more likely cause of failure, and maybe that is why warranties are limited in comparison to HDDs.

    For sure MTBF is of absolutely no use to determine failure risk for SSD other than possibly for server applications. I've seen different suppliers quoting up to 500,000 hours difference in MTBF based on the same controller design. High MTBF backed with a short warranty = no confidence in how the MTBF was calculated.

    EDIT: A quote from Michael Yang/ Samsung "when failures do occur, they typically occur in the controller silicon, not in the flash device itself."

    And another quote from Yang: "a pattern could be perpetually repeated in which a 64GB SSD is completely filled with data, erased, filled again, then erased again every hour of every day for years, and the user still wouldn’t reach the theoretical write limit. He added that if a failure ever does occur, it will not occur in the flash chip itself but in the controller."
    Last edited by Ao1; 05-06-2009 at 05:47 AM.

  15. #40
    Xtreme Member
    Join Date
    Dec 2008
    Posts
    290
    My 20GB figure doesn't take into account everything written up to the final snapshot at the end of the day. That's why I'm estimating 100 GB a month, which is 5 times my Acronis images. Either way, the lifespan of the cells themselves is way beyond plenty for any consumer user.

  16. #41
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    597
    Quote Originally Posted by audienceofone View Post
    EDIT: A quote from Michael Yang/ Samsung "when failures do occur, they typically occur in the controller silicon, not in the flash device itself."

    And another quote from Yang: "a pattern could be perpetually repeated in which a 64GB SSD is completely filled with data, erased, filled again, then erased again every hour of every day for years, and the user still wouldn’t reach the theoretical write limit. He added that if a failure ever does occur, it will not occur in the flash chip itself but in the controller."

  17. #42
    SLC
    Join Date
    Oct 2004
    Location
    Ottawa, Canada
    Posts
    2,795
    Quote Originally Posted by Telperion View Post
    I'm judging this information based on my Acronis True Image recovery partition. I have it set to run nightly incremental backups of my OS. It backs up everything that has changed (Been written) since the last image was created. Since April 18th, I only have 13 GB of backups. Extrapolating, that's roughly 20GB a month, generously. Factor in everything that might have been written but didn't change (Benchmarks, or whatever it may be), maybe 100GB in a month. That's still 3-4 GB a day. Lifespan shouldn't be a concern for most people.
Windows makes tiny writes every second. A single 1-byte write onto an untrimmed SSD = a 512KB write. One write operation per second, no matter how small, is 14.4GB per day if your computer is on for 8 hours. And that is if you did NOTHING on your computer.
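The arithmetic behind that figure (a sketch; decimal kilobytes assumed, and the 512 KB cost per minimal write is the post's own premise about untrimmed drives):

```python
# Background wear from one minimal write per second on an untrimmed
# SSD, where each write costs a full 512 KB of flash.
write_cost = 512 * 1000        # bytes actually written per 1-byte write
seconds_on = 8 * 60 * 60       # one write/second over 8 hours of uptime

total_bytes = write_cost * seconds_on
print(total_bytes / 10**9)     # ~14.7 GB/day, in the ballpark of the 14.4 GB quoted
```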

  18. #43
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Quote Originally Posted by annihilus View Post
It is hard to believe that, if you can kill an MLC drive in a couple of days, nobody has tried it yet:
    Someone has and it took a matter of days.

    http://translate.google.com/translat...istory_state0=


To put things in perspective, however, the author of JSMonitor, Dr. Lansen, has informed me that no one using JMF602-based SSDs (OCZ Core etc.) under normal conditions has reached the maximum erase count, but that is primarily because people have simply stopped using them due to their bad performance.

    He has also informed me that in the Japanese community several users reported that their MTRON-SSDs (SLC) had failed and this was due to controller failures, not the maximum erase count being exceeded.

    Dr. Lansen was kind enough to send me two white papers concerning why the controller might fail, one of which was produced by NASA (!)

It seems quite clear now that the limited warranties offered for SSDs have nothing to do with concerns about the maximum erase count. If anyone wants a copy of the white papers, send me a PM with your email address and I will send you copies.

  19. #44
    SLC
    Join Date
    Oct 2004
    Location
    Ottawa, Canada
    Posts
    2,795
Thanks for that link. Kind of confirms the whole 10k erase cycles figure is a bunch of crap. Well, not exactly; it is as I have said before - the 10k is for 98% of the cells. When a single cell fails, a whole block seems to get reallocated, so the life of the SSD essentially depends on the weakest cells. In the above case those cells lasted only around 2,700 cycles. NAND itself is basically all the same, so this goes for all MLC SSDs...

    Trim will help life expectancy quite a lot by reducing write amplification.

  20. #45
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Quote Originally Posted by One_Hertz View Post
Thanks for that link. Kind of confirms the whole 10k erase cycles figure is a bunch of crap. Well, not exactly; it is as I have said before - the 10k is for 98% of the cells. When a single cell fails, a whole block seems to get reallocated, so the life of the SSD essentially depends on the weakest cells. In the above case those cells lasted only around 2,700 cycles. NAND itself is basically all the same, so this goes for all MLC SSDs...

    Trim will help life expectancy quite a lot by reducing write amplification.
That is an interesting point, and maybe that is why Intel are so confident about endurance/low write amplification. Winslow states in a quote posted above: "We've figured out a way to program and erase just what is necessary (in small bytes rather than large blocks) so our drive has tremendous efficiency for the life of the computing environment."

Edit: I'm not sure if Winslow is making that statement based on the fact that Intel use a 128K block size as opposed to the 512K block size used by others. A 128K block erase for a 1K write is significantly better than a 512K erase, so maybe that is what he is referring to; but if he means that only part of the 128K block gets erased, that is even more significant.
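A sketch of the difference, under the simplifying assumption that the whole erase block must be erased and rewritten for each small write (real controllers batch and remap writes, so this is an upper bound):

```python
# Amplification of a small write under different erase-block sizes,
# in the naive whole-block read-modify-write model described above.
def erase_amplification(write_kb, block_kb):
    return block_kb / write_kb

print(erase_amplification(1, 128))   # 128.0 -> 128K blocks
print(erase_amplification(1, 512))   # 512.0 -> 512K blocks, 4x worse
```

So even in the naive model, the smaller erase block is a 4x advantage for 1K writes; partial-block erasing, if that is what Winslow means, would improve on that again.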
    Last edited by Ao1; 05-18-2009 at 03:31 AM.

