Raid 0 - Impact on NAND Writes



Ao1
11-05-2011, 07:01 AM
Here I experiment with RAID 0 to see the impact on writes to NAND with a 128K and a 4K stripe.

The Set Up
Frankenstein R0
Vertex 2 and Vertex 3
User capacity 74.5GiB
Test file size: 74.43 GiB (4MB block size)
Work load: 4K random full span 100% incompressible
Drive in steady state

Stripe: 128K. SMART readings before starting:

Vertex 2
E9 41216
EA 44672

Vertex 3
E9 41600
EA 41552

SMART readings after:

Vertex 2
E9 42112
EA 44864

Vertex 3
E9 41923
EA 41726

Amount of data written = 279,135.94MiB + 76,216.32MiB = 355,352.26MiB/ 347.02GiB

Nand Writes to Vertex 2 = 896GiB
Nand Writes to Vertex 3 = 323GiB
Total writes to Nand = 1,219GiB

Host Writes to Vertex 2 = 192GiB
Host Writes to Vertex 3 = 174GiB
Total writes to Host = 366GiB

Ratio between nand writes and host writes = 1:0.30
Speed at end of run = 10.03MB/s
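
For reference, this is roughly how the figures above fall out of the SMART deltas; a minimal sketch assuming the raw E9/EA values on these SandForce drives count GiB written to NAND and GiB written by the host respectively, which is how they appear to behave here (the helper name is just for illustration):

def wear_figures(e9_before, e9_after, ea_before, ea_after):
    nand = e9_after - e9_before   # GiB written to NAND (E9 delta)
    host = ea_after - ea_before   # GiB written by the host (EA delta)
    return nand, host

# 128K stripe run, values from the tables above
v2_nand, v2_host = wear_figures(41216, 42112, 44672, 44864)   # Vertex 2: 896, 192
v3_nand, v3_host = wear_figures(41600, 41923, 41552, 41726)   # Vertex 3: 323, 174

total_nand = v2_nand + v3_nand             # 1,219 GiB to NAND
total_host = v2_host + v3_host             # 366 GiB from the host
print(f"1:{total_host / total_nand:.2f}")  # 1:0.30, i.e. write amplification ~3.3x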

Stripe: 4K. SMART readings before starting:

Vertex 2
E9 42,112
EA 44,864

Vertex 3
E9 41,936
EA 41,734

SMART readings after:

Vertex 2
E9 43072
EA 44992

Vertex 3
E9 42,252
EA 41,903

Amount of data written = 268,862 MiB + 76,216.32MiB = 345,078.32MiB/ 336.99GiB

Nand Writes to Vertex 2 = 960GiB
Nand Writes to Vertex 3 = 316GiB
Total writes to Nand = 1,276GiB

Host Writes to Vertex 2 = 128GiB
Host Writes to Vertex 3 = 169GiB
Total writes to Host = 297GiB

Ratio between nand writes and host writes = 1:0.23
Speed at end of run = 12.75MiB/s

Stripe: 128K. 4K random full span, 50% OP. SMART readings before starting:

Vertex 2
E9 44,096
EA 45,760

Vertex 3
E9 43,131
EA 42,670

SMART readings after:

Vertex 2
E9 44,224
EA 45,888

Vertex 3
E9 43,268
EA 42,789

Amount of data written = 206,573.44MiB + 37,785.6MiB = 238.63GiB

Nand Writes to Vertex 2 = 128GiB
Nand Writes to Vertex 3 = 137GiB
Total writes to Nand = 265GiB

Host Writes to Vertex 2 = 128GiB
Host Writes to Vertex 3 = 119GiB
Total writes to Host = 247GiB

Ratio between nand writes and host writes = 1: 0.93
Speed at start of run = 48.95MB/s
Speed at end of run = 37.71MB/s

ASU Endurance App (client, not enterprise, workload). Drives SE’d. Stripe: 128K. SMART readings before starting:

Vertex 2
E9 43,392
EA 45,248

Vertex 3
E9 42,529
EA 42,170

SMART readings after:

Vertex 2
E9 43,648
EA 45,440

Vertex 3
E9 42,717
EA 42350

Amount of data written = 356.13GiB

Nand Writes to Vertex 2 = 256GiB
Nand Writes to Vertex 3 = 188GiB
Total writes to Nand = 444GiB

Host Writes to Vertex 2 = 192GiB
Host Writes to Vertex 3 = 180GiB
Total writes to Host = 372GiB

Ratio between nand writes and host writes = 1:0.84
Speed at end of run = 66.34 MiB/s


Single Drive - Vertex 2, 4K random full span. 14.65GB user space, 22.62GB OP. SMART readings before starting:

E9 44,224
EA 45,888

SMART readings after:

E9 44,480
EA 46,080


Amount of data written = 206,555.08MiB + 14,899.2MiB = 216.26GiB

Nand Writes to Vertex 2 = 256GiB
Host Writes to Vertex 2 = 192GiB

Ratio between nand writes and host writes = 1:0.75
Speed at end of run = 31.12 MB/s

Single Drive - Vertex 3, 4K random full span, no OP. SMART readings before starting:

E9 43,268
EA 42,789

SMART readings after:

E9 44,010
EA 42,926

Amount of data written = 82,850.78MiB + 57,139.20MiB = 136.70GiB

Nand Writes to Vertex 3 = 742GiB
Host Writes to Vertex 3 = 137GiB

Ratio between nand writes and host writes = 1:0.18
Speed at end of run = 31.12 MB/s

Summary

http://img253.imageshack.us/img253/2717/graphfinal.png (http://imageshack.us/photo/my-images/253/graphfinal.png/)

TRIM is not effective if you have a 4K random full span work load.

Over provisioning had a significant impact on reducing wear whilst at the same time increasing write speeds.

With an R0 4K array it took around 431GB of writes for the drive to get to a steady state when running the ASU client workload. Write speeds progressively fell by 26%. (See post #20)

When running the 4K random write work load, write speeds dropped by 50% after 2 minutes and started to fluctuate. After a further 4 minutes write speeds dropped by 50% again and fluctuated more. (See post #42)

Read speeds were significantly impacted when the drive was in a degraded state (134MB/s on an R0 array). Steady state performance came out at around 290MB/s for reads.

Computurd
11-05-2011, 07:16 AM
Could that be throttling at all? Doubt it with the low usage so far though. Very interesting testing! subbed :)

Ao1
11-05-2011, 09:48 AM
100,000 MiB and write speed is now 8.34MB/s. (I don't think it has anything to do with LTT).

Computurd
11-05-2011, 10:20 AM
wow that is terrible tbh. Now, do you feel that might have something to do with the inability of SF to handle incompressible data effectively, or is it just reflective of results with most SSDs?

There is an effect on the performance of the write speed of typical SSDs during long testing scenarios with writes; that is where enterprise devices shine, as they are truly optimized for it. Here is an article, and a snip, that I have had in my read list for a while now. I am going to do some variant of this type of testing for a future review of lightning 6Gb/s devices.

This is with single devices, the raid array responses will be vastly different of course.
www.itworld.com/storage/187659/review-enterprise-solid-state-drives

http://i517.photobucket.com/albums/u332/paullie1/zzzz-2.png

Ao1
11-05-2011, 10:41 AM
I’m using a special enterprise version of Anvil’s app. :up: The write speed seems to have levelled out now. This is a torture test for sure. Without TRIM or idle time the drive is incurring the max read-modify-write penalty. A single drive with TRIM did not behave this way.

I don’t know if I could bring myself to inflict this kind of test on my Intel’s. On the other hand it would be interesting to see how they performed in a R0 set up.

The real reason to do this however is to see how many writes end up on each SSD. When I’m done with 4K I’ll run the normal version of Anvil’s app to see if a mainly sequential load makes any difference.

Anvil
11-05-2011, 01:46 PM
There will be an option to fill the drive using 4KB blocks (instead of 4MB) and that really makes a difference for some of the drives.

Ao1
11-05-2011, 01:52 PM
Write speed has improved a little. It's now 10.08MB/s with 268,750MiB of data. As it is taking so long I'm tempted to break up the array and check the SMART values.

Anvil
11-05-2011, 01:55 PM
That would be interesting :)

Please do a HDTune Benchmark using 4MB block size, seq reads should have taken quite the dive vs in fresh state.

Ao1
11-05-2011, 01:58 PM
Any special settings on HD Tune? Full or partial? Fast or accurate?

Anvil
11-05-2011, 02:01 PM
Just the 4MB block size (found in settings) and a full span read test -> Benchmark tab

Ao1
11-05-2011, 02:12 PM
HD Tune said it would not run with an active partition so I deleted it but it still would not run. After reinstating the partition AS SSD gave me:
Write 14.06
Read 134.00 :shocked:

Going to break the array up now to check the writes

http://img513.imageshack.us/img513/150/assssd.png (http://imageshack.us/photo/my-images/513/assssd.png/)

http://img692.imageshack.us/img692/5069/asup.png (http://imageshack.us/photo/my-images/692/asup.png/)

Anvil
11-05-2011, 02:23 PM
The reads are just as important (if not more so) as the writes and the test shows that it's highly degraded, I'd say 134MB/s is as expected from such an exercise. (unfortunately)

HDTune won't let you run a write test with an active partition but the read test should work, otherwise there is something seriously wrong.

edit:
Next time you should try deleting the file from the Endurance app and then fill the drive using 4MB blocks and then do a HDTune read test. It should show how quickly it restores performance, reads in particular.

Ao1
11-05-2011, 02:48 PM
First post updated with results. 4K stripe coming up next (after a couple of HD Tune read benchmarks).

some_one
11-05-2011, 06:19 PM
As it is taking so long I'm tempted to break up the array and check the SMART values
Can you not read SMART while in RAID0?

Ao1
11-06-2011, 02:25 AM
Now running with a 4K stripe. I let the drive idle for a few hours and formatting the drive would have generated a TRIM operation, but the write speeds were still heavily degraded when I started the test. The sequential test file speed came out at 15.41MB/s (compared to 100.34MB/s when I ran the 128K stripe), although the 4K stripe seems to work a bit better with the 4K work load. 100,000MiB in and it's sitting at 13.75MB/s (compared to 8.34MB/s with the 128K stripe).

The graph shows the slowdowns that occur as the drive tries to clear a path ahead of writing.

When I’m done with this run I will S/E the drives and then see how long it takes to get to a fully degraded state.


Can you not read SMART while in RAID0?

Sadly no and I can’t find a way to get around it.

http://img35.imageshack.us/img35/4043/asut.png (http://imageshack.us/photo/my-images/35/asut.png/)

http://img403.imageshack.us/img403/8116/unledbxt.png (http://imageshack.us/photo/my-images/403/unledbxt.png/)

some_one
11-06-2011, 06:32 AM
Have you tried smartmontools (http://sourceforge.net/projects/smartmontools/files/smartmontools/5.42/smartmontools-5.42-1.win32-setup.exe.md5/download)?

What does running "smartctl --scan" show?

Ao1
11-06-2011, 06:33 AM
First post updated. The 4K stripe was marginally faster, but the ratio between nand/host writes was worse, which may be attributable to the fact that the drive was in a degraded state on this run.

Drives S/E’d and running 4K random to see how long it takes for the drive to degrade. (It started to degrade after 5 minutes. Currently 227 seconds in and avg write speeds are at 14MB/s.)

Ao1
11-06-2011, 06:44 AM
OK, disks secure erased, reset in raid 0 with a 4K stripe.

Avg to fill the drive = 144.37MiB/s.

Fully degraded state within 362.45 seconds & 24.15GiB (+ test file 74.43GiB)

http://img210.imageshack.us/img210/9773/asu.png (http://imageshack.us/photo/my-images/210/asu.png/)

sergiu
11-06-2011, 08:09 AM
Interesting tests, yet somehow predictable. 4K writes full span with no static data should lead to WA around 10-20 and maybe even higher degradation than what you see. For such a scenario, only larger spare area should help. I believe trim is useless, because there would be no complete free block to be cleared. You could do SE and decrease partition size after to simulate extra spare area.

Ao1
11-06-2011, 08:11 AM
ASU Standard Endurance Test 12GiB free. 100% incompressible.
Total Time: 1.25 hours
R 0 4K Stripe. MB/s from fresh to steady state

http://img600.imageshack.us/img600/3976/unledwzl.png (http://imageshack.us/photo/my-images/600/unledwzl.png/)

Ao1
11-06-2011, 08:19 AM
Interesting tests, yet somehow predictable. 4K writes full span with no static data should lead to WA around 10-20 and maybe even higher degradation than what you see. For such a scenario, only larger spare area should help. I believe trim is useless, because there would be no complete free block to be cleared. You could do SE and decrease partition size after to simulate extra spare area.

I think there might have been some static data when I ran the same test on a single drive. Can’t remember but will check later, however the single drive with trim performed significantly better. Avg MB/s after 0.7TiB of data = 28 compared to 13 with a 4K stripe.
I'm running the same tests with the normal version of the endurance app now so let’s see what happens then.

Ao1
11-06-2011, 08:27 AM
Anvil, here is a HD Tune run straight after the endurance app run above. (4MB)

http://img249.imageshack.us/img249/5910/unledkx.png (http://imageshack.us/photo/my-images/249/unledkx.png/)

CedricFP
11-06-2011, 08:33 AM
Could someone put the results that have so far been observed into more layman terms? I'm having a bit of trouble following.

Ao1
11-06-2011, 09:07 AM
Could someone put the results that have so far been observed into more layman terms? I'm having a bit of trouble following.

Moved to 1st post

sergiu
11-06-2011, 10:20 AM
I think there might have been some static data when I ran the same test on a single drive. Can’t remember but will check later, however the single drive with trim performed significantly better. Avg MB/s after 0.7TiB of data = 28 compared to 13 with a 4K stripe.
I'm running the same tests with the normal version of the endurance app now so let’s see what happens then.
With static data around, ratio between spare area and usable area for 4K full span is higher which would lead also to better performance. Trimming might have helped because of the file size (4MiB which is equal to 512 or 1024 pages) possibly at the cost of higher WA. But for 100% 4K writes full span where drive would be seen as a single file in which pages are constantly overwritten, I cannot see how trim will help, as each overwritten page would free only one. But who knows, maybe with trim help, wear leveling algorithm would choose better candidates for erasing.

Ao1
11-06-2011, 10:29 AM
New results based on ASU Client added to first post.

Write distribution between the drives is a lot better, but both drives are still incurring writes in excess of the amount of data written.

Also it seems that after reaching a steady state there is a further degraded state down the line – Avg 66.34MB/s = write speeds 50% below steady state after writing ~300GiB.

http://img215.imageshack.us/img215/7002/unledjpp.png (http://imageshack.us/photo/my-images/215/unledjpp.png/)

Ao1
11-06-2011, 10:31 AM
With static data around, ratio between spare area and usable area for 4K full span is higher which would lead also to better performance. Trimming might have helped because of the file size (4MiB which is equal to 512 or 1024 pages) possibly at the cost of higher WA. But for 100% 4K writes full span where drive would be seen as a single file in which pages are constantly overwritten, I cannot see how trim will help, as each overwritten page would free only one. But who knows, maybe with trim help, wear leveling algorithm would choose better candidates for erasing.

I'm going to re-test a single drive later. I believe it might have had 50% static data on it.

Anvil
11-06-2011, 10:57 AM
I'll send you a link to a slightly adjusted app.
(some minor adjustments)

Ao1
11-06-2011, 12:42 PM
With static data around, ratio between spare area and usable area for 4K full span is higher which would lead also to better performance. Trimming might have helped because of the file size (4MiB which is equal to 512 or 1024 pages) possibly at the cost of higher WA. But for 100% 4K writes full span where drive would be seen as a single file in which pages are constantly overwritten, I cannot see how trim will help, as each overwritten page would free only one. But who knows, maybe with trim help, wear leveling algorithm would choose better candidates for erasing.

You are right. :up: I'm running the enterprise app (4K random) on the V2 & V3 as single drives with no static data and the drives are degrading just as badly as the R0 array. I'm down to 13MB/s on both drives after only 10GB of data

edit. lol, after 16GB I'm down to 10MB/s on both drives. Worse than R0. Maybe with R0 the writes are getting cached. I'm going to create a R0 array but this time I'll set aside 50% for OP.

sergiu
11-06-2011, 01:12 PM
Let it run for a few hundred GB, as I have the feeling the speed will stabilize at an even lower value. Also, what is the WA value? For the WA, please take the readings after write speed stabilization, as I believe instantaneous WA increases as write speed decreases. With 50% OP you should see a huge difference in write speed, but what would be more interesting to follow is average/max write latency between those states.

Ao1
11-06-2011, 01:13 PM
Stroll on. R0 128K stripe 50% OP 4K random "full" span

http://img638.imageshack.us/img638/8763/unledwyi.png (http://imageshack.us/photo/my-images/638/unledwyi.png/)

Ao1
11-06-2011, 01:16 PM
^ I'll go back to the single drives later as they will take forever at that speed. (And I'll monitor avg/max latency as well. I already know it's going to get very high.)

R0 128K stripe 50% OP 4K random "full" span

http://img810.imageshack.us/img810/1671/unlednioc.png (http://imageshack.us/photo/my-images/810/unlednioc.png/)




sergiu
11-06-2011, 02:45 PM
What QD is used for generating 4K writes? If it is 1, then it would be also interesting to see what happens with higher QD values, although I suspect aggregated speed will not increase

Ao1
11-06-2011, 02:53 PM
First post updated. Using 50% OP transformed the outcome. Hard to believe it had that much impact.

http://img20.imageshack.us/img20/3344/endd.png (http://imageshack.us/photo/my-images/20/endd.png/)


@ Sergiu I believe it is all qd1

Ao1
11-07-2011, 02:09 AM
Vertex 2 - 4K random - 14.65GB user space, 22.62GB OP

http://img9.imageshack.us/img9/1093/86703493.png (http://imageshack.us/photo/my-images/9/86703493.png/)

Meanwhile the Vertex 3 is struggling. 4K random – no OP

http://img42.imageshack.us/img42/9081/32084915.png (http://imageshack.us/photo/my-images/42/32084915.png/)

sergiu
11-07-2011, 03:30 AM
At least it is still much faster than a RAID array based on normal HDDs. Yet the difference is impressive. Luckily there is no "normal" usage pattern that would lead to something like that.

alfaunits
11-07-2011, 03:39 AM
OT: I'd be quite worried about "technological progress" if a newer (Vertex 3) SSD of the same company (OCZ) is outperformed heavily by an earlier generation SSD (Vertex 2).

Ao1
11-07-2011, 03:46 AM
OT: I'd be quite worried about "technological progress" if a newer (Vertex 3) SSD of the same company (OCZ) is outperformed heavily by an earlier generation SSD (Vertex 2).

It’s not. The difference is over provisioning. I can't believe how much difference it makes. It’s as if the work load was converted to sequential. Now I can see why TRIM for raid is not at the top of developers' lists. With a random work load it makes no difference. With raid you get cache to help alleviate random writes, plus with OP it doesn’t matter anyway.

I’m finishing off the 4K random on the V3 and then I will run again but with static data. Let’s see how good the wear levelling is. From initial testing it’s not as good as OP, but it’s not far behind.

Ao1
11-07-2011, 04:08 AM
I had to reboot before I got to 200GB :( First post updated.

http://img210.imageshack.us/img210/3084/unledty.png (http://imageshack.us/photo/my-images/210/unledty.png/)

sergiu
11-07-2011, 04:41 AM
Wow... so speed dropped even further. Could you look for WA for next 50-100GB? Now write speed should have been stabilized. Also, there might be a difference in speed in favour of Vertex 2 because of NAND geometry in worst case scenario. Having 512KB blocks instead of 2MB ones (or 1MB) might help a little when recycling. However these might be negated by advancements in controller and better NAND write speed.

Ao1
11-07-2011, 05:21 AM
V2 with ~50% static data. So far holding out better when compared to no static data, but not as well as OP.

http://img97.imageshack.us/img97/650/unledbtrr.png (http://imageshack.us/photo/my-images/97/unledbtrr.png/)

Ao1
11-07-2011, 06:19 AM
The graph is lacking in granularity, but it was the only way I could find to show the stepped drop in performance that occurs with 4K random (Vertex 3, no OP, full span on a SE’d drive). Write speeds are consistent for the first 2 minutes and then they drop by around 50% and fluctuate below that level. After 6 minutes write speeds fluctuate considerably. On avg I’d say another 50% drop.

http://img405.imageshack.us/img405/6473/unledfqz.png (http://imageshack.us/photo/my-images/405/unledfqz.png/)

Ao1
11-07-2011, 06:27 AM
Here is a direct comparison between a drive with static data and an empty drive. (Tests started around the same time.) Having static data certainly helps, but it’s not as good as OP.

http://img804.imageshack.us/img804/8857/unledzhn.png (http://imageshack.us/photo/my-images/804/unledzhn.png/)

Here are the response times of the V3 taken as the drive started to degrade. 83.44% of writes below 10MB/s and 41.28% of (4K) xfers between 500us and 1ms. I’ll check if I get time but I think the V3 is much better with response times compared to the V2.

http://img267.imageshack.us/img267/4620/10262643.png (http://imageshack.us/photo/my-images/267/10262643.png/)

Anvil
11-07-2011, 09:05 AM
Getting really interesting :)

You should be able to read SMART even if the drives are in raid, I can read my m4's using the Intel SSD Toolbox, will try a small array of SF drives later.
(the question is will all attributes display or only the ones that Intel uses?)

Ao1
11-07-2011, 09:53 AM
So with 50% static data long term performance is double and writes to nand are reduced. I'm wondering how this fits into what could be observed with the B1 values on SymbiosVyse's drive (that had no static data).

http://img171.imageshack.us/img171/6109/unledjrt.png (http://imageshack.us/photo/my-images/171/unledjrt.png/)


Getting really interesting :)

Thanks Anvil, this has been quite hard work. (Time intensive.) It's been a bit of an eye opener for me though, so I'm glad I did it.

Ao1
11-07-2011, 12:09 PM
Wow... so speed dropped even further. Could you look for WA for next 50-100GB? Now write speed should have been stabilized. Also, there might be a difference in speed in favour of Vertex 2 because of NAND geometry in worst case scenario. Having 512KB blocks instead of 2MB ones (or 1MB) might help a little when recycling. However these might be negated by advancements in controller and better NAND write speed.

There you go. (Ignore the elapsed time as I had to pause briefly a few times.).
Ratio between nand writes and host writes = 1:0.148 (nasty)

http://img190.imageshack.us/img190/8123/unlediiw.png (http://imageshack.us/photo/my-images/190/unlediiw.png/)

Ao1
11-07-2011, 12:16 PM
The reads are just as important (if not more so) as the writes and the test shows that it's highly degraded, I'd say 134MB/s is as expected from such an exercise. (unfortunately)

HDTune won't let you run a write test with an active partition but the read test should work, otherwise there is something seriously wrong.

edit:
Next time you should try deleting the file from the Endurance app and then fill the drive using 4MB blocks and then do a HDTune read test. It should show how quickly it restores performance, reads in particular.

This is on a single V3. When I stopped the 4K writing I deleted the test file and started up HD Tune. Looks like a TRIM operation cleaned up the drive and allowed read speeds to return.

http://img819.imageshack.us/img819/1926/90845653.png (http://imageshack.us/photo/my-images/819/90845653.png/)

http://img834.imageshack.us/img834/7138/hiom.png (http://imageshack.us/photo/my-images/834/hiom.png/)

(Edit: Note only one IOP for the TRIM command. That is why the drive appears to hang. The whole drive is cleaned up in one go.)

Anvil
11-07-2011, 12:25 PM
It cleans up nicely, 20-21 seconds isn't that bad for a full TRIM.

I expect this is on a 3Gb/s controller?
6Gb/s will probably show marginally better results.

Ao1
11-07-2011, 12:29 PM
Yep it’s a 3Gb/s controller. Still for 4k, bandwidth limits are not such a problem :D (Who knows for how much longer. Maybe it's not so far away).
I was going to update to the X79, but I’m not so sure now. Might wait another gen. :rofl:

sergiu
11-07-2011, 12:50 PM
There you go. (Ignore the elapsed time as I had to pause briefly a few times.).
Ratio between nand writes and host writes = 1:0.148 (nasty)

That's quite a good ratio for such a test, much better than what I expected! I would have expected something like 1:0.1-1:0.03. But most probably you would need to know the fine details of the wear leveling algorithm to design something against it.

Anvil
11-07-2011, 12:55 PM
The X79 refresh might be the one to wait for :)

I just checked and the Intel SSD Toolbox does read the attributes correctly while the drives are in raid.

As Intel counts every value in F1/F2 as 32MB you need to convert the value shown by the Toolbox.
In my case F1 shows as 22GB and multiplied by 32 it equals 704 which is the value shown by CDI and OCZ Toolbox.

http://www.ssdaddict.com/ss/smart_V2.JPG
http://www.ssdaddict.com/ss/smart_v2_cdi.JPG
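
For reference, a quick sketch of the conversion described above (assuming, as the 22GB / 704 example implies, that the Toolbox simply interprets each raw F1/F2 count as a 32MB unit; the helper name is just for illustration):

def toolbox_display_to_raw(display_gb):
    # The Intel Toolbox treats each raw F1/F2 count as 32MB, so multiply the
    # GB figure it displays by 32 to recover the raw value CDI / OCZ Toolbox show.
    return display_gb * 32   # 1GB / 32MB = 32 units per GB

print(toolbox_display_to_raw(22))   # 704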

Ao1
11-07-2011, 01:23 PM
Here is a chart to summarise the results of testing the ratio between Nand & Host Writes

http://img253.imageshack.us/img253/2717/graphfinal.png (http://imageshack.us/photo/my-images/253/graphfinal.png/)

Anvil
11-07-2011, 01:32 PM
That chart shows exactly why Intel focuses on over-provisioning for Enterprise applications :)

(except for the last entry, what's the difference between the last 2 entries?)

Ao1
11-07-2011, 01:36 PM
^ fixed

Anvil
11-07-2011, 01:51 PM
That explains it (no OP)

Now, if you only had more drives for testing purposes, where can we find some?
In particular the Intel controller and the Marvell controller would have been very interesting vs the SF controller.
(also, the new Samsung controller looks strong)

some_one
11-07-2011, 07:04 PM
I just checked and the Intel SSD Toolbox does read the attributes correctly while the drives are in raid.


I did suggest a few posts back trying smartmontools; I don't know if anyone tried it. Although I'm not that familiar with smartmontools, as it's easier for me to use my own very untidy software to read SMART through RAID, it does seem to work okay on my RAID system, and it also supports logging SMART values to a file at intervals with the proper command line / config file.

Ao1
11-08-2011, 02:57 AM
Intel’s toolbox can indeed read the SMART values for OCZ drives in raid and the values appear the same as what the OCZ toolbox or any other SMART reader would present. :cool: I wasn’t too worried about reading the array as I had to break it up to SE the drives between different tests. That said it’s very handy to be able to read values without splitting the array. I'm working on read speeds now.

Ao1
11-08-2011, 03:54 AM
R0 128K stripe. User capacity reduced from 74.5GiB to 49.9GB to provide ~25GB of OP
4K full span writes – 146,725 MiB
Delete test file (4MB blocks)
Run HD Tune
Now refilling with 4MB blocks to recheck read speeds.

http://img88.imageshack.us/img88/6812/unledaia.png (http://imageshack.us/photo/my-images/88/unledaia.png/)

Ao1
11-08-2011, 04:24 AM
Refilled array with 4MB blocks. Deleted and then re-ran HD Tune.
Write speeds to fill the array with 4MB blocks came out around ~39MB/s, which is very slow compared to 144.48MiB/s when the drives are in a fresh state.

http://img573.imageshack.us/img573/6121/unleddsx.png (http://imageshack.us/photo/my-images/573/unleddsx.png/)

Ao1
11-08-2011, 04:40 AM
3rd time - Refill array with 4MB blocks. Delete and then re-run HD Tune.
Write speeds to fill the array with 4MB blocks came out around ~80MB/s.

http://img804.imageshack.us/img804/7383/unledpv.png (http://imageshack.us/photo/my-images/804/unledpv.png/)

Ao1
11-08-2011, 04:55 AM
4th time - Refill array with 4MB blocks. Delete and then re-run HD Tune.
Write speeds to fill the array with 4MB blocks came out around ~102MB/s.

http://img337.imageshack.us/img337/5953/unledls.png
(http://imageshack.us/photo/my-images/337/unledls.png/)

Ao1
11-08-2011, 04:56 AM
5th time – Seems to have got as good as it can get.
Write speeds to fill the array with 4MB blocks came out around ~98MB/s.

http://img23.imageshack.us/img23/8071/unledcyt.png (http://imageshack.us/photo/my-images/23/unledcyt.png/)

Ao1
11-08-2011, 05:17 AM
R0 just after a SE. Read speed degradation sucks with SF drives.

http://img69.imageshack.us/img69/4769/unledspc.png (http://imageshack.us/photo/my-images/69/unledspc.png/)

squick3n
11-08-2011, 06:32 AM
Read speed degradation: Is that b/c of the lack of TRIM, or just the nature of SF drives?

Ao1
11-08-2011, 06:50 AM
It’s the nature of SF drives. With a single drive read speeds were reinstated more or less immediately after a TRIM operation (see post # 47).

Writing large blocks (4MB) helps to clear up fragmented blocks, but as can be seen it does not get the drive back to optimum performance.

I thought R0 would be bad for writes, but it’s actually beneficial if working with a highly random work load, especially if you have OP. Reads on the other hand are not good at all. That is just a quirk however with SF drives.

squick3n
11-08-2011, 07:05 AM
Thanks for all your work and clarification. I guess that is why the OCZforums are such proponents of secure erasing drives in an array.

some_one
11-08-2011, 08:41 AM
Ao1, what data did the 4MB blocks that you wrote consist of?

Ao1
11-08-2011, 09:16 AM
I used Anvil's Enterprise App to generate the 4MB blocks, which filled the drive before being deleted. I believe the blocks were incompressible. Anvil will be able to elaborate more.

some_one
11-08-2011, 10:05 AM
Thanks.


R0 just after a SE. Read speed degradation sucks with SF drives.

If I understand correctly, after SE or TRIM you're not actually reading any media during reads to those pages/blocks that were SE'd or trimmed; instead you're getting DRAT, although on my own SF system it seems like ZRAT, but according to the ATA IDENTIFY data ZRAT isn't supported. Maybe they changed the spec or it was never implemented in the firmware, idk.

So idk if in that case I would call it degradation as you're not reading anything from the media. I was hoping to see if a forced TRIM could be accomplished for RAID 0, but after thinking about it maybe just a zero fill would be sufficient. Since the compression ratio for zero fill is ~7:1, filling up the deleted clusters of the array with zero-fill data should give back some of the media to GC. For instance, if you were to fill your 80GB array with zeros it may only fill ~12GB of media, leaving the other 68GB to be returned to GC. How long GC takes, idk. What do you think?

DRAT - Deterministic Return After Trim
ZRAT - Zero Return After Trim
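
A back-of-the-envelope version of that zero-fill idea, assuming the ~7:1 compression ratio quoted above (the real ratio is firmware-dependent, so treat the numbers as illustrative):

def media_used(host_gb, compression_ratio=7.0):
    # Estimated flash actually consumed when writing highly compressible data.
    return host_gb / compression_ratio

filled = media_used(80)        # ~11.4 GB of media written for 80GB of zeros
freed_for_gc = 80 - filled     # ~68.6 GB potentially handed back to the free pool
print(round(filled, 1), round(freed_for_gc, 1))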

Ao1
11-08-2011, 01:30 PM
^ Let’s see :)

R0 with no OP. 12GB of 4K random writes over 5 minutes. That takes the controller from a fresh state to the 1st stage of degradation. (4K write speeds were still at 35.87MB/s)

Here is a HD Tune shot straight after the writes. Now I will leave the drive to idle for 24 power on hours and then I will re-run HD Tune.

http://img23.imageshack.us/img23/7889/unledjja.png (http://imageshack.us/photo/my-images/23/unledjja.png/)

Anvil
11-08-2011, 04:02 PM
What read speed do you get if you just fill the drive with 4MB blocks of incompressible data?

@some_one
The compression used is user selectable, the same settings are used throughout the application.

some_one
11-08-2011, 11:09 PM
I did a Secure Erase and created one 90GiB volume on the RAID 0 array with two partitions. W7 installed on the first partition and W8 on the second. HDTune scale is in GB not GiB.

http://img40.imageshack.us/img40/2226/seinst.png


Next I created 3x 1GiB 0-fill files from the W8 OS and wrote them to the W7 partition. Notice the difference in read speed at the LBA locations for those files, which is the same whether the files are deleted or not. As there is no TRIM pass-through to the array, the flash media has to retain the data for those LBAs. If TRIM worked I would expect those LBAs to return to 900MB/s on read if there are no writes to them.

http://img15.imageshack.us/img15/6342/3g0fill.png


I also ran 9x 1GiB FF-fill files, similar results.

http://img444.imageshack.us/img444/4838/9gffill.png


Should be interesting to see your test of write degradation and if filling the volume with 0-fill data does free up media to be used in the free pool.
Write speeds might start off high then drop off as the free media is depleted. I guess with a bigger pool of free media that will increase the time before degradation takes place. On a normal system that would mean longer bursts of high write speed with idle time used to replenish the pool. How much that pool can be replenished will depend on how many different clusters have been dirtied (written to) and the compression ratio.

Well, the theory is one thing but I wonder if it is like that in practice. :shrug:

Anvil
11-09-2011, 03:16 AM
While working on this part of the utility I played with different controllers and from what I recall the m4 cleaned up nicely (in raid) just by writing sequentially.
I'll try to find the screen-shots later today.

Ao1
11-09-2011, 06:46 AM
So no one spotted my deliberate mistake? I benched one of my X25-M’s by mistake in post #70. :shakes:

Fresh start. SE’d drives, R0. No OP

HD Tune straight after a SE with no data on the drive.
http://img683.imageshack.us/img683/2898/37632256.png (http://imageshack.us/photo/my-images/683/37632256.png/)

HD Tune straight after filling the drive with 4MB blocks. :shocked: Wow, didn’t expect that. A 47% drop in read performance after writing to all LBA’s once with 4MB blocks.
http://img685.imageshack.us/img685/5523/13934114.png (http://imageshack.us/photo/my-images/685/13934114.png/)

HD Tune straight after 14,014MiB of 4K random writes
http://img607.imageshack.us/img607/1268/47824776.png (http://imageshack.us/photo/my-images/607/47824776.png/)

Delete test file. (Drive is now all free space.)
http://img411.imageshack.us/img411/1107/24913536.png (http://imageshack.us/photo/my-images/411/24913536.png/)

Now I will run HD Tune every hour to see how quickly GC works.

Ao1
11-09-2011, 07:48 AM
One hour in. Absolutely no difference in the benchmark.

Anvil
11-09-2011, 07:56 AM
I noticed the Intel but didn't think it was a mistake?

Is this a SF raid or what?
If it's SF then it can take hours, it might need to idle overnight for anything to happen, if it's Intel then I expect nothing to happen.

Are you using the app I sent you a few days ago?

Ao1
11-09-2011, 08:05 AM
I only benched the X25-M once by mistake, which was right after I got the SF R0 array prepared to monitor GC. It’s a bit confusing as the R0 array is labelled as Intel Raid 0, but it is built with SF drives.

I’m using the first version of the Endurance app you sent me, just so I have a comparative workload, but when I’ve finished monitoring GC I will play with the newer version.

some_one
11-09-2011, 08:30 AM
Okay, I'm a little confused about what is expected to happen with GC. Once you've written to all LBAs, that data will stay there until overwritten, as there is no TRIM available at this time when using a RAID array. There might be a few pages shuffled around but I wouldn't expect to see any big changes.

Anvil
11-09-2011, 11:19 AM
some_one

As the drive is clean (all files deleted) one would expect GC to restore the speed, most of it if not all, if there is active GC; it might need to idle at some C-state, but that shouldn't be required in raid mode.
If there is data then nothing will happen, except if there are partial pages and cleaning (GC) combines several partial pages...

Anvil
11-09-2011, 12:21 PM
A little experiment on my boot drive.
2R0 m4 128GB with 61GB of data.

First there is an unknown state, a few days of use since last time I tested so just ordinary usage.
It does clearly show where files are stored.
edit: (more correctly it shows where data has been stored, the plateau from 96GB-236GB has never been used)

http://www.ssdaddict.com/ss/m4/1_09-November-2011_17-34.png

Then there is a 25.6GiB sequential write which leaves the array in the following state.
So, only by writing the file the array looks to be cleaner.

http://www.ssdaddict.com/ss/m4/4_09-November-2011_17-44_test_after_new_testfile.png

60 seconds of random writes

http://www.ssdaddict.com/ss/m4/5_2R0_M4_128_32KB_236GB_1GB-20111109-1746_60sec_random_writes.png

leads to this, one can clearly spot where the random writes have been written

http://www.ssdaddict.com/ss/m4/6_09-November-2011_17-50_test_after_60sec_random_writes.png

So, I deleted the testfile (the 25.6GiB file) and wrote another ~25GiB to the array, the end result is like this. (no idle time)

http://www.ssdaddict.com/ss/m4/11_09-November-2011_18-03_test_25GB_created_and_deleted.png

The m4 does not reinstate full "speed" after such an exercise but it does clean up pretty well.

johnw
11-09-2011, 12:30 PM
There might be a few pages shuffled around but I wouldn't expect to see any big changes.

That's where you'd be wrong. Most SSDs have around 7% of flash available for controller use in excess of the advertised capacity. Even when the SSD LBAs are all used, GC still has about 7% of unused flash to collect together into larger chunks.
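
For what it's worth, the ~7% figure is commonly attributed to the gap between the binary GiB the raw flash comes in and the decimal GB the capacity is advertised in; a minimal sketch of that arithmetic (many drives reserve additional flash on top of this, so it's a floor, not an exact number for any given model):

gib = 2**30          # bytes in a binary gigabyte (how NAND is counted)
gb  = 10**9          # bytes in a decimal gigabyte (how capacity is advertised)

spare_fraction = (gib - gb) / gib
print(f"{spare_fraction:.1%}")   # ~6.9% of the flash, i.e. roughly 7%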

johnw
11-09-2011, 12:43 PM
A little experiment on my boot drive.
2R0 m4 128GB with 61GB of data.

First there is an unknown state, a few days of use since last time I tested so just ordinary usage.
It does clearly show where files are stored.


It is not clear to me. I see a transition at 47GiB, and another one at 96GiB.

Since you said it has 61GB of data (56.8 GiB), I would have expected the transition at around 56.8 GiB. Since it is R0, I would expect about half of that 56.8 GiB to be written to each SSD. But even so, I would not expect to see transitions at 47GiB and 96GiB, as your data shows.

I think it is clear that the 61GB of data is not all stored at the "beginning" (LBAs) of the SSDs. But it is not clear why there are two transitions in the read speed.

Am I missing something?

Ao1
11-09-2011, 12:44 PM
I found this on Wiki http://en.wikipedia.org/wiki/Garbage_collection_(SSD)#Garbage_collection

So far no real change.

“If the controller were to background garbage collect all of the spare blocks before it was absolutely necessary, new data written from the host could be written without having to move any data in advance, letting the performance operate at its peak speed. The trade-off is that some of those blocks of data are actually not needed by the host and will eventually be deleted, but the OS did not tell the controller this information. The result is that the soon-to-be-deleted data is rewritten to another location in the Flash memory increasing the write amplification. In some of the SSDs from OCZ the background garbage collection only clears up a small number of blocks then stops, thereby limiting the amount of excessive writes. Another solution is to have an efficient garbage collection system which can perform the necessary moves in parallel with the host writes. This solution is more effective in high write environments where the SSD is rarely idle. The SandForce SSD controllers and the systems from Violin Memory have this capability”.

Anvil
11-09-2011, 12:55 PM
It is not clear to me. I see a transition at 47GiB, and another one at 96GiB.
..
Am I missing something?

Not really, there was another VM on the array which is deleted, it makes up the difference between ~96GB and ~61GB.

The part from 0-96GB consists of ~14days of activity.
Meaning that there has never been written data beyond 96GB.

edit:
I have added some info to the original post.

johnw
11-09-2011, 01:24 PM
The part from 0-96GB consists of ~14days of activity.
Meaning that there has never been written data beyond 96GB.


Thanks, that makes sense now.

I like this method you guys have been using of testing READ speed with HD Tune. At first I was skeptical of what you could learn that way, but it seems to reveal some interesting differences in how various SSDs work. I was biased against HD Tune for SSDs because of Anand's WRITE tests with HD Tune (which I think does not provide useful information, since large full-span sequential writes are so rare in actual usage). But using HD Tune for READ tests, combined with ASU to write data in various semi-realistic usage patterns, looks to provide some useful information. Well done!

squick3n
11-09-2011, 02:26 PM
I know it is a Frankenstein RAID, but are your 4k reads (with OP) faster than the 4k reads on the single V3?

some_one
11-09-2011, 06:43 PM
That's where you'd be wrong. Most SSDs have around 7% of flash available for controller use in excess of the advertised capacity. Even when the SSD LBAs are all used, GC still has about 7% of unused flash to collect together into larger chunks.

Johnw, I am aware of drive provisioning. From Anvil's post (btw thanks for that Anvil) it seems to me that there is a performance penalty for accessing non-contiguous media pages and/or pages that exist on different blocks, i.e. although the LBAs on the OS side are contiguous, the mapping to the flash media is scattered all over the place.

Anvil's test of sequentially writing over the LBAs on the OS side will result in a new mapping of OS LBAs to media pages, and as the new mapping will be to the free contiguous pages of the blocks in the provisioning pool, this results in better read performance. The old blocks are then GC'd to replenish the provisioning pool.

With Ao1's test of 4K random writing, this will provide a new mapping for those pages that were written, with the old page mapping being de-allocated. This could leave some blocks with quite a few de-allocated pages, which GC can clean up; however, if the writes end up mapping whole new blocks filled entirely with valid pages, the mapping will still be left substantially fragmented. This is why I'm not expecting a big change. GC would effectively have to defrag those pages to bring back read performance, and I am not aware of that being one of GC's functions.

Using a RAID array it shouldn't matter if the file is deleted or not, but maybe as a test the file could be cut and saved somewhere and then copied back; the write could then possibly result in contiguous media allocation and improved read performance. Just a thought.*

I have very little experience with SSD's just having recently purchased them and that experience is limited to one model so if my thinking seems way off I hope you can understand and forgive me for any misconceptions I may have come to.

*EDIT: Might be better to physically read and write the file LBA region sequentially directly from/to the physical disk in case the windows copy changes the file cluster assignment.
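
To make the mapping argument a bit more concrete, here is a deliberately naive toy model of a logical-to-physical page map (nothing like the real SF firmware; the geometry, the append-only allocation and the block_switches metric are all invented for illustration). After a full-span 4K random overwrite, logically consecutive pages end up in almost entirely different blocks, which is exactly the fragmentation GC would have to defragment to bring sequential read speed back:

import random

PAGES_PER_BLOCK = 256        # assumed geometry: 4KiB pages, 1MiB erase blocks

class ToyFTL:
    """Minimal append-only logical-to-physical page map (no GC modelled)."""
    def __init__(self):
        self.mapping = {}    # logical page -> physical page
        self.next_phys = 0   # new writes always go to the next free physical page

    def write(self, logical_page):
        self.mapping[logical_page] = self.next_phys
        self.next_phys += 1  # the previously mapped page becomes stale

    def block_switches(self, logical_pages):
        # How often the physical block changes when the span is read in logical
        # order -- a crude proxy for how sequential the flash layout still is.
        blocks = [self.mapping[lp] // PAGES_PER_BLOCK for lp in logical_pages]
        return sum(1 for a, b in zip(blocks, blocks[1:]) if a != b)

ftl = ToyFTL()
span = list(range(25_600))                  # 100MiB worth of 4KiB pages

for lp in span:                             # initial sequential fill
    ftl.write(lp)
print("sequential fill :", ftl.block_switches(span))   # 99, one per block boundary

for lp in random.sample(span, len(span)):   # full-span 4K random overwrite
    ftl.write(lp)
print("random overwrite:", ftl.block_switches(span))   # ~25,000+, almost every step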

johnw
11-09-2011, 07:06 PM
It seems like the RAID part of many of these tests is just adding needless complexity. What is really being tested is the behavior of SSDs without TRIM. But that can be tested by simply turning TRIM off in the OS.

some_one
11-09-2011, 11:51 PM
First of all I created an 8GiB highly compressible 0-fill file.
Maybe a few clusters were written outside the lines, but IMO good enough.

http://img402.imageshack.us/img402/4475/8g0fill.png


Then 4k incompressible random writes to the 8GiB file followed by HDTune.

http://img6.imageshack.us/img6/1912/8g4k.png


Reading the first 4GiB of sectors of the file and writing them back sequentially to the same sectors followed by HDTune.

http://img16.imageshack.us/img16/5875/8g4grw.png


Probably easier to just take the difference between reading contiguous sequential writes and reading random writes to get an idea of performance degradation through fragmentation. Since fragmentation can happen on both the OS side and the media side it makes things interesting.

Ao1
11-10-2011, 12:08 AM
Nice one :) My drive has not changed one iota. I’ll leave it for a while longer and then I’ll try writing 4MB 0 fill blocks to see what happens.

Ao1
11-10-2011, 12:18 AM
It seems like the RAID part of many of these tests is just adding needless complexity. What is really being tested is the behavior of SSDs without TRIM. But that can be tested by simply turning TRIM off in the OS.

Perhaps true, and the thread is now deviating; however, as far as I know no one has looked at read performance degradation and how that relates to write patterns. As can be seen in post #47, TRIM restored read performance after a batch of 4K writes. In raid (no TRIM), filling the disk with large sequential xfers on a freshly SE’d array resulted in a 47% drop in read speeds.

Ao1
11-10-2011, 01:51 AM
OK, 12+ hours and not a stitch of movement.

http://img442.imageshack.us/img442/8876/unledax.png (http://imageshack.us/photo/my-images/442/unledax.png/)

4MB blocks, 0 fill. Delete files. Intriguing.

http://img155.imageshack.us/img155/4035/123ejw.png (http://imageshack.us/photo/my-images/155/123ejw.png/)


Write speeds are not fully restored however. Avg to fill array with 4MB blocks (non compressible) is 90MB/s

Anvil
11-10-2011, 02:05 AM
Interesting, but otoh one would expect read speed to increase as the data is no longer compressed.

some_one
11-10-2011, 02:06 AM
SF can be hard to read as there is so much variance between incompressible and highly compressible. Now that you have filled with 0's if you leave it for a while GC should be able to make use of all those media pages that have been freed. If you try filling it again later the write speed should be quicker.

Ao1
11-10-2011, 02:18 AM
Interesting, but otoh one would expect read speed to increase as the data is no longer compressed.

I deleted the 4MB 0 fill files that ASU generated before I ran HD Tune. Do you know how HD Tune undertakes the read benchmark? I'm running a write benchmark now and it looks like HD Tune uses highly compressible data for writes.

Anvil
11-10-2011, 02:21 AM
Thanks, that makes sense now.
...

HDTune for READ testing has always been useful, it's the only way I've found to visually detect changes while experimenting.


It seems like the RAID part of many of these tests is just adding needless complexity. What is really being tested is the behavior of SSDs without TRIM. But that can be tested by simply turning TRIM off in the OS.

I agree; having said that, one can't assume that the raid driver operates like the non-raid one, and the granularity of files adds a new dimension to the reads.
I'm quite sure we would have seen other artifacts if the tests were done on a different platform as write combining/caching is handled differently.
It would be interesting to see if WBC On/Off makes any difference for the result. (except for write speed)

Anvil
11-10-2011, 02:27 AM
I deleted the 4MB 0 fill files that ASU generated before I ran HD Tune. Do you know how HD Tune undertakes the read benchmark? I'm running a write benchmark now and it looks like HD Tune uses highly compressible data for writes.

Even if you deleted the file, the data referenced by those LBAs was previously easily compressible; it looks like the SF drives/metadata somehow reflect this.

You should check E9/F1 before and after a HDTune write test, I have never tried it for writes except for HDD testing.
I would expect HDTune to write either random data (essentially garbage from memory) or highly compressible data.

Ao1
11-10-2011, 02:42 AM
That was my conclusion. 0 fill would not have written to all of the nand, yet the read speeds were restored across the array, so it would appear that the metadata records are slowing things down.

Ao1
11-10-2011, 02:51 AM
I know it is a Frankenstein RAID, but are your 4k reads (with OP) faster than the 4k reads on the single V3?

I checked results with OP, no OP in R0 and compared to a single drive. As expected in a fresh/ steady state 4K results (QD1) don't vary by much.

johnw
11-10-2011, 09:59 AM
as far as I know no one has looked at read performance degradation and how that relates to write patterns.

Yes, I agree, you guys are doing some interesting experiments. I was just pointing out that, IMO, the most interesting data relates to the behavior of SSDs without TRIM, which could be tested similarly to what you are doing here but on a single SSD (no RAID) with TRIM turned off in the OS. The RAID component of the test may also be interesting, but it is kind of running before learning to walk. It is usually better to start experiments with less complexity, and then once you understand everything in the simpler situation (single SSD without TRIM), then add on more complexity (RAID).

sergiu
11-12-2011, 02:26 PM
I have viewed all the HDTune screenshots with a grain of salt, as the results seem to be quite odd. I ran the benchmark on a Vertex 2 and 3 (both OS drives) and the results were quite interesting:
Vertex 2 120GB (@SATA1, no TRIM, no OP): access time = 0.220 ms average speed: 102MB/s
Vertex 3 120GB (@SATA2, TRIM, no OP): access time = 0.362 ms average speed: 137MB/s

I took a look at how HDTune does the requests using diskmon and I found that it does the read test using sequential 64KB blocks, while the access time test uses single 512-byte sectors with unaligned addresses. Similar read speeds were easily reproduced with ASU by setting the random read block to 64KB and the thread count to 1. Given the conditions, non-Sandforce drives might have similar results.
From my point of view, HDTune tests are not relevant, as for sequential access the block size used by real-world applications is in the MB range. 64KB means 8 pages. Depending on how these were written, they might be spread across 1 to 8 dies, so the controller may or may not be able to reach all of them concurrently. This explains why after TRIM you see max speed: as there is nothing to read, you just get automatically generated zeroes.

Anvil
11-12-2011, 04:35 PM
You need to change the block size in HDTune, 64KB is default.

I never paid much attention to the other "benchmarks" in HDTune as they were made for HDD's and not SSD's.
(it still works pretty well for HDD's)

sergiu
11-12-2011, 06:00 PM
I hadn't even noticed it has an option for block size. I ran some tests at various block sizes and I believe it is seriously flawed. Here are the first read commands captured from diskmon:
Sector - Length
0 1
1 1024
1025 1024
2049 1024

During the benchmark, from time to time, it reads just one sector, and the sector address is shifted accordingly. The interesting part is that I really needed the maximum selectable block size (8MB) to achieve the most stable result.

johnw
11-12-2011, 06:20 PM
I hadn't even noticed it has an option for block size. I ran some tests at various block sizes and I believe it is seriously flawed. Here are the first read commands captured from diskmon:
Sector - Length
0 1
1 1024
1025 1024
2049 1024

During the benchmark, from time to time, it reads just one sector, and the sector address is shifted accordingly. The interesting part is that I really needed the maximum selectable block size (8MB) to achieve the most stable result.

Is it actually reading every LBA on the device? I had thought it would read a chunk, then skip a large chunk then read, skip, etc. Because reading every LBA on, say a 2TB HDD, would take a LONG time.

sergiu
11-12-2011, 06:37 PM
Normally it uses chunks. I selected full drive to see what is the first LBA read. However, both chunk/full drive result in shifted LBAs.

johnw
11-12-2011, 08:16 PM
Normally it uses chunks. I selected full drive to see what is the first LBA read. However, both chunk/full drive result in shifted LBAs.

I'm not sure 4KiB alignment makes a big difference when reading large chunks. If it reads 1024 consecutive 512B sectors, then if aligned on a 4KiB boundary the SSD would need to read 128 pages (4KiB each). If it is not aligned, then the SSD has to read 129 pages (one additional page). That is only 0.8% overhead.
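
The arithmetic behind that 0.8% figure, as a small sketch assuming 4KiB pages and 512-byte sectors as in the example above (pages_read is just a helper made up for illustration):

SECTOR = 512
PAGE = 4096

def pages_read(start_sector, n_sectors, page=PAGE, sector=SECTOR):
    # Number of flash pages the read spans, given its starting sector and length.
    first_byte = start_sector * sector
    last_byte = first_byte + n_sectors * sector - 1
    return last_byte // page - first_byte // page + 1

aligned   = pages_read(0, 1024)    # 128 pages
unaligned = pages_read(1, 1024)    # 129 pages
print(aligned, unaligned, f"{(unaligned / aligned - 1) * 100:.1f}% overhead")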

some_one
11-12-2011, 11:48 PM
That's a very astute observation sergiu, but I'd be inclined to agree with johnw here. Perhaps though we're looking at it too simply. What happens when you request a sector of 512bytes, does windows read a full cluster?

Ao1
11-13-2011, 01:46 AM
With HD Tune you can set the block size in the options tab. You can also set a partial or full test. I have used the 4MB block size throughout.

http://img12.imageshack.us/img12/8012/34185068.png (http://imageshack.us/photo/my-images/12/34185068.png/)

Currently my V2 is set as a single drive. I used ASU to run a read benchmark with a 1GB test file on the V2. Read speed = 143.46MB/s. I then ran a HD Tune benchmark. Avg read speed = 262.4MB/s, however the read speed at the location of the ASU test file came out at 135.6MB/s. :up:

http://img337.imageshack.us/img337/2402/76839845.png (http://imageshack.us/photo/my-images/337/76839845.png/)

Here I run ASU with a 32GB test file (4MB blocks) Read speed = 142.09MB/s. I then ran a HD Tune benchmark. Avg read speed = 155.5MB/s, the read speed at the location of the ASU test file came out at 134.3MB/s.

http://img827.imageshack.us/img827/1097/20580691.png (http://imageshack.us/photo/my-images/827/20580691.png/)

Ao1
11-13-2011, 02:05 AM
That's a very astute observation sergiu, but I'd be inclined to agree with johnw here. Perhaps though we're looking at it too simply. What happens when you request a sector of 512bytes, does windows read a full cluster?

For reads a page is the minimum amount that can be read. For an erase operation it is a block, so a read of 512b will equal a 4K or 8K read depending on nand geometry.
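
To put a number on that, the read amplification for a single 512-byte request is just page size over request size; a trivial sketch for the two page sizes mentioned above:

SECTOR = 512
for page_size in (4096, 8192):
    # The controller must fetch the whole page even though only one sector was asked for.
    print(f"{page_size} byte page -> {page_size // SECTOR}x read amplification")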

Ao1
11-13-2011, 02:10 AM
Same run now for the V3. Read speeds are not impacted in the same way as the V2.

http://img338.imageshack.us/img338/9540/77666445.png (http://imageshack.us/photo/my-images/338/77666445.png/)

Ao1
11-13-2011, 02:30 AM
V3. 4MB block read speeds after 6 minutes/ 10GB of 4K random writes. It looks like the V3 is much more resilient with 4MB sequential xfers, but 4K xfers still have a big impact on read speeds.

Min read speed = 53.8MB/s
Read speed at location of 4K write = ~150MB/s
Max read speed 253.2MB/s

http://img215.imageshack.us/img215/7171/71883335.png (http://imageshack.us/photo/my-images/215/71883335.png/)

For client work loads, which cannot be compressed that well by the SF controller, read/ write performance sucks.

sergiu
11-13-2011, 03:08 AM
I'm not sure 4KiB alignment makes a big difference when reading large chunks. If it reads 1024 consecutive 512B sectors, then if aligned on a 4KiB boundary the SSD would need to read 128 pages (4KiB each). If it is not aligned, then the SSD has to read 129 pages (one additional page). That is only 0.8% overhead.

For large blocks (>256K) it indeed does not matter, but for small ones, like the default 64K value, it counts for sure.

@Ao1
Filling the drive with random 4K data will force the drive to optimize the writes in such a way that reads would also do better as random reads. Have you tried filling the drive with movies and then reading it with HDTune (4MB blocks)? This should give you the maximum read speed that the drive can sustain for its configuration. And for a small drive, like the 60GB models, it would make sense to have smaller values. The question is whether this is significantly lower than what the competition can do with the same NAND.

Ao1
11-13-2011, 03:11 AM
V2. Drive SE’d and then filled with 4MB blocks using ASU. Run read with HD Tune:

http://img510.imageshack.us/img510/1503/51167666.png (http://imageshack.us/photo/my-images/510/51167666.png/)

Delete test file. Run read with HD Tune. It can be seen that the TRIM command executed just after the read operation started. As soon as it was complete, read speeds were restored. In comparison, the R0 array's read speeds did not restore, even after 4MB blocks were written several times.

http://img204.imageshack.us/img204/7373/24772528.png (http://imageshack.us/photo/my-images/204/24772528.png/)

It would be good to see how non-SF drives perform in comparison. I can only conclude that running SF drives in R0 is not a good move.

Ao1
11-13-2011, 03:15 AM
For large blocks (>256K) it indeed does not matter, but for small ones, like the default 64K value, it counts for sure.

@Ao1
Filling the drive with random 4K data will force the drive to optimize the writes in such a way that reads would also do better as random reads. Have you tried filling the drive with movies and then reading it with HDTune (4MB blocks)? This should give you the maximum read speed that the drive can sustain for its configuration. And for a small drive, like the 60GB models, it would make sense to have smaller values. The question is whether this is significantly lower than what the competition can do with the same NAND.

Let me test with real data. I will use a mixture of data to try and get a balance between compressible and non-compressible.

Anvil
11-13-2011, 03:19 AM
I've spent some time testing the array of Intel G1's I use for booting on one of the X58's

I was surprised at the outcome, will post a little later.

Ao1
11-13-2011, 03:43 AM
V2 Drive SE’d.

Copy Windows folder over = 19.8GB

It looks like the Windows folder can be compressed by around 50%.

http://img220.imageshack.us/img220/2464/39524745.png (http://imageshack.us/photo/my-images/220/39524745.png/)

Ao1
11-13-2011, 03:50 AM
Add My Videos folder = 11.1GB

It looks like the Videos folder could not be compressed.

http://img59.imageshack.us/img59/8164/85685747.png (http://imageshack.us/photo/my-images/59/85685747.png/)

Ao1
11-13-2011, 03:59 AM
Add My Documents folder = 4.91GB

It doesn’t look like it could be compressed by much.

http://img72.imageshack.us/img72/5716/90829982.png (http://imageshack.us/photo/my-images/72/90829982.png/)

Ao1
11-13-2011, 04:07 AM
For comparison, this is from an X25-M G2 160GB drive. It’s the same drive that I used to copy over the My Documents and My Videos folders. The drive has 50GB free of a formatted space of 149GB. This drive contains all my static data, which for the most part is not compressible. This drive has not been SE’d for a very long time.

http://img97.imageshack.us/img97/5218/x25m.png (http://imageshack.us/photo/my-images/97/x25m.png/)

sergiu
11-13-2011, 04:10 AM
Well... mystery solved: 130-140MB/s read speed for incompressible data on the 60GB model. That is all you can get from the drive, no matter whether it is in RAID or not. Now we only need to find out the values for the other models (120, 240 and 480GB), both V2 and V3.

Ao1
11-13-2011, 04:13 AM
And here is my other X25-M 160GB that I use for the OS. This drive was SE’d around a month ago. 95.4GB free of 148GB

http://img254.imageshack.us/img254/5512/x25mcdrive.png (http://imageshack.us/photo/my-images/254/x25mcdrive.png/)

Anvil
11-13-2011, 04:15 AM
This is before and after a Fill drive (using 46%) on the F3 used in the Endurance test

(before)
http://www.ssdaddict.com/ss/Force3/13-November-2011_01-48.png

(filled drive)
http://www.ssdaddict.com/ss/Force3/Corsair Force 3 SSD_120GB_1GB-20111113-0157.png

(after deleting file, performance restored)
http://www.ssdaddict.com/ss/Force3/13-November-2011_02-08.png

The degrading performance seen here is due to the random writes that are part of the loop.
All the random IO is performed on a single file, and deleting that file does not clean up the drive; this type of IO is hard to clean up after.

edit:
This kind of degraded performance can be seen on other drives as well, e.g. drives using the Intel controller.

Ao1
11-13-2011, 04:24 AM
It would be great to see a wider range of results; however, from post #118 the V2 came up with an average read speed of 192.7MB/s. The same (real life) data in post #119 on the X25-M returned an average read speed of 251.5MB/s.
The X25-M beats the V2 hands down when it comes to read speeds.

Ao1
11-13-2011, 04:57 AM
As can be seen in post #92, 4MB block writes using 0-fill do the same job as a TRIM operation with regard to restoring read performance. :up:

As could be seen from the compression tests (in other threads), SF compression is not that good for typical client workloads. The compression algorithm was most likely designed and optimised for enterprise workloads. The bottom line, as far as I can see, is that client compression is nothing like 50%. On average I’d guess it would be single-digit compression at best.

The Windows folder was compressed by around 50%, but even here, when the SF drive had been written to only once straight after an SE, the read speeds fluctuated wildly between 140MB/s and 260MB/s. I’d guess the fluctuation was due to the fact that not all of the files within the folder could be compressed by the same amount.
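For a rough feel of how compressible a given set of files is before putting it on the drive, a quick zlib pass gives a ballpark figure (a sketch only; the SF controller uses its own hardware compression, not zlib, so treat the result as an indicator rather than a prediction, and note that some files under a path like C:\Windows may be unreadable without admin rights):

import os, zlib

def estimate_compressibility(root, sample_bytes=1 << 20):
    # Compress the first 1 MiB of each file and report the overall space saving.
    raw = packed = 0
    for dirpath, _, files in os.walk(root):
        for name in files:
            try:
                with open(os.path.join(dirpath, name), "rb") as f:
                    chunk = f.read(sample_bytes)
            except OSError:
                continue                      # skip locked/unreadable files
            if not chunk:
                continue                      # ignore empty files
            raw += len(chunk)
            packed += len(zlib.compress(chunk, 1))   # fast level, closer to on-the-fly compression
    return 100.0 * (1 - packed / raw) if raw else 0.0

saving = estimate_compressibility(r"C:\Windows")
print(f"~{saving:.0f}% space saved")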

sergiu
11-13-2011, 05:47 AM
Some tests on what I have access to:

V3 120GB @ SATA3 - light server: Average 501MB/s, minimum 431MB/s
V3 240GB @ SATA3 - light server: Average 491MB/s, minimum 438MB/s
V3 120GB @ SATA2 - desktop, with 17GB of static incompressible data: Average 262MB/s, minimum 237MB/s

From my tests, I would say the V3 120GB models can sustain at least 262MB/s read speed with incompressible data, but no more than 450MB/s. The 240GB model had a much more stable output around the average speed, so it might sustain more.

http://www.freeimagehosting.net/427ca

Anvil
11-13-2011, 06:03 AM
That depends on what NAND is used; async is noticeably slower than synchronous.

The Force 3 120GB (async) used in the Endurance test drops to ~190MB/s using incompressible data (e.g. MP3)

some_one
11-13-2011, 07:24 AM
For reads, a page is the minimum amount that can be read. For an erase operation it is a block, so a read of 512B ends up as a 4K or 8K read depending on NAND geometry. The file system on the OS has a cluster size; for instance, if the cluster size were 64K then writing a one-byte file ties up a full 64K cluster on disk. Does this mean it would be best to keep the cluster size the same as the page size?
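To put rough numbers on the allocation side of that (a quick sketch; 'cluster' here is simply the NTFS allocation unit chosen at format time):

def allocated_size(file_bytes, cluster_bytes=64 * 1024):
    # On-disk space consumed by a file, rounded up to whole clusters.
    clusters = max(1, -(-file_bytes // cluster_bytes))   # ceiling division
    return clusters * cluster_bytes

for size in (1, 4096, 100_000, 1_000_000):
    print(f"{size:>9} B file -> {allocated_size(size) // 1024} KiB allocated with 64K clusters")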



For large blocks (>256K) it indeed does not matter, but for small ones, like the default 64K value, it counts for sure.
Let's have a look.

Unaligned reads @64k block size.
http://img580.imageshack.us/img580/3381/hd64kunaligned.png
http://img585.imageshack.us/img585/839/dm64kunaligned.png

Aligned reads @64k block size.
http://img59.imageshack.us/img59/7333/hd64kaligned.png
http://img217.imageshack.us/img217/3392/dm64kaligned.png


The surprising one for me though is 4k block size.

Unaligned
http://img847.imageshack.us/img847/3391/hd4kunaligneda.png

Aligned
http://img256.imageshack.us/img256/5875/hd4kaligned.png

What happened to my 900MB/s read speed?

Ao1
11-13-2011, 07:36 AM
The nice thing with HD Tune is that you only have to run the read benchmark to find out the compressibility of the data on your hard drive at any given time. For a single drive (with TRIM) you can also see how much data on the drive has been compressed. For example, in post #118 I placed 35.81GB of data on the drive; the drive benchmark, however, shows that only 26GB of NAND was used.
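The arithmetic behind that is simple enough to script once you have the two figures (a sketch using the numbers above; the NAND figure is read off the HD Tune graph, so it is an estimate):

# Effective compression seen by the drive: logical data placed vs NAND actually consumed.
data_written_gb = 35.81   # data copied onto the drive (post #118)
nand_used_gb = 26.0       # approximate NAND footprint shown by the HD Tune read pass

saved = 100.0 * (1 - nand_used_gb / data_written_gb)
print(f"~{saved:.0f}% of the data was compressed away")   # roughly 27%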

Anyone who has been using an SF drive for their normal activities only has to run the read benchmark with the 4MB block setting to see how much their data has been compressed and how fast their read speeds are likely to be in real-life applications.

With regard to aligned vs non-aligned: in real life, non-aligned writes/reads occur all the time. Obviously the OS will try to make them aligned, but that does not mean it happens every time.

With regard to larger drives, I suspect that once everything has been written to at least once the results would be the same.

sergiu
11-13-2011, 09:22 AM
What happened to my 900MB/s read speed?
Maybe you hit some request limit.
How have you made HDTune requests aligned for the benchmark section? It seems I'm not having any luck finding the option...

@Ao1
Indeed, I believe HDTune is a good indicator of how effective the compression is.

Anvil
11-13-2011, 11:41 AM
900MB/s = 230,000 IOPS; that would be rather amazing at 4KB block size using just a few drives.

I expect 900MB/s would be possible at 16-32KB blocksize.
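That is just the throughput/IOPS identity; a quick check of the request rate needed for 900MB/s at each block size (purely arithmetic, using the figure from the post above):

TARGET_MB_S = 900

for block_kb in (4, 8, 16, 32):
    iops_needed = TARGET_MB_S * 1024 / block_kb
    print(f"{block_kb:>2} KB blocks: ~{iops_needed:,.0f} IOPS needed for {TARGET_MB_S} MB/s")

At 4KB that is ~230,400 IOPS, which is why the 900MB/s figure only reappears once the block size reaches 16KB or more.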

some_one
11-13-2011, 10:14 PM
How have you made HDTune requests aligned for the benchmark section? It seems I'm not having any luck finding the option...

AFAIK that option doesn't exist; you would have to ask the author to provide it.

You're right, Anvil: 16K gets me back to 900MB/s, 8K ~650MB/s, but of course the drive isn't as fast as that when real data has to be retrieved.

Ao1
11-14-2011, 12:36 AM
http://img6.imageshack.us/img6/4522/unledaic.png (http://imageshack.us/photo/my-images/6/unledaic.png/)

Look for the Extra Tests tab

Tekdemon
04-21-2012, 12:20 AM
Just an FYI in case anybody else was hunting for information: it appears that G.Skill also applied throttling to their SF1200 drives, much like OCZ. It's mentioned in their firmware update releases (for SMART monitoring of throttling), and I'm pretty sure I just triggered it by rebuilding (and re-imaging over) a RAID array. Write speeds went right to hell; right now my 4-drive RAID 0 is slower than a single drive in a lot of tests even though it's correctly 4K aligned.
*sigh* I think I'm done with dealing with this SandForce silliness.

canthearu
04-21-2012, 12:50 AM
Huh, shouldn't your drives have thousands of hours on them by now, given how old they are?

Writing a couple of TiB during imaging at that stage should not have triggered any form of lifetime throttling. Your problem is probably elsewhere.