-
SSD roundup: Vertex 3 vs M4 vs C300 vs 510 vs 320 vs x25-M vs F120 vs Falcon II
-
Awesome - good golly the C300's are quick!!!
-
Quote:
Originally Posted by
Brahmzy
Awesome - good golly the C300's are quick!!!
Yes, they are quick. I bought my 128GB for an AMD SB850 six months ago :D
But the Crucial M4 128GB, at the same price, will be the better buy: slightly lower 4K random read, but higher 4K random write :up:
-
Yes, but look at the M4 - I'm telling you, it looks like a 'sleeper'.
Everyone is looking at the Vertex 3, but with incompressible data the M4 owns it, and with compressible data the M4 is right there. So make a smart judgement: overall the M4 should be faster.
I personally find it very strange that the M4 beats the Vertex 3 in all of the 'old' Anandtech profiles, so then he creates new profiles, in which the V3 wins. Strange, that.
EDIT: look at 4K QD1. The M4 smokes with 26 compared to the V3 @ 17.
-
Quite strange that... the 128GB M4 is faster than the 256GB M4 in random reads. Results switched?
As far as Anand goes, eh... it wouldn't be a first. Anyone remember when they switched VMmark for the vApus benchmark to make Barcelona not look bad? :) (and thus showing that Netburst was really good :P)
-
look at this
http://www.maximumpc.com/article/fea...iewed?page=0,0
Quote:
PCMark Vantage, which mirrors real-world applications, actually gives the top spot to Crucial’s m4 SSD, with the Vertex 3 a close second.
Unfortunately those 'bright minds' at Maximum PC gave the V3 the overall win based on its max sequential speed, LOL. Oh yeah, and its QD32 random performance. Lol. Even at QD32, though, the M4 was right there.
There was a bug with M4 compatibility on certain controllers when the first round of reviews came out. That bug has apparently been addressed now, because in all the current testing I have seen the M4 is beating the V3 handily - ESPECIALLY with incompressible data, where it isn't even close. But it also seems to be much better than the initial reports in real-world stuff (e.g. PCMark Vantage traces)...
Take into account the history of reliability of a Crucial controller vs the history of reliability of a SandForce controller and it becomes a no-brainer, according to the latest data I've seen coming out...
This is exactly why I have been waiting around on the 25nm 'generation'. We need to let all of these drives out in the wild for a bit to mature, and then we should see a clear winner emerge.
-
Quote:
Originally Posted by
alfaunits
Quite strange that... the 128GB M4 is faster than the 256GB M4 in random reads. Results switched?
I think not.
This is the explanation from hardware.fr (Google translated :p:):
"The 128 GB version also gets results well above the 256 GB version. The explanation is actually quite simple: it comes down to the memory chips used. On the 128 GB, the chips are organised in pages - the smallest readable unit - of 4 KB, while on the 256 GB version the pages are 8 KB. Therefore, to read the 4 KB requested by the test, the 256 GB version actually has to read a full 8 KB page."
-
Quote:
Originally Posted by
hardware.fr
"The 128 GB version also gets results well above the 256 GB version of the explanation is actually quite simple, it is at the level of memory chips used. On the 128 GB, the chips used are organized with pages - the smallest readable - 4 KB, while the 256 GB version, the pages are 8 KB, therefore, to read 4 KB requested at the test, the 256 GB version should actually go to a full page of 8 KB"
It could be a bad translation, but I don't think this explanation is correct. As far as I know, all 25nm IMFT flash uses 8KB pages, and all the m4 SSDs use 25nm IMFT flash. Therefore, there should not be a difference in page size for 128GB and 256GB Crucial m4 SSDs.
-
Quote:
Originally Posted by
johnw
It could be a bad translation
It's not, that's exactly what is said in the article.
Translated from French by myself (sorry for any mistakes):
The Crucial M4 128 GB we tested combines a Marvell 88SS9174-BL02 controller, a Micron DRAM chip (on its back) and 16 Micron 29F64G08CFACB flash chips. These chips are 25nm and combine two 32 Gb dies. The page size is 4 KB and the block size is 1 MB.
The Crucial M4 256 GB distinguishes itself from the 128 GB version by its flash chips, which are 29F128G08CFAAB. Their capacity is doubled by the use of two 64 Gb dies. This time the page size is 8 KB and the block size is 2 MB.
-
Thanks for the translation. They seem quite specific that the Micron 29F64G08CFACB flash chips use 4KB pages and are 25nm-lithography chips. That is certainly interesting if it is true. Everything I had read previously had implied that all 25nm IMFT flash would use 8KB pages.
-
1 Attachment(s)
I like this review. Shame it is in French, as the translation does not come out too well. Here is a chart using data from the benchmarks, but focusing only on QD 1, 2 & 4. Results are sorted by QD1, lowest to the left, highest to the right. The C300 dominates as expected. Interesting to see that the Intel 310 performs consistently within this zone regardless of drive size.
It seems the Vertex 3 takes a performance hit with sequential writes after 30 seconds. The Intel 510 is the clear winner here.
The reality, though, is that in the practical tests there is no real difference between any of the SSDs. Even the ramdisk shows little improvement when you consider how much faster it is than an SSD.
-
Quote:
Originally Posted by
Khoral
It's not, that's exactly what is said in the article.
Translated from French by myself (sorry for any mistakes):
The Crucial M4 128 GB we tested combines a Marvell 88SS9174-BL02 controller, a Micron DRAM chip (on its back) and 16 Micron 29F64G08CFACB flash chips. These chips are 25nm and combine two 32 Gb dies. The page size is 4 KB and the block size is 1 MB.
The Crucial M4 256 GB distinguishes itself from the 128 GB version by its flash chips, which are 29F128G08CFAAB. Their capacity is doubled by the use of two 64 Gb dies. This time the page size is 8 KB and the block size is 2 MB.
I too am intrigued by this. Could you ask the reviewer to confirm?
-
2 Attachment(s)
4K Random Read is the most important thing for 97% of users.
So I would recommend the Vertex 3 only if it is the 240GB version. If not, then the M4 C400 128GB would be my best bet.
-
1 Attachment(s)
I agree about the 4k, but primarily only at QD1. Highly compressible, aligned 4K stats at QD32 are completely worthless to a desktop user.
At QD1 the Vertex 3 sucks. It's the worst performing SSD according to that review.
EDIT:
Here is a summary showing the percent increase of 4K RR performance at QD1. A bit surprising that the Vertex 3 120 is coming out faster than the 240 version. The G Skill results are also a bit surprising.
The C300 is still the fastest at 4K reads from QD 1 to 4 and it stands its ground with 4K writes as well.
-
^^ Amazing that the C300s look this good against the "newer" competition. :shrug:
Buy.com has the C300/128 for $220 + shipping; the C300/64 can be had for ~$120. :)
http://www.buy.com/prod/crucial-real...214560204.html
-
The 25nm drives deliver higher sequential performance, but at the expense of QD1 4K random read performance. You need to go above QD3 before you see a benefit. Whether that counts as progress, I guess, depends on your specific requirements.
From what I can see with hIOmon, sequential xfers typically consist of multiple small xfers. This can be demonstrated quite well below.
General OS
Largest single read I/O xfer size = 1.14MiB (1 occurrence)
Black OPs single player
Largest single read I/O xfer size = 28.24 MiB (1 occurrence)
Black Ops MP
Largest single I/O xfer size = 0.88 MiB (1 occurrence)
The largest xfer monitored came out at 28.24 MiB. It does not take long to read that if you can read at 100MB/s or 500MB/s. Copying large Photoshop files also results in multiple xfers, most of which are 1MiB (none larger, but a few much smaller ones as well).
What takes time is sequential xfers that are constructed from multiple small xfer sizes. We are talking kB not MiB.
-
Quote:
Originally Posted by
Ao1
The reality, though, is that in the practical tests there is no real difference between any of the SSDs. Even the ramdisk shows little improvement when you consider how much faster it is than an SSD.
This is so true. I've had a TON of different SSDs in a ton of different systems and honestly, especially on an overclocked system, you REALLY have to be paying attention to see much of a difference in normal day-to-day usage.
I've got a 60GB Vertex 1 and it actually boots far faster than 2 V3s (120s OR 240s) - like 3-4 seconds faster. The reason is that that box doesn't have a Creative sound card in it. But boot times mean nothing to me - they do to some folks - and any SSD is going to boot about the same, within a second or two of the others.
As far as general browsing and light usage in W7 x64, there's ZERO difference between my SSDs. ONLY when I'm doing heavy, heavy multitasking, application installs or large amounts of file copying do I see a difference - and that is maybe 2% of my time. I can slow my X25-M G2 160 down to a crawl at work with about 40 things open at any given time. That's where I'd love to get a faster SSD. Gaming, yes, levels do load faster in R0 with fast SSDs - tested this a thousand times.
I think we've gotten to the point where the fast drives will only really shine if you're a true 24/7 power user with applications or processes that hammer your storage up and down.
Quote:
Originally Posted by
Metroid
4K Random Read is the most important thing for 97% of users.
This is what we all agree on, yet do the QD1 4K random read differences between any of the drives in this review make a REAL-WORLD difference in what we experience?
Yes, on paper, and hIOmon tells us what goes on in the background, but honestly Windows does such a good job of never exposing any of that to the end user - it has been optimized for spinning disks.
I mean, if somebody could convince me that 2 C300 256GBs in R0 would out-perform my 2 V3 240GBs in overall snap/real-world experience, I'd buy two, test 'em and sell whichever is the slower pair. Vice versa. I am vendor agnostic. I honestly think the collection of various benchmark tools we have, along with what the reviewers show, still doesn't tell the whole story. We need a non-subjective "user experience" benchmark. I think Anand tries to do that in his light/heavy workload tests, but again I don't know if those results reflect real-world usage.
As far as Anand's new 2011 tests somehow favoring the V3s - I don't think that's on purpose; some say Anand has a preference for the Intel drives, if any.
The M4 128GB drives do have me very interested - especially if what they're saying about 4K pages on the smaller NAND is true.
-
^^ For competitive PCM05 and PCMV it matters, but yes, I agree - in everyday use I couldn't tell a difference between most SSDs.
I can tell a difference between an SSD and an ACARD - but that's flash vs DDR2, probably not a valid comparison.
Using two ACARD 9010s (4xR0) on the P67 PCH is probably the hottest bootable setup, but not really practical.
Edit - using an ACARD as your OS drive has other problems as well: huge cost, and losing power = losing data.
-
^^ Mr. OneHertz - have you written a .bat to xfer files to your iodrive?
The thought was to write a little script to move all frequently accessed files to the iodrive (or ramdisk?) at startup.
Put the .bat in the Startup folder?
An X58 board maxed out with DDR3 for a large triple-channel ramdisk?
A ramdisk should be faster than either the iodrive or the ACARDs, I would think.
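Not claiming this is the .bat anyone here has actually written - just a rough sketch of the idea in Python, with hypothetical paths, that could be launched from the Startup folder via a shortcut:
Code:
# Sketch only: at startup, copy a list of frequently used files/folders onto the
# fast drive (iodrive/ramdisk) so later reads hit it instead of the SSD.
# FAST_DRIVE and HOT_ITEMS are hypothetical placeholders.
import shutil
from pathlib import Path

FAST_DRIVE = Path("R:/hot")                      # hypothetical ramdisk/iodrive mount
HOT_ITEMS = [Path("C:/Games/SomeGame/levels"),   # hypothetical frequently-read data
             Path("C:/Work/project.psd")]

def stage_hot_files():
    FAST_DRIVE.mkdir(parents=True, exist_ok=True)
    for src in HOT_ITEMS:
        dst = FAST_DRIVE / src.name
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)   # re-copy each boot
        else:
            shutil.copy2(src, dst)

if __name__ == "__main__":
    stage_hot_files()   # run this from the Startup folder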
-
Quote:
Originally Posted by
Brahmzy
I mean, if somebody could convince me that 2 C300 256GBs in R0 would out-perform my 2 V3 240GBs in overall snap/real-world experience, I'd buy two, test 'em and sell whichever is the slower pair. Vice versa. I am vendor agnostic. I honestly think the collection of various benchmark tools we have, along with what the reviewers show, still doesn't tell the whole story. We need a non-subjective "user experience" benchmark. I think Anand tries to do that in his light/heavy workload tests, again I don't know if those results show up in real-world testing.
IIRC, Anand's 2011 real-world tests are a 'replay' of 1-2 weeks' worth of normal use from him. The difference in disk-busy time between the top disks is a matter of seconds... spread out over the course of days in the real world. So yeah, if Anand's 2011 tests are fully indicative of real-world use, we have gotten to the point where SSDs are so fast that they're just not noticeably different any longer (in real usage).
-
I just can't understand how Anandtech comes up with an average QD of 6.09 for the typical workload, 3.59 for the heavy load and 7.76 for the gaming workload.
Replicating weeks of I/O activity and applying it at those QDs (and over a short duration) completely invalidates the results for me in the context of demonstrating a tangible real-life performance benefit. I rarely see max QDs at those levels, and the average is always 1 or a fraction above 1 for reads. Average write QDs are a bit higher: anything from 1 to 3.
Maybe he was using a floppy disk when he recorded his I/O activity. :shrug:
At least in this review the guy took the time to monitor real-life applications, which show hardly any difference between SSDs, conflicting completely with what the graphs from Anandtech portray. I know which set of results I have more faith in.
-
I'm curious what the 60GB V3 specs/performance are going to be like. I'm moving to Sandy Bridge in a week or so and will probably want to upgrade my 2x64GB Crucial M225 setup to SATA III drive(s). Just not sure if I want to do 2x60-64GB drives in RAID 0 again or just a single 120-128GB drive. At least I've narrowed it down to Crucial (C300 or M4) or OCZ (V3). What are your thoughts?
-
Quote:
Originally Posted by
Khoral
It's not, that's exactly what is said in the article.
Translated from French by myself (sorry for any mistakes):
The Crucial M4 128 GB we tested combines a Marvell 88SS9174-BL02 controller, a Micron DRAM chip (on its back) and 16 Micron 29F64G08CFACB flash chips. These chips are 25nm and combine two 32 Gb dies. The page size is 4 KB and the block size is 1 MB.
The Crucial M4 256 GB distinguishes itself from the 128 GB version by its flash chips, which are 29F128G08CFAAB. Their capacity is doubled by the use of two 64 Gb dies. This time the page size is 8 KB and the block size is 2 MB.
Can you translate/explain a bit more of what Marc means when he describes benchmark traces being content with identifying the type of access without considering the content? (Page 6) :p:
I don't really understand what he is talking about, or how big the advantage is. Is there a delay while the data source is validated, which does not occur in a benchmark, or does the trace timing/sequence allow compression to take place (with an SF drive) in the time saved by not validating? :shrug:
-
Quote:
Originally Posted by
Ao1
What takes time is sequential xfers that are constructed from multiple small xfer sizes. We are talking kB not MiB.
Even if they are 64KB (which is a typical pre-Vista cache manager buffer size), it won't change much. A typical copy can use async I/O, which essentially turns the 64KB sequential transfers into QD>1 "randoms".
Well, on an SSD there is really no sequential after some point anyway, since the NAND gets rearranged and the LBA-sequential mapping is hardly ever physically sequential.
Non-async 64KB "sequentials" would therefore be essentially random 64KB at the physical level - or random 4KB at a higher QD at the physical level.
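A minimal sketch of that first point (Python on a POSIX box; the chunk size and queue depth are my own picks, and this says nothing about how Windows' copy engine actually works): a copy loop that pushes its 64 KiB reads through a small thread pool keeps several requests in flight, so the drive sees QD > 1 even though the offsets are logically sequential.
Code:
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 64 * 1024          # typical cache-manager sized transfer
QUEUE_DEPTH = 4            # number of overlapped requests

def read_chunk(fd, offset):
    return os.pread(fd, CHUNK, offset)   # positional read, safe across threads

def copy_async(src_path, dst_path):
    fd = os.open(src_path, os.O_RDONLY)
    size = os.fstat(fd).st_size
    offsets = range(0, size, CHUNK)
    with open(dst_path, "wb") as dst, ThreadPoolExecutor(QUEUE_DEPTH) as pool:
        # map() keeps the results in offset order, but up to QUEUE_DEPTH reads
        # are outstanding at once -- the drive sees QD > 1, not a single
        # sequential stream.
        for chunk in pool.map(lambda off: read_chunk(fd, off), offsets):
            dst.write(chunk)
    os.close(fd)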
-
Quote:
Well, on an SSD, there is really no sequential after some point, since the NAND gets rearranged and the LBA sequential mapping is hardly ever physically sequential.
True, I have thought of this many times. Sprinkle in some NCQ at the OS level and you should have truly random I/O all the time when doing actual work, not benchmarking.
-
Quote:
Originally Posted by
Metroid
4K Random Read is the most important thing for 97% of users.
So I would recommend the Vertex 3 only if it is the 240GB version. If not, then the M4 C400 128GB would be my best bet.
I've been using FancyCache for a while; it reports that writes dominate, not reads, for XP x64 at least.
-
I am also waiting for results for the OCZ Vertex 3 Max IOPS versions. I wonder how much faster they will be in real-world performance. They should be coming out shortly.
-
Quote:
Originally Posted by
Kent10
I am also waiting for results for the OCZ Vertex 3 Max IOPS versions. I wonder how much faster they will be in real-world performance. They should be coming out shortly.
In the real world there is no difference between the vanilla Vertex 3 and the Max IOPS. In benchmarks there is a small difference ;)
-
Quote:
Originally Posted by
keiths
I've been using FancyCache for a while; it reports that writes dominate, not reads, for XP x64 at least.
Is that a cache within the SSD?
I don't see any benefit to using a cache within an SSD, which is why I don't use one, but that might change in the future. This reminds me of compressible vs incompressible data - Vertex 3 vs M4 C400 all over again.
-
Quote:
Originally Posted by
alfaunits
Even if they are 64KB (which is a typical pre-Vista cache manager buffer size), it won't change much. A typical copy can use async I/O, which essentially turns the 64KB sequential transfers into QD>1 "randoms".
Well, on an SSD there is really no sequential after some point anyway, since the NAND gets rearranged and the LBA-sequential mapping is hardly ever physically sequential.
Non-async 64KB "sequentials" would therefore be essentially random 64KB at the physical level - or random 4KB at a higher QD at the physical level.
Totally agree. What I'm trying to explain is that something like AS SSD Benchmark uses multiple 16MiB xfers for the sequential speed test. Multiple xfers of that size are not common. I've only seen single occurrences of xfers in that region with Black Ops, compared to 39,023 occurrences of 4K read xfers, most of which were "sequentially" read over a short duration.
I believe this is why I have not been able to observe xfer speeds above 100MB/s when loading games; i.e. the read transfers consist of multiple small xfers, well below the 16MiB needed to reach the 250MB/s that is achievable with AS SSD.
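A back-of-the-envelope model of that effect (Python; the 60 µs per-request latency and the 250 MB/s ceiling are my own placeholder numbers, not measurements): each request pays a fixed latency on top of its transfer time, so a stream of small QD1 reads can never get near the drive's rated sequential speed.
Code:
# Rough model: effective throughput of back-to-back reads at QD1, where every
# request costs a fixed latency plus the time to move its data at the drive's
# sequential rate. Numbers are placeholders, not measurements.
def effective_mb_s(xfer_kib, latency_us=60.0, seq_mb_s=250.0):
    xfer_mb = xfer_kib / 1024.0
    time_s = latency_us / 1e6 + xfer_mb / seq_mb_s   # per-request latency + transfer
    return xfer_mb / time_s

for size in (4, 64, 1024, 16 * 1024):                # 4 KiB ... 16 MiB
    print(f"{size:>6} KiB reads at QD1 -> ~{effective_mb_s(size):6.1f} MB/s")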
-
Quote:
Originally Posted by
Ao1
Can you translate/explain a bit more of what Marc means when he describes benchmark traces being content with identifying the type of access without considering the content? (Page 6) :p:
Sure :)
Quote:
Typiquement si nous avions enregistré les accès effectués par Photoshop dans le traitement par lot et utilisé un tel logiciel, nous aurions pu avoir des écarts allant du simple au double entre les SSD, alors qu’à l’usage il n y a pas de différence. Enfin, ces traces se contentent de répertorier le type d’accès sans prendre en compte leur contenu, ce qui peut avantager les contrôleurs SandForce qui sont alors mis dans un cas favorable si le logiciel génère des données compressibles alors que l’utilisation servant de trace se basait sur des données déjà compressées.
Note: when he speaks about "traces", he means PCMark Vantage traces.
Typically, if we had recorded the accesses made by Photoshop during the batch processing and used such software (note by me: meaning PCMark Vantage), we could have seen gaps of up to 2x between SSDs, where no difference would be felt in actual use. Lastly, these traces just list the access type without considering the content, which can favour SandForce controllers: they are put in a favourable situation if the software generates compressible data, whereas the usage that produced the trace was based on data that was already compressed.
So, it's already complicated in French xD
I understand it like this: first, he says that with Photoshop you can have performance gaps when benchmarking where, in actual use, you would see no difference.
Then he says that PCMark Vantage generates data that may or may not be compressible (not sure about that part, but that's how I understand it). So you can't know with certainty whether a SandForce controller scores well because it was fed compressible data, or whether it is really good all the time.
About the chips, what do you want me to ask exactly ?
-
Hi Khoral, thanks for helping out. :up: Marc seems to be very well informed. :) That review is one of the best I have read. The trace issue appears to be a subtlety that he has picked up on, but even when translated to English I'm struggling to understand it. It would be great if he could explain it in more detail.
With regard to the chips, it's the issue of 25nm having a 4kB page size in the Crucial M4. My understanding (or should I say assumption) was that all 25nm NAND had 8kB page sizes.
-
Quote:
Originally Posted by
Khoral
About the chips, what do you want me to ask exactly ?
Maybe you could ask how they know the page size of the flash chips. Do they have a spec sheet from Micron that gives the page size for those flash chips? Is it possible they got the specs of the 32Gbit 34nm flash chips confused with the 32Gbit 25nm flash chips?
-
Ok, understood.
I mailed him, I'll keep you updated when I get some info :)
-
Quote:
Originally Posted by
Metroid
Is that a cache within the SSD?
I don't see any benefit to using a cache within an SSD, which is why I don't use one, but that might change in the future. This reminds me of compressible vs incompressible data - Vertex 3 vs M4 C400 all over again.
You're sidetracked; the point is that it logs activity, which shows writes are much bigger than reads. As for your sidetrack, FancyCache can cache writes and in doing so consolidates them, which results in fewer writes to the underlying device - an obvious benefit to SSD longevity. As for speed, unless your SSD does 4GB/s, it's also faster.
-
Quote:
Originally Posted by
johnw
Maybe you could ask how they know the page size of the flash chips. Do they have a spec sheet from Micron that gives the page size for those flash chips? Is it possible they got the specs of the 32Gbit 34nm flash chips confused with the 32Gbit 25nm flash chips?
Hi John,
It's intriguing the way performance is coming out at QD1 with 4K random reads:
• C300 128 is 3% faster than the C300 256
• OCZ V3 120 is 5% faster than the V3 240
• M4 128 is 19.7% faster than the M4 256
At QD2 and above the larger-capacity models become faster. That makes me think it is not an anomaly in the testing, although maybe that can't be excluded when the differences are less than 5%.
With the Intel 510 & 320, however, there is no significant difference between the different capacity drives at QD1. Above QD1 the 120GB version of the 510 is faster than the 250GB. With the 320 it is the other way around... but there we are comparing different controllers.
Quote:
Originally Posted by
Khoral
Ok, understood.
I mailed him, I'll keep you updated when I get some info :)
Cool thanks. Did you ask him about the trace issue as well?:p:
-
I didn't, as I don't think I understand it well enough to be able to ask a "correct" question xD
And he has already answered me o/
So, he says that it's not only the 25nm process that determines the 8 KB page size, but also the capacity of the chips. So you can have a 4 KB page size with a 25nm NAND flash chip.
And when I asked him how he knows that, he told me that he was able to retrieve some datasheets saying so (as he didn't tell me which or how, I admit I can't ask for more ^^)
-
Quote:
Originally Posted by
keiths
You're sidetracked; the point is that it logs activity, which shows writes are much bigger than reads. As for your sidetrack, FancyCache can cache writes and in doing so consolidates them, which results in fewer writes to the underlying device - an obvious benefit to SSD longevity.
It seems you use programs that write more than they read, which is not the case for me or many others. I do not use a cache within an SSD, so the longevity of my SSD will be many times greater than if I were using FancyCache.
Quote:
As for speed, unless your SDD does 4GB/s, it's also faster.
It can't beat RAM, and that is my point.
-
Quote:
Originally Posted by
Metroid
It seems you use programs that write more than they read, which is not the case for me or many others. I do not use a cache within an SSD, so the longevity of my SSD will be many times greater than if I were using FancyCache.
The use in question is caching of the OS drive, not a specialized use. A write cache reduces the number of writes to the drive; SSDs have limited writes; ergo, yes, it will extend the life of SSDs.
> It can't beat RAM, and that is my point.
The point I saw you making was that there's no point in using a cache with an SSD. How FancyCache (memory) being faster than an SSD makes that point, I don't get - or I missed another point you were making.
-
^^ A significant portion of I/O activity occurs within the Windows file system without touching the storage device.
I think the objective of FancyCache is to manage the Windows file system cache better, by using different algorithms to try to keep useful data in cache, thus limiting the need to re-read dropped cache data from the device.
Better cache management can in theory be very beneficial if the storage device is slow, or if the cache management program can be specially tweaked for a particular application. I've tried FancyCache with an SSD and it made no perceivable performance difference for me.
Windows does an excellent job of managing cache for general use. Of course there will always be situations where a more focused cache management system could do better for a particular task, but it might at the same time also slow things down elsewhere in the process.
SSDs are so fast that they negate the benefit that might be seen with an HDD.
I'm surprised you are seeing more writes. Normally I would say it's a 10-to-1 (or lower) ratio between reads and writes. Do you do a lot of tasks that write data?
Regarding write caching, don't forget that SSDs also combine writes to reduce wear. This happens within the SSD itself.
My 2c anyway. Maybe there is a better explanation of how the OS cache/cache programs work.
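For anyone wondering what a block-level read cache actually does, here is a toy sketch (Python; this says nothing about FancyCache's real algorithms): recently used blocks stay in RAM, so repeat reads never touch the device.
Code:
from collections import OrderedDict

class LRUReadCache:
    def __init__(self, capacity_blocks, read_from_device):
        self.cache = OrderedDict()
        self.capacity = capacity_blocks
        self.read_from_device = read_from_device   # fallback: real device read

    def read(self, lba):
        if lba in self.cache:                      # cache hit: no device I/O
            self.cache.move_to_end(lba)
            return self.cache[lba]
        data = self.read_from_device(lba)          # miss: go to the SSD/HDD
        self.cache[lba] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)         # evict least recently used
        return data

# e.g. cache = LRUReadCache(capacity_blocks=4096, read_from_device=my_block_reader)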
-
This could be of some use... ;)
http://translate.google.fr/translate...n&hl=&ie=UTF-8
Maybe that could explain how the V3 ends up at the top in Anand's benches.
-
Back on topic :)
The Micron product sheet states: Page size x8: 4320 bytes (4096 + 224 bytes). This appears to be the same for:
MT29F32G08CBACA/ EACA/ FACA/ XACA/ ECCB & FACB
-
Quote:
Originally Posted by
keiths
The use in question is caching of the OS drive, not a specialized use. A write cache reduces the number of writes to the drive; SSDs have limited writes; ergo, yes, it will extend the life of SSDs.
> It can't beat RAM, and that is my point.
The point I saw you making was that there's no point in using a cache with an SSD. How FancyCache (memory) being faster than an SSD makes that point, I don't get - or I missed another point you were making.
When I asked about a cache within an SSD I meant using the SSD itself as the cache. My point was not about caching with an SSD - it was about caching within an SSD. I had no idea that FancyCache is a memory-based cache program, which is why I asked about it in the first place.
As for the OS drive writing more than reading: if there is not much to read, it won't read. As soon as a game plus some looped programs are open, the reads will increase. I don't have time to test to confirm this, but many around here can, and I stand by my quote: "4K Random Read is the most important thing for 97% of users."
-
Quote:
Originally Posted by
Khoral
If you only read/write across a limited span of the SSD you get very good results - wear levelling can kick in. If you read/write across the complete span of the SSD it will really struggle. Off the top of my head, Intel gives specs based on an 8GB span. They also now give data for performance across the full span, and IOPS drop like a stone - 300 or 400 from recollection.
If you write lots of small files and then overwrite them, I think it also slows things down compared to overwriting larger files.
Benchmarking SSDs is a nightmare.
-
;)
Quote:
Originally Posted by
Khoral
Aren't these 34nm ?
I'd guess yes, but I only got to see an extract from the product datasheet from Marc. That kind of info is confidential/proprietary, so it's not possible for the public to get it from Micron's web site. I did try ;)
-
Same here ^^ Couldn't find anything there.
Quote:
Benchmarking SSDs is a nightmare.
That's so true. But it makes me ask, "Is benchmarking SSDs useful?" (not for "who has the biggest e-penis" purposes, but for 24/7 use)
-
FancyCache is in beta, available as a download. Hitting just the cache, 4K random QD1 read/write is 655/555MB/s in CrystalDiskMark, and with a 1-2GB cache that's where you are 95-99% of the time.
-
Quote:
Originally Posted by
Khoral
Same here ^^ Couldn't find anything there.
That's so true. But it makes me ask, "Is benchmarking SSDs useful?" (not for "who has the biggest e-penis" purposes, but for 24/7 use)
That is why it is good that Marc ran real-life applications and timed them. He is waiting for a mod to activate his account and then he will hopefully join us. :)
-
Quote:
Originally Posted by
Ao1
;)
I'd guess yes, but I only got to see an extract from the product datasheet from Marc. That kind of info is confidential/proprietary, so it's not possible for the public to get it from Micron's web site. I did try ;)
Hi ;)
In fact it's the datasheet for
MT29F32G08CBACA, MT29F64G08CEACA, MT29F64G08CFACA,
MT29F128G08CXACA, MT29F64G08CECCB, MT29F64G08CFACB
MT29F64G08CFACB is the one in the M4 128 GB (at least in mine).
MT29F32G08CBACA is the one in the 25nm Vertex 2 with 32 Gb dies, for example here:
http://www.anandtech.com/show/4256/t...review-120gb/3
In the datasheet I could not find anything about the process, but since the endurance is rated at 3000 PROGRAM/ERASE cycles, and since the SSDs above (M4 and 25nm Vertex 2) use 25nm NAND... we can assume it's 25nm ;)
-
Quote:
Originally Posted by
Ao1
If you only read/write across a limited span of the SSD you get very good results - wear levelling can kick in. If you read/write across the complete span of the SSD it will really struggle. Off the top of my head, Intel gives specs based on an 8GB span. They also now give data for performance across the full span, and IOPS drop like a stone - 300 or 400 from recollection.
If you write lots of small files and then overwrite them, I think it also slows things down compared to overwriting larger files.
Benchmarking SSDs is a nightmare.
First... sorry for my very bad English :eek:
The problem here is with the Vertex 3 only, from what I've seen so far. Perhaps the SandForce SF-1200 too, but I haven't tested it.
Random Read 4KB on the M4 256 GB (for example) is 58 MB/s at QD3 whether you test it on a 1 GB file, an 8 GB file, and so on.
Random Read 4KB on the Vertex 3 240 GB is 72 MB/s on a 1 GB file, 72 MB/s on a 2 GB file, 52 MB/s on a 4 GB file, 44 MB/s on an 8 GB file and 44 MB/s on a 16 GB file.
In this case the size of the file = the size on the NAND, since I use incompressible (random) data in IOMETER.
But if you use highly compressible data (not very realistic, but...), with an 8 GB file you get 74 MB/s. Why? Because the 8 GB file is highly compressed by the SandForce controller and perhaps takes something like 2 GB of real NAND, or less.
You didn't test random access over 8 GB but over something like 2 GB.
If I test random access within a 45GB file of highly compressible data, for example, I get 45 MB/s.
The fact is that if you compare the random read of the Vertex 3 vs the M4 using a (very) small portion of the SSD, then the Vertex 3 shows better random read than the M4. But over a bigger part of the SSD, the M4 is faster.
For me, performance over a very small part of an SSD is less important. Random read is important for heavy multitasking, and heavy multitasking may need to read data from a large part of the SSD.
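A quick back-of-the-envelope sketch of Marc's point (Python; the 4:1 compression ratio is purely my assumption, not a measured figure): with compressible data the physical NAND span actually exercised is much smaller than the logical test-file size, so "full-span" random reads behave like small-span reads.
Code:
# Sketch only: how much NAND a compressible test file might really occupy.
def physical_span_gb(logical_gb, compression_ratio=4.0):   # assumed 4:1 ratio
    return logical_gb / compression_ratio

for logical in (1, 8, 45):
    print(f"{logical:>2} GB test file -> ~{physical_span_gb(logical):.1f} GB of NAND "
          f"actually addressed (assuming 4:1 compressibility)")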
-
Quote:
Originally Posted by
Ao1
That is why it is good that Marc ran real-life applications and timed them. He is waiting for a mod to activate his account and then he will hopefully join us. :)
For a lot of things, SSDs are... CPU limited.
For example, if I go from a stock i7-2600K to an i7-2600K @ 4.5 GHz on Crysis 2 level loading:
WD 2TB Black: 21.5s => 19.2s (-2.3s)
X25-M 120: 18.1s => 15.5s (-2.6s)
Vertex 3 240: 17.1s => 14.4s (-2.7s)
In heavy multitasking perhaps we can find some (small) difference, but it's hard to find a realistic case that I can use to benchmark.
For example, during the bench I tried to combine the launch of 3ds/Photoshop/Word/Excel (which in fact is already heavy) with a file copy (read+write) limited to 5 MB/s, and I didn't get any difference at all.
In the end, more than CPU limited, I think modern SSDs are now "user" limited :D
-
Quote:
Originally Posted by
Ao1
^^ A significant portion of I/O activity occurs within the Windows file system without touching the storage device.
I think the objective of FancyCache is to manage the Windows file system cache better, by using different algorithms to try to keep useful data in cache, thus limiting the need to re-read dropped cache data from the device.
Better cache management can in theory be very beneficial if the storage device is slow, or if the cache management program can be specially tweaked for a particular application. I've tried FancyCache with an SSD and it made no perceivable performance difference for me.
Windows does an excellent job of managing cache for general use. Of course there will always be situations where a more focused cache management system could do better for a particular task, but it might at the same time also slow things down elsewhere in the process.
SSDs are so fast that they negate the benefit that might be seen with an HDD.
I'm surprised you are seeing more writes. Normally I would say it's a 10-to-1 (or lower) ratio between reads and writes. Do you do a lot of tasks that write data?
Regarding write caching, don't forget that SSDs also combine writes to reduce wear. This happens within the SSD itself.
My 2c anyway. Maybe there is a better explanation of how the OS cache/cache programs work.
Again, it's not a special setup or usage pattern that's hitting writes harder; it's bog standard on both - regular XP x64 doing regular use: browsing, gaming, movies, text editing, etc. It doesn't take long to see the write-heavy usage pattern of Windows: install FancyCache and keep the performance statistics window open. FancyCache is better than the drive's own write combining; it can combine more because it can be set to delay writes longer and over a larger pool of data (my setup being a ten-minute write delay on a 2GB cache).
> SSDs are so fast that they negate the benefit that might be seen with an HDD.
There is a point of diminishing returns, and FancyCache gets you there without having to wait for next-gen SSDs or spending a few grand on a caching controller and SSD RAID setup or a high-priced PCIe SSD drive.
-
So I'm still a bit confused about the take-home message from this thread. When I discuss SSD purchases with my friends, what am I going to recommend for the future? A Vertex 3 120/240GB or an M4 128/256GB?
-
Quote:
Originally Posted by
antiacid
So I'm still a bit confused about the take-home message from this thread. When I discuss SSD purchases with my friends, what am I going to recommend for the future? A Vertex 3 120/240GB or an M4 128/256GB?
The cheaper one ;)
-
So are there still issues running FancyCache with RAID0'd SSDs? Many folks talked about corrupting their arrays. I haven't kept up with it since I read those stories.
I'd love to give it a shot, but don't want another hassle on my hands...
-
Welcome to the forum, Marc.
+1
The cheapest, or the one that's available.
(M4s are hard to find)
-
Quote:
Originally Posted by
Brahmzy
So are there still issues running FancyCache with RAID0'd SSDs? Many folks talked about corrupting their arrays. I haven't kept up with it since I read those stories.
I'd love to give it a shot, but don't want another hassle on my hands...
Not having a RAID SSD setup, I don't know.
-
Quote:
Originally Posted by
Marc HFR
For a lot of things, SSDs are... CPU limited.
For example, if I go from a stock i7-2600K to an i7-2600K @ 4.5 GHz
WD 2TB Black: 21.5s => 19.2s (-2.3s)
X25-M 120: 18.1s => 15.5s (-2.6s)
Vertex 3 240: 17.1s => 14.4s (-2.7s)
In heavy multitasking perhaps we can find some (small) difference, but it's hard to find a realistic case that I can use to benchmark.
For example, during the bench I tried to combine the launch of 3ds/Photoshop/Word/Excel (which in fact is already heavy) with a file copy (read+write) limited to 5 MB/s, and I didn't get any difference at all.
In the end, more than CPU limited, I think modern SSDs are now "user" limited :D
Hi Marc, welcome to XS. :welcome:
That does not surprise me. It's beyond my understanding to know how I/O processing works in either hardware or software, but both seem to like to wait for an I/O to finish before the next I/O is requested from the storage device. How quickly that I/O is processed by the CPU/RAM would therefore seem to have an impact.
Right now I'd say that if you are using an HDD, an SSD would be the best upgrade, but if you already have an SSD, a faster CPU would be the better upgrade choice. (Assuming you had to choose one or the other :) )
Reliability is also a big issue for me and would be my number one choice, even over price.
-
Quote:
Originally Posted by
keiths
Again.......
Not sure if you missed it but F@32 created a thread about FancyCache:
http://www.xtremesystems.org/forums/...d.php?t=267879
-
Quote:
Originally Posted by
Ao1
The XS threads on FancyCache are where I found out about FancyCache.
-
Quote:
Originally Posted by
Brahmzy
So are there still issues running FancyCache with RAID0'd SSDs? Many folks talked about corrupting their arrays. I haven't kept up with it since I read those stories.
I'd love to give it a shot, but don't want another hassle on my hands...
I've got no issues with FancyCache and my R0 array. I've been using it for months without so much as a hiccup.
--Matt
-
Quote:
Originally Posted by
Ao1
Reliability is also a big issue for me and would be my number one choice, even over price.
From the numbers I get from a big French etailer, for reliability we have:
Intel X25-M > Crucial C300 > OCZ Vertex 2
From 6 to 12 months of use, the X25-M is at a 0.3% return rate, vs 1% for the C300 and 3.6% for the Vertex 2.
Hope this helps.
-
Quote:
Originally Posted by
Marc HFR
Random Read 4KB on the M4 256 GB (for example) is 58 MB/s at QD3 whether you test it on a 1 GB file, an 8 GB file, and so on.
Random Read 4KB on the Vertex 3 240 GB is 72 MB/s on a 1 GB file, 72 MB/s on a 2 GB file, 52 MB/s on a 4 GB file, 44 MB/s on an 8 GB file and 44 MB/s on a 16 GB file.
Thanks for posting that observation. That might explain something that I have been wondering about since the first reviews of the V3 came out. With CDM, the 4KB QD1 random reads are around 35 MB/s, but with AS-SSD, they are only 20 MB/s. With your observation, this could be explained by AS-SSD performing random reads over a larger span than CDM.
Or maybe not. I think AS-SSD uses a 1GB test file, and CDM is often run with a 1GB test file, which would suggest that the spans are not the source of the discrepancy. I suppose AS-SSD might use full-span for random reads (ignoring its test file), since for reads, the test file is not necessary. But I don't know. Maybe this could be determined with hIOmon? Does it record the LBAs that are accessed?
-
I've been meaning to check that for a while. Will do it now.
-
Quote:
Originally Posted by
johnw
Maybe this could be determined with hIOmon? Does it record the LBAs that are accessed?
I think so
-
Quote:
Originally Posted by
johnw
Thanks for posting that observation. That might explain something that I have been wondering about since the first reviews of the V3 came out. With CDM, the 4KB QD1 random reads are around 35 MB/s, but with AS-SSD, they are only 20 MB/s. With your observation, this could be explained by AS-SSD performing random reads over a larger span than CDM.
Or maybe not. I think AS-SSD uses a 1GB test file, and CDM is often run with a 1GB test file, which would suggest that the spans are not the source of the discrepancy. I suppose AS-SSD might use full-span for random reads (ignoring its test file), since for reads, the test file is not necessary. But I don't know. Maybe this could be determined with hIOmon? Does it record the LBAs that are accessed?
Hi johnw,
The short answer to both of your questions is "Yes".
Along these lines, please see the hIOmon thread post #297 where I used the hIOmon "I/O Operation Trace" feature along with the hIOmon "Summary" metrics feature in analyzing only the "Access Time" option of AS SSD.
The hIOmon "I/O Operation Trace" feature option enables you to collect, display, and export an individual record for each and every I/O operation for those I/O operations specified to be monitored by hIOmon. This "I/O Operation Trace" record can include the starting address (basically LBA in the case of device I/O operations) of the data transfer and the length of the data transfer.
The hIOmon software also provides an "Access Range Summary" option, which enables you to create a CSV-based export file where each record/row within this export file contains the "Access Range Summary" summarized I/O operation performance metrics for a separate, distinct "Access Range" associated with a specific "physical" device. An "Access Range" is an address span to which read and/or write I/O operations have performed data transfers as observed by the hIOmon I/O Monitor. BTW, the "Access Range Summary" metrics are based upon the "I/O Operation Trace" metrics that have been collected by hIOmon.
But that's another story. :)
-
Quote:
Originally Posted by
overthere
The short answer to both of your questions is "Yes".
I'm not sure I follow. There is a wealth of information in your posts, so I might be missing (or misinterpreting) something.
Did your previous posts specify the span (range) of I/O accesses for AS-SSD 4KB random READ QD=1? If so, does it span the full SSD, part of the SSD, or only the 1GB test file?
Also, it looks like your posts #283 and #285 in the hIOmon thread are identical. Was there supposed to be a difference?
-
AS SSD 1.5.3784.37609 - 4K Test Only
1 - Test file - Write
Sectors - 98668848 to 100764976
2,049 0.5 MB xfers. Total write = 1024.5 MB
2nd - 4K Writes
Sectors - 98668848 to 100765984
175,923 4K xfers. Total write = 687.19MB
3rd - 4K Reads
Sectors - 98668848 to 100765952
42,609 4K xfers. Total read = 166.43 MB
The above is an approximation. There was a bit of activity outside of the sectors above but I excluded it as I suspect it was OS related.
Basically it looks like AS SSD conditions a small span of the drive with 0.5MB writes. It then reads and writes within the same sectors.
Now for CDM
-
Quote:
Originally Posted by
Ao1
AS SSD 1.5.3784.37609 - 4K Test Only
1 - Test file - Write
Sectors - 98668848 to 100764976
2,049 0.5 MB xfers. Total write = 361.5MB
For me, 2049 x 0.5 MB xfers = 1024.5 MB :confused:
-
Quote:
Originally Posted by
johnw
I'm not sure I follow. There is a wealth of information in your posts, so I might be missing (or misinterpreting) something.
Did your previous posts specify the span (range) of I/O accesses for AS-SSD 4KB random READ QD=1? If so, does it span the full SSD, part of the SSD, or only the 1GB test file?
Also, it looks like your posts #283 and #285 in the hIOmon thread are identical. Was there supposed to be a difference?
Sorry for the confusion.
The hIOmon thread post #283 briefly discusses only write I/O operations.
The subsequent hIOmon thread post #285 briefly mentions only read I/O operations and notes that AS SSD performed its read I/O operations to the physical device directly at the physical device level within the OS I/O stack.
The hIOmon thread post #297 provides an overall summary with some accompanying details. It emphasizes again that AS SSD performed its read I/O operations (which were random) directly to the physical device (in fact, hIOmon observed only two file read I/O operations, both of which had a data transfer length of zero for the "AS-SSD-TEST42\test.bin" test file).
It is important to note that all of the above posts deal only with the case where the AS SSD "Access Time" option alone is selected. The other AS SSD test run options (e.g., 4K) were not addressed/discussed within these posts (since at the time Ao1 was only concerned with the "Access Time" option, if I recall correctly).
In any case, my above posts did not make any mention of the span (range) of I/O accesses performed by AS-SSD. The I/O operation trace that I captured using hIOmon did include the addresses accessed, but I did not do any analysis of these addresses at that time.
-
CDM 2.2 - 4K Test Only Test 5 test size 50MB
1st - 1MB Writes
Sectors - 2790048 to 85187656
45 1MB xfers. Total write =45 MB
2nd - 4K Reads
Sectors 2788704 - 52015696 - (Sector order totally random, but read sectors were read again - completely randomly around 12 times).
168,095 4K xfers. Total read = 656 MB
3rd - 4K Write
Sectors 2788704 - 85189760 - (Sector order totally random, but written sectors were written to again - completely randomly - around 3 or 4 times).
19,615 4K xfers. Total write = 76 MB
Again only a rough outline. Xfers outside of 1MB and 4K were removed. It was not that much data however.
It seems that the sectors are also preconditioned, but with 1MB writes, although the sectors are completely random and over a much wider span. At a guess, the preconditioned sectors were the locations of the 4K reads and writes.
Either way, it is quite different to AS SSD.
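For reference, this is roughly the kind of post-processing involved, sketched in Python. The CSV column names ("StartSector", "LengthBytes") are my own assumption and would need adjusting to whatever the monitoring tool actually exports:
Code:
# Count transfers per size and work out the sector span touched, from an
# exported trace of (start sector, transfer length) records.
import csv
from collections import Counter

SECTOR_BYTES = 512

def summarise(trace_csv):
    sizes = Counter()
    lo, hi = None, 0
    with open(trace_csv, newline="") as f:
        for row in csv.DictReader(f):
            sector = int(row["StartSector"])       # assumed column name
            length = int(row["LengthBytes"])       # assumed column name
            sizes[length] += 1
            lo = sector if lo is None else min(lo, sector)
            hi = max(hi, sector + length // SECTOR_BYTES)
    span_gib = (hi - lo) * SECTOR_BYTES / 2**30
    for length, count in sizes.most_common():
        print(f"{count:>8} xfers of {length / 1024:g} KiB "
              f"= {count * length / 2**20:.1f} MiB total")
    print(f"Sector span touched: {lo} to {hi} (~{span_gib:.1f} GiB)")

# summarise("cdm_4k_trace.csv")   # hypothetical export file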
-
Quote:
Originally Posted by
Marc HFR
For me, 2049 x 0.5 MB xfers = 1024.5 MB :confused:
:mad: Yep sorry. I re-ran it to make sure.
-
Quote:
Originally Posted by
Ao1
2nd - 4K Reads
Sectors 2788704 - 52015696 - (Sector order totally random, but read sectors were read again - completely randomly around 12 times).
168,095 4K xfers. Total read = 656 MB
Great data, thanks for doing that test!
52015696 - 2788704 = 49,226,992 sector span = 23.5 GB (assuming 512B sectors).
So, AS-SSD has a 1GB span, and CDM has a 23.5GB span, and yet CDM measures a HIGHER 4KB QD=1 random read value. Hmmmmm.
I do not understand your comment above about "read sectors were read again". Do you mean that of the 656 MB, only 656 / 12 MB is unique sectors, and each sector is read 12 times? Or something else?
-
For example (with CDM)
Sector 2788984 was read 8 times and written to twice, but on a completely random basis.
Sector 52005656 was read 18 times and written to once, again randomly.
Sector 33933672 was read 16 times with no writes, again randomly. Edit - no 4k writes. I did not check to see if the sector had been written to with the 1MB writes.
Edit: Also the sectors were banded. For example:
85181704 to 85189760
51198048 to 59935488
-
Quote:
Originally Posted by
Ao1
CDM 2.2 - 4K Test Only Test 5 test size 50MB
1st - 1MB Writes
Sectors - 2790048 to 85187656
45 1MB xfers. Total write =45 MB
2nd - 4K Reads
Sectors 2788704 - 52015696 - (Sector order totally random, but read sectors were read again - completely randomly around 12 times).
168,095 4K xfers. Total read = 656 MB
3rd - 4K Write
Sectors 2788704 - 85189760 - (Sector order totally random, but written sectors were written to again - completely randomly - around 3 or 4 times).
19,615 4K xfers. Total write = 76 MB
Again only a rough outline. Xfers outside of 1MB and 4K were removed. It was not that much data however.
It seems that the sectors are also preconditioned, but with 1MB writes, although the sectors are completely random and over a much wider span. At a guess, the preconditioned sectors were the locations of the 4K reads and writes.
Either way, it is quite different to AS SSD.
This data seems very strange. CDM should first create a test file with 1 MB writes, then read and write within the LBA range of this file, shouldn't it?
-
1 Attachment(s)
Here is a screen shot of the 1MB xfers as they occurred. I believe these sectors were then used for the 4k reads/ writes.
-
Quote:
Originally Posted by
Marc HFR
This data seems very strange. CDM should first create a test file with 1 MB writes, then read and write within the LBA range of this file, shouldn't it?
That is what I would expect. Since CDM gives you the option of 0-fill or random data, I would expect that it would only read back the data that it wrote, otherwise, it would not be reading the type of data that the user requested.
-
Quote:
Originally Posted by
Ao1
Here is a screen shot of the 1MB xfers as they occurred. I believe these sectors were then used for the 4k reads/ writes.
What were your CDM settings? File size, number of repetitions?
-
Quote:
I just can't understand how Anandtech comes up with an average QD of 6.09 for the typical workload, 3.59 for the heavy load and 7.76 for the gaming workload.
Replicating weeks of I/O activity and applying it at those QDs (and over a short duration) completely invalidates the results for me in the context of demonstrating a tangible real-life performance benefit. I rarely see max QDs at those levels, and the average is always 1 or a fraction above 1 for reads. Average write QDs are a bit higher: anything from 1 to 3.
Ao1, this is a very helpful nugget. I do believe this is how he is doing it, and I have been wondering about that for a while, scratching my head... how in the world did he get these high QDs?
This makes sense and, if true, would invalidate all of his testing data as it pertains to these traces he is running.
-
Regarding the AnandTech load:
It's 2 weeks of usage.
How many hours? Don't know - perhaps 140?
The fastest SSD runs the benchmark in 762 seconds, the slowest in 1671 seconds.
So the accesses requested by the user in ONE hour take about 762/140 = 5.44 seconds on the fastest SSD and 1671/140 = 11.9 seconds on the slowest.
Is this difference noticeable in practice? Not sure.
-
Quote:
Originally Posted by
johnw
What were your CDM settings? File size, number of repetitions?
Test 5, test size 50MB. Also please note it was version 2.2, so it did not have the zero-fill option. I should stress again that this was a quick and rough summary. I removed any xfers that were not default xfer sizes; I'd guess that is why CDM is coming out with 45MB when it was likely 50MB. Some of the 4K transfers might also have been OS related.
Regardless, though, it shows a very clear difference in the way 4K is benchmarked. With AS SSD a significant amount of writing occurs over a short span before reads occur over the same span. With CDM the reads are a lot more random, over a wider span, and to sectors that had not been subject to intense previous writes.
I'd guess that is why AS SSD is quite consistent even on a fresh-state SSD.
Why are reads faster with CDM? It must be something to do with the extent of writes before the reads, which seems a bit strange, but what else could it be? :shrug:
On a steady-state SSD I'd say CDM results are closer to real life than AS SSD.
-
Quote:
Originally Posted by
Marc HFR
Regarding the AnandTech load:
It's 2 weeks of usage.
How many hours? Don't know - perhaps 140?
The fastest SSD runs the benchmark in 762 seconds, the slowest in 1671 seconds.
So the accesses requested by the user in ONE hour take about 762/140 = 5.44 seconds on the fastest SSD and 1671/140 = 11.9 seconds on the slowest.
Is this difference noticeable in practice? Not sure.
Looking at the Anandtech test description for the heavy load, they state "Average queue depth is 4.625 IOs, with 59% of operations taking place in an IO queue of 1". If 59% are at QD1, the remaining 41% must be set very high to get an overall average of 4.625. EDIT: an average of roughly QD10 for the other 41%? (0.59 x 1 + 0.41 x 9.8 ≈ 4.6)
They also state that the benchmark writes 106.32GB, which represents nearly two weeks of constant use. If I use the Intel Toolbox I get the stats below.
SSD-1
Power on hours = 951
Host writes = 1.13TB (1,184,890.88 MB)
Writes per hour = 1,245.94 MB.
Writes per 10 hour day = 12.16 GB
Two weeks (10 days) = ~ 120GB
SSD-2
Power on hours = 3,216
Host writes = 3.64TB(3,816,816.64 MB)
Writes per hour = 1,186.82 MB.
Writes per 10 hour day = 11.59 GB
Two weeks (10 days) = ~120GB
So the amount of writes seems to correlate with what I have experienced. In the Vertex 3 120 review, Anandtech shows the X25-M 160 (my drive) being able to process two weeks' worth of writes in 1068.6 seconds (17.8 minutes). Not bad :) It would be nice if I could squeeze 2 weeks of work into 17 minutes ;)
I don't think it tells you anything you would really notice over the fastest device, which did it in 412.9 seconds (6.88 minutes). In addition, writing that much over a short duration may impact wear levelling, TRIM, etc. in ways that would not occur if it were applied over two weeks.
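The arithmetic above as a small Python sketch, using the numbers quoted from the Intel Toolbox (the "10-hour day" is the same assumption as in the post):
Code:
# Host writes and power-on hours from SMART, turned into writes per hour and
# per 10-hour day; numbers are the ones quoted above.
def write_rate(host_writes_mb, power_on_hours, day_hours=10):
    per_hour_mb = host_writes_mb / power_on_hours
    per_day_gb = per_hour_mb * day_hours / 1024
    return per_hour_mb, per_day_gb

for name, writes_mb, hours in [("SSD-1", 1_184_890.88, 951),
                               ("SSD-2", 3_816_816.64, 3216)]:
    mb_h, gb_day = write_rate(writes_mb, hours)
    print(f"{name}: {mb_h:,.0f} MB/hour, {gb_day:.2f} GB per 10-hour day, "
          f"~{gb_day * 10:.0f} GB per 'two weeks' (10 working days)")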
-
Get rid of that old CDM and get 3.x.
I've spent some time with the source code for CDM and it looks "clean".
The test file is prepared/created using 1MB writes; the same file is used over and over again during the test.
-
The 4K Test 5, test size 50MB benchmark structure does not seem to differ between CDM 3.0.1 x64 & CDM 2.2. Perhaps that changes with larger test sizes and higher test counts, but for Test 5, 50MB, a cursory look suggests there is no difference between versions.
-
I don't know if that changed from 2.2 to 3.x, but why use old software :p:
Asynchronous IO (non-blocking) is used for the QD >1 tests.
-
Quote:
Originally Posted by
Anvil
I don't know if that changed from 2.2 to 3.x, but why use old software :p:
Version 3 has Adware:Win32/OpenCandy.
A. CrystalDiskMark - The modified BSD license
This installer uses the OpenCandy network to recommend other software you may find valuable during the installation of this software. OpenCandy collects NON-personally identifiable information about this installation and the recommendation process. Collection of this information ONLY occurs during this installation and the recommendation process; in accordance with OpenCandy's Privacy Policy, available at www.opencandy.com/privacy-policy
Version 2 does not ;)
-
Quote:
Originally Posted by
Vapor
IIRC, Anand's 2011 real-world tests are a 'replay' of 1-2 weeks' worth of normal use from him. The difference in disk-busy time between the top disks is a matter of seconds... spread out over the course of days in the real world. So yeah, if Anand's 2011 tests are fully indicative of real-world use, we have gotten to the point where SSDs are so fast that they're just not noticeably different any longer (in real usage).
We've definitely gotten there; the big turning point was (IMO) the X25-M 80GB and the introduction of garbage collection.
Of course improvements aren't limited to Joe the average user; some might already be using these drives in the workplace / small business, and there brand X vs brand Y will make a difference. In that case all the drives have a place, with their trade-offs.
The enterprise probably wonders why TMS RamSans are so cheap and whether they should have another on hand just in case.
-
Quote:
Originally Posted by
Ao1
Test 5 test size 50MB. Also please note it was version 2.2. so it did not have the zero fill option. I should stress again that this was a quick and rough summary. I removed any xfers that were not default xfer sizes. I'd guess that is why CDM is coming out with 45MB, when it was likely to be 50MB. Some of the 4k transfers might also have been OS related.
If you get a chance to try it again, please try it with 1 repetition, and 1GB file size. I guess that would give results more similar to AS-SSD, and also may make more sense.
-
Again, a rough and ready summary. Quite a bit different though, in that the 4K reads & writes were not redone on the same sectors (with a few exceptions). The read/write balance is also different.
CDM 2.2 Test #1 1,000MB
1st - 1MB Writes
Sectors - 2300344 to 100088224
980 1MB xfers. Total write =980 MB
2nd - 4K Reads
Sectors 2298376 to 100090256
35,205 4K xfers. Total read = 137 MB
3rd - 4K Write
Sectors 2298360 to 100090240.
77,349 4K xfers. Total write = 302 MB
-
Quote:
Originally Posted by
Ao1
Again, a rough and ready summary. Quite a bit different though, in that the 4K reads & writes were not redone on the same sectors (with a few exceptions). The read/write balance is also different.
CDM 2.2 Test #1 1,000MB
1st - 1MB Writes
Sectors - 2300344 to 100088224
980 1MB xfers. Total write =980 MB
2nd - 4K Reads
Sectors 2298376 to 100090256
35,205 4K xfers. Total read = 137 MB
3rd - 4K Write
Sectors 2298360 to 100090240.
77,349 4K xfers. Total write = 302 MB
That is still weird, though. 4K Reads spans 97,791,880 sectors which comes to 46.6GB. I wonder if some of those reads are not really part of the CDM drive test.
-
There is that chance, but I tried to pick off the 4K reads/writes that were clearly not part of the test and in completely different sectors. Still, there could be inclusions I could not see - it's a lot of data to sift through. On the other hand, I've run it a few times now and it comes out around the same every time.
Anyway, it seems that by increasing the test count you increase the number of times that the same sectors are re-read/re-written. With Test #9 - 50MB I get:
192,418 4k read xfers.
36,648 4k write xfers.
The test file remains at ~50MB with 1MB write xfers.
-
1 Attachment(s)
I've used another tool called Process Monitor to capture CDM/AS SSD. (I used Disk Monitor on the other runs.)
The AS SSD file is 20MB. The CDM file is only 2.68MB. (1 pass, 1,000MB test)
I also ran WinSAT and monitored it with Disk Monitor. (Background OS xfers included; 1.6MB file size.)
Files can be found here:
http://cid-541c442f789bf84e.office.l...x/.Public?uc=1
Top 5 read/ write xfers for WINSAT below. Max xfer size 1MB.
-
1 Attachment(s)
Could do with a bit of help to work out what is happening here. :D
If I open the 20MB AS SSD xlsx file I linked earlier, it takes 15 seconds to open. The file contains 526,446 entries. Process Monitor shows approx 1,284 x 32kB read xfers against Excel.exe, which comes out to ~40MB - double the file size.
Looking at the I/O log there is a recurring pattern of 32kB IRP - Success followed by 32kB FASTIO Read - Disallowed. The same thing happens when opening a really small xlsx file.
There are around 1,284 x 32kB xfers = 40.125 MB, which occurred over 11 of the 15 seconds it took to open the file. That averages out at ~3.64MB/s
(around 1/10th of the available speed). RAM & CPU do not seem to be maxing out.
So... why are reads so slow? And why are the reads duplicated between IRP & FASTIO Disallowed?
According to this "overhead associated with creating an I/O Request Packet (IRP) can dominate the cost of the entire operation – slowing the system down in performance critical areas. Because of this, the NT team introduced the concept of fast I/O. This approach to I/O is used by some file system drivers, such as NTFS, HPFS, FAT, and CDFS as well as the AFD transport driver which is used by WinSock.
The rationale for providing Fast I/O seems to be one of convenience – many I/O operations are repeatedly performed against the same data. For example, like most modern operating systems, Windows NT integrates file system caching with the virtual memory system – using the system memory as a giant cache for file system information. Such systems are extremely efficient and boost both the actual and perceived performance of the system."
-
A typical reason why fast I/O would be disallowed is that a filter on the system is not allowing it to proceed, as it interferes with the filter's requirements (for example, I disable it in my filter for network file encryption, since it would not work otherwise).
Run "FLTMC" in a command prompt to check if this is the case. On Windows Vista/7 there are two default ones: luafs (the virtualization driver) and FileInfo. Anything else (unless it's an antivirus) could be the filter that is disallowing fast I/O on your system.
Applications cannot initiate fast I/O, only drivers can, but non-cached requests from applications will often be attempted as fast I/O first (and would work unless a filter disables it for some reason, since the file system would complete it by doing a cache manager read/write if needed).
P.S. If you see fast I/O with any benchmark application, ditch it forever. If it is allowing use of the Windows cache, it is skewing the results.
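A small sketch of that check (Python wrapping the real "fltmc filters" command on Windows, run from an elevated prompt; the output parsing and the 'stock filter' list are my own assumptions): anything beyond the default minifilters is a candidate for disallowing fast I/O.
Code:
import subprocess

# Stock Vista/7 filters mentioned above (the virtualization driver may show up
# as "luafv" on some systems).
STOCK_FILTERS = {"luafs", "luafv", "fileinfo"}

def suspicious_filters():
    out = subprocess.run(["fltmc", "filters"],
                         capture_output=True, text=True).stdout
    # skip the header and the dashed separator line, keep the first column
    names = [line.split()[0] for line in out.splitlines()[2:] if line.strip()]
    return [n for n in names if n.lower() not in STOCK_FILTERS]

print(suspicious_filters())   # e.g. an antivirus filter such as MpFilter would show here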
-
1 Attachment(s)
Hi alfaunits,
Thanks for the reply. Here is what I get from CMD. I'd guess MpFilter is MS Security Essentials, although I currently have real-time protection disabled.
I haven't seen fast I/O in a benchmark yet. If I used a benchmark to run 1,284 x 32kB reads in sequence it would come out at around 30MB/s, yet in a real-life application I only get 3.64MB/s. :shrug: Is it my system, or does an Excel file that large normally take so long to open?
I'll run it with hIOmon later to get a better look at what is happening, but in the meantime, any ideas on what is slowing it down so much?
I've placed the log file here:
http://cid-541c442f789bf84e.office.l...x/.Public?uc=1
Lots of interesting entries.
-
With ATTO, 1,284 x 32kB read xfers are done in 0.2 seconds. Reading the same amount of data in the Excel file-opening example took 11 seconds. 40MB in 0.2 seconds seems too fast. :shrug:
Each entry is IRP_MJ_READ - Success. There are no FASTIO Disallowed entries intermixed with the IRP_MJ_READ entries.
-
40MB in 0.2 seconds = 200MB/s - looks low to me :D
Don't forget Excel is not only reading the file it is also CPU processing it and GPU rendering it (or CPU rendering it).
The overhead of IRP vs. fast I/O is not something that would cause you to notice it on 32KB transfers.
In ProcMon you can right-click on the columns and select to also view "Duration". It is in milliseconds I believe.
-
3 Attachment(s)
Maybe with 4x X25-E :p:
Here is a look at what happens with hIOmon when I open that Excel file, at the device, volume & logical disk level.
@ device level:
- 99.5% Fast IOPS Read.
- Busy time 411.35ms.
- Max Read IOPS.....1.6
Seems like I need to upgrade my CPU.
-
Quote:
Originally Posted by
Computurd
I personally find it very strange that the M4 beats the Vertex 3 in all of the 'old' Anandtech profiles, so then he creates new profiles, in which the V3 wins. Strange, that.
I believe this is because Anand's 2010 bench uses 3Gbps SATA, and as he points out earlier in the review, the M4's performance is only marginally affected (just a 7% difference, if I remember correctly) by whether it's on 3 or 6Gbps SATA, while the Vertex 3 is significantly affected. It seems the M4 is more "bandwidth efficient". I'm guessing this is because it tries to get more of its overall performance from higher randoms as opposed to super-high sequentials.
I'd imagine that if he redid the 2010 bench on a 6Gbps controller, things would look more in line with the 2011 bench, at least the 2011 light load.