Results 201 to 225 of 376

Thread: hIOmon SSD Performance Monitor - Understanding desktop usage patterns.

  1. #201
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    One_Hertz's thread on the FusionIO got me thinking about file transfer speeds.

    Here I experiment with copying a 712,394KB file from the C drive to the E drive (both X25-M 160GB drives).

    In the first image I use Windows Explorer to copy the file from the C drive to the E drive, and then I use TeraCopy.

    In the second image I copy the same file from the E drive to the E drive, first using TeraCopy and then with Explorer.

    I’m not 100% sure how to interpret the results, but it seems a file copy operation using Windows goes through the cache. I’ve seen, or more accurately heard, that the transfer speed Windows reports/displays is the cache speed, as a hard drive can be heard clanking away long after Windows reports the file transfer as being complete.
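    A quick way to check this yourself is to time when writes return (data in the system cache) versus when an explicit flush completes (data on the device). A rough Python sketch, not how Explorer measures anything; the paths and 1MB chunk size are arbitrary assumptions:

    Code:
import os, time

def copy_with_timings(src, dst, chunk=1024 * 1024):
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            fout.write(buf)                   # returns once data is in the cache
        cached = time.perf_counter() - start  # "Windows-reported" style speed
        fout.flush()
        os.fsync(fout.fileno())               # wait until the device has the data
    flushed = time.perf_counter() - start
    size_mb = os.path.getsize(src) / 1e6
    print(f"write() returned: {size_mb / cached:6.1f} MB/s (cache)")
    print(f"after fsync:      {size_mb / flushed:6.1f} MB/s (device)")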
    Attached Thumbnails: LR comp.jpg, copy comp.png
    Last edited by Ao1; 12-23-2010 at 09:42 AM.

  2. #202
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Here I look at the write performance from C drive to E drive.

    hIOmon and TeraCopy report more or less exactly the same: write speeds of ~105MB/s.

    Windows shows a transfer speed of 222MB/s, which seems to reflect the cache read speed, not the write speed.

    EDIT: Looking a little more closely, Windows reports 1 item remaining with a file size of 29.6MB. That is exactly the Write Xfer Max size that hIOmon reports, so it seems that when you copy a file with Windows it is split into smaller chunks during the copy process. When using TeraCopy the file transfer is made in one sequential Xfer.
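    The chunking difference is easy to mimic. The hedged sketch below copies the same file twice, once in ~30MB pieces (Explorer-like, per the 29.6MB Write Xfer Max above) and once as a single huge read/write (TeraCopy-like); the file names and sizes are placeholders:

    Code:
import time

def copy_chunked(src, dst, chunk_bytes):
    # Fixed-size chunks; Explorer appears to use ~29.6MB pieces here,
    # while TeraCopy issues one long sequential transfer.
    t0 = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk_bytes)
            if not buf:
                break
            fout.write(buf)
    return time.perf_counter() - t0

explorer_like = copy_chunked("TEST file.mpg", "copy1.mpg", 30 * 1024 * 1024)
one_shot = copy_chunked("TEST file.mpg", "copy2.mpg", 1 << 40)  # whole file in one pass (memory-hungry)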
    Attached Thumbnails: A.png, Write copy speed windows.png
    Last edited by Ao1; 12-23-2010 at 10:15 AM.

  3. #203
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    This article helps explain the copy process in Windows:

    File Cache Performance and Tuning

    http://technet.microsoft.com/en-us/l.../bb742613.aspx

    1 = The Copy interface copies data from the system cache into an application file buffer
    2 = A page fault occurs when the Cache Manager tries to access data not in cache memory
    3 = Dirty pages in the system cache are written to disk by lazy-write system worker threads (physical disk updates)

    The CacheSet utility – a tool that provides a simple control for setting the minimum and maximum system file cache working set size:

    http://technet.microsoft.com/en-gb/s...rnals/bb897561
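    For completeness, Windows also exposes a documented Win32 call for the same control. A minimal ctypes sketch, assuming an elevated process that holds SeIncreaseQuotaPrivilege (CacheSet offers the same knob through a GUI):

    Code:
import ctypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

def set_file_cache_limits(min_bytes, max_bytes):
    # Wraps the documented Win32 SetSystemFileCacheSize() call.
    # Requires elevation plus SeIncreaseQuotaPrivilege; flags=0 sets
    # plain (soft) working-set limits.
    ok = kernel32.SetSystemFileCacheSize(
        ctypes.c_size_t(min_bytes), ctypes.c_size_t(max_bytes), 0
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())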
    Attached Thumbnails: 1.png, 2.png, 3.png
    Last edited by Ao1; 12-25-2010 at 02:25 AM.

  4. #204
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    There was probably a much easier way of doing this, but anyway. In the first shot I monitored what occurred during a 4K AS SSD Benchmark run (I disabled the other tests from running). The theory being that this would show the max IOPS capability with 4K reads and writes.

    The maximum read IOPS was 5,037 and the maximum write IOPS was 9,872.

    Next I monitored a couple of hours' general usage. I ran an SP & MP game and most of the programs I typically use.

    The maximum read IOPS was 1,788 and the maximum write IOPS was 236.

    Next I updated the AV database and ran a full AV scan, without resetting the general-use metrics.

    The maximum read IOPS was 3,484 and the maximum write IOPS was 375.

    It seems that even with an average queue depth of one I can't get close to using the available read IOPS capability of my SSD. As for write IOPS….
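    For anyone who wants to derive this kind of per-second "max IOPS" figure from raw completion timestamps, here is a hedged sketch of the calculation (hIOmon's own implementation may well differ):

    Code:
from collections import deque

def max_iops(timestamps):
    # Maximum number of completions falling inside any sliding 1-second
    # window; timestamps are completion times in seconds, any order.
    window, best = deque(), 0
    for t in sorted(timestamps):
        window.append(t)
        while t - window[0] > 1.0:
            window.popleft()
        best = max(best, len(window))
    return best

# 5,000 completions packed into half a second still report 5,000 here,
# even if the device could sustain 10,000 over a fully busy second.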
    Attached Thumbnails: IOP.png
    Last edited by Ao1; 12-27-2010 at 02:11 PM.

  5. #205
    SLC
    Join Date
    Oct 2004
    Location
    Ottawa, Canada
    Posts
    2,795
    Quote Originally Posted by Ao1 View Post
    It seems that even with an average queue depth of one I can't get close to using the available read IOPS capability of my SSD. As for write IOPS….
    Keep in mind these measurements are on a per-second basis. I.e. I/Os served at 5k IOPS for 0.5s (perhaps that was all that was required) and then no I/Os in the next 0.5s would show as 2,500 IOPS total. This does not necessarily mean that your SSD is doing the best that can possibly be done. Not that I disagree; it is just that your particular test does not prove it.

  6. #206
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Quote Originally Posted by One_Hertz View Post
    Keep in mind these measurements are on a per-second basis. I.e. I/Os served at 5k IOPS for 0.5s (perhaps that was all that was required) and then no I/Os in the next 0.5s would show as 2,500 IOPS total. This does not necessarily mean that your SSD is doing the best that can possibly be done. Not that I disagree; it is just that your particular test does not prove it.
    Good point. I already know (from monitoring the percentage FastIOPcounts) that most I/O operations complete in less than one millisecond. The mistake with SSDs is to think in seconds. It's the same with MB/s: with the X25-M it's possible to see 250MB/s for a single write I/O operation when it completes in well under a second. Here is where it would be interesting to see how other SSDs perform.

    What I'm trying to establish above is the IOPS utilisation as a percentage of what is available.

  7. #207
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Norway
    Posts
    2,838
    I think hIOmon is as good as it gets, at least for now.
    I've seen the same "flaw", where reported throughput is greater than the SSD can possibly deliver; not a big issue IMHO, but it is there.

    Wrt copy operations, I've tried both TeraCopy and Total Commander.
    Total Commander (bigfile + a decent buffer) is the one to use; it outperforms TeraCopy in most situations.

  8. #208
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Hi Anvil, what is the highest file copy speed you have been able to achieve? (EDIT: and how did it compare to the theoretical speed?)
    Last edited by Ao1; 12-27-2010 at 03:40 PM.

  9. #209
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Norway
    Posts
    2,838
    I haven't really tried pushing the envelope; most copy operations are 250-350MB/s (from SSD to HDD, where I do my daily "backups").

    As a test case I was thinking of trying 4-5 SF drives as the destination and 4 Intels as the source; if that doesn't break the 500MB/s "limit", nothing will.
    (Using a large, easily compressible file, the SF array should in theory be able to write close to 1GB/s.)

  10. #210
    SLC
    Join Date
    Oct 2004
    Location
    Ottawa, Canada
    Posts
    2,795
    Just reiterating what I found:

    ICH X25-E array can read at 640MB/s in iometer
    IOdrive can write at 582MB/s in iometer

    Using Total Commander I can copy from ICH to IOdrive at 470MB/s

  11. #211
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Norway
    Posts
    2,838
    Is that using 4 E's? And what about the other way around (from the IOdrive to the ICH array)?

    I was thinking of doing a test from raid-controller to raid-controller but I'll throw in the ICH10R as well (as a source "drive").

  12. #212
    Xtreme Member
    Join Date
    May 2010
    Posts
    112
    Regarding several of the "maximum" I/O operation performance metrics collected by the hIOmon software, here are a couple of "nuances" that folks might want to consider.

    The "maximum" IOPS metric reflects the actual number of I/O operations (e.g., read I/O operations obviously in the case of the "max read IOPS" metric) observed by the hIOmon software during a one-second interval since the start of the time duration when the hIOmon software began collecting I/O operation performance data for the respective file, device, or process.

    For example, if the hIOmon software saw 5000 read I/O operations performed during a one-second interval (and this was the maximum number of read I/O operations observed so far within a one-second interval), then the reported value would be 5000.

    As One_Hertz suggested above, it could be that all 5000 of these read I/O operations were performed during the "first" half-second of the one-second interval. One could subsequently argue that the "potential" maximum rate is really 10000 (since if the device did 5000 in 1/2 second, then it "should" be able to do 10000 within a full second).

    In any case, the reported maximum IOPS values reflect that which was actually observed by the hIOmon software based upon actual device usage - and not necessarily the maximum that the device (for example) can perform.

    Likewise, the maximum MB/s metric (i.e., the maximum amount of data transferred during a one-second interval) and the "maximum MB/s for a single I/O operation" metric are both based upon actual observations of the hIOmon software during the course of monitored file/device usage. [EDIT] A short description of each of these throughput metrics (and an important distinction between them) can be found at this prior post:

    http://www.xtremesystems.org/forums/...3&postcount=64
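    To make the distinction between the two throughput metrics concrete, here is a rough sketch of the two calculations (illustrative only, not hIOmon's internals):

    Code:
def interval_mbps(bytes_in_interval, interval_s=1.0):
    # "Max MB/s" style: data transferred within a one-second interval.
    return bytes_in_interval / 1e6 / interval_s

def single_op_mbps(xfer_bytes, start_s, complete_s):
    # "Max MB/s for a single I/O operation": size over its own duration.
    return xfer_bytes / 1e6 / (complete_s - start_s)

# A single 29.6MB transfer completing in 0.12s is ~247MB/s for that one
# operation, even if the surrounding one-second interval moved far less data.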

    One other quick note. The maximum metrics described above can sometimes appear to exceed the capability of the device.

    Take, as an example, a read I/O operation whose data transfer can be entirely satisfied by the OS without requiring a data transfer directly with the device. In this case, the data transfer is basically a memory transfer operation performed "internally" by the OS without interaction with the device (and so consequently not subject, for instance, to the bandwidth limitations of the actual device interface).

    And as a result, the performance characteristics of the read I/O operation (as observed by the hIOmon software based upon the starting time and completion time of the I/O operation along with the amount of data transferred) can appear "exaggerated" in light of the performance expectations/limitations of the corresponding device.
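    A simple way to observe such a cache-satisfied read for yourself (assuming the file was not already cached before the first pass; the file name is a placeholder):

    Code:
import time

def timed_read_mbps(path):
    t0 = time.perf_counter()
    with open(path, "rb") as f:
        n = len(f.read())
    return n / 1e6 / (time.perf_counter() - t0)

cold = timed_read_mbps("big_test_file.bin")  # may involve the device
warm = timed_read_mbps("big_test_file.bin")  # likely served from the file cache
print(f"first pass: {cold:.0f} MB/s, second pass: {warm:.0f} MB/s")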
    Last edited by overthere; 12-28-2010 at 01:49 PM. Reason: Add pointer to throughput metric descriptions

  13. #213
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Norway
    Posts
    2,838
    Quote Originally Posted by overthere View Post
    ...
    In any case, the reported maximum IOPS values reflect that which was actually observed by the hIOmon software based upon actual device usage - and not necessarily the maximum that the device (for example) can perform.
    ...
    This is why I like the hIOmon software, it reports what is observed.

    If I want the maximum IOPS I'd run Iometer or SQLIO for e.g. 10 seconds; that would give me the maximum IOPS.
    For monitoring tasks that aren't synthetic, one needs to pair several metrics in order to try to "fully" understand what took place.

    As for the observations that may be "polluted" by cache: well, if the cache works, it works to your benefit; the cache is part of the system and it doesn't invalidate the result.
    Last edited by Anvil; 12-28-2010 at 04:48 AM.

  14. #214
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    The effect of the system file cache was something I had not really understood, but as I re-read the posts I am beginning to understand it better.

    A block diagram showing a data flow path at each level that hIOmon can monitor would be really helpful to provide a visual concept of what occurs where and why it is important.

    Anvil…I’m looking forward to seeing your copy speeds.

  15. #215
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Norway
    Posts
    2,838
    I haven't installed hIOmon yet, so this is without any monitoring.

    Copy 5GB file (iometer test file) using Total Commander (bigfile 8MB buffer)

    Target : 4R0 Vertex 2 60GB on the 9260

    Source : 3R0 Vertex 2 100GB on the ICH
    ICH_3R0_VERTEX2_TO_9260_4R0_V260GB_607MBS.png

    Source : 2R0 C300 64GB on the ICH
    ICH_2R0_C300_TO_9260_4R0_V260GB_480MBS.png

    I'd say the 3R0 Vertex 2 array is about as fast as it gets using the ICH as the source.

    I'll install hIOmon before I continue...

    edit:
    Rerun of the 3R0 to 4R0 copy operation; I didn't capture the transfer speed this time but it was more or less exactly the same. (607.nn)

    source_3r0_ICH_V2.PNG

    target_4r0_V2_9260.PNG
    Last edited by Anvil; 12-28-2010 at 12:12 PM.

  16. #216
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Norway
    Posts
    2,838
    Now, the other way around

    Source : 4R0 Vertex 2 60GB on the LSI 9260
    Target : 3R0 Vertex 2/LE 100GB on the ICH

    9260_4R0_V2_60GB_TO_ICH_3R0_V2.png

    hIOmon summaries....

    source_4R0_9260.PNG

    target_3R0_V2_ICH.PNG
    Last edited by Anvil; 12-28-2010 at 12:38 PM.

  17. #217
    Xtreme Member
    Join Date
    May 2010
    Posts
    112
    Quote Originally Posted by Anvil View Post
    As for the observations that may be "polluted" by cache, well, if the cache works it works for your benefit, cache is part of the system and it doesn't invalidate the result.
    The system file cache is certainly an integral part of the overall system. Unfortunately it seems that its performance impact - and implications - can be neglected and/or misunderstood when considering an application's I/O activity.

    And then there is that old (database) adage: "The best I/O is no I/O"

  18. #218
    Xtreme Member
    Join Date
    May 2010
    Posts
    112
    Quote Originally Posted by Ao1 View Post
    A block diagram showing a data flow path at each level that hIOmon can monitor would be really helpful to provide a visual concept of what occurs where and why it is important.
    Perhaps the figure at the bottom of the following web page is along the lines that you have suggested:

    http://www.hyperIO.com/hIOmon/hIOmonArch.htm

    Following from this figure, it might also be worth pointing out the ability of a single software tool to:

    1. Observe I/O operations in action at three different critical points within the OS I/O stack: the file system level (essentially the application/process level), the physical volume level, and the physical device level.

      This can help provide for a more complete, overall picture of what actually occurs during I/O operation processing.

    2. Selectively observe I/O operations at one or more of these three levels.

      This can help identify those particular I/O operations that are, for instance, germane to only a single level, e.g., I/O operations that are satisfied directly by the use of the system file cache and thus are effectively limited to the file system level only, I/O operations that are issued by applications/programs directly to a device at the physical device level and without direct interaction with the file system level, etc.

    3. Optionally observe I/O operations concurrently at two or more levels, and moreover correlating I/O operations as they traverse the different levels.

      This can help identify, for example, the nature and extent to which a particular application's file I/O operation activity actually incurs I/O operations at the device itself.

  19. #219
    Xtreme Member
    Join Date
    May 2010
    Posts
    112
    I thought that it might be helpful to provide several brief comments about the summary I/O operation performance metrics shown within Ao1's prior post #201, where he compared file transfer speeds between TeraCopy and Windows Explorer.

    Basically, the overall summary metrics displayed by the hIOmon Presentation Client (collected upon a periodic basis for each of the respective files) reflect the combined totals for all opened "instances" of the file. The displayed tabular metrics (presumably from the CSV-formatted hIOmon Manager Export File) instead contain a separate row for each opened instance of the respective file, which is why there are sometimes multiple entries for the same file within the displayed tables; each entry for the same file basically represents a separate "instance" of the file.

    So what's an "instance" of a file? The hIOmon software can optionally collect and maintain summary I/O operation performance metrics upon an individual specific file, device, and/or process basis.

    In the case of files, it is possible that a given file will be concurrently accessed by, for example, more than one process (e.g., Windows Explorer, the System process, and/or an AntiVirus program).

    Normally upon "opening" the file, each such process will receive a "file handle", which essentially represents its particular open "instance" of the same file. Accordingly, the hIOmon software collects and maintains separate summary I/O operation performance metrics for each such separate "instance", with these summary metrics reflecting the monitored I/O operations directed towards the respective instance of the file. (Please note that the hIOmon software also maintains some summary metrics upon an overall file basis, i.e., regardless of the particular instance of the file).
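    As a hedged illustration of that bookkeeping (not hIOmon's actual internals), per-instance and overall per-file metrics might be accumulated like this, with the handle values purely hypothetical:

    Code:
from collections import defaultdict

per_instance = defaultdict(lambda: {"read_ops": 0, "read_bytes": 0})
per_file = defaultdict(lambda: {"read_ops": 0, "read_bytes": 0})

def record_read(path, handle, nbytes):
    # One update per open instance (path, handle), one per file overall.
    for bucket in (per_instance[(path, handle)], per_file[path]):
        bucket["read_ops"] += 1
        bucket["read_bytes"] += nbytes

record_read(r"E:\TEST file.mpg", 0x1A4, 65536)  # e.g. Explorer's handle
record_read(r"E:\TEST file.mpg", 0x2B8, 65536)  # e.g. an AV scanner's handle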

    As a second comment, Ao1 is correct about the Windows Explorer file copy operation involving the system file cache. In particular, the hIOmon Presentation Client display illustrates that the data to be copied was written to the system file cache (q.v., the write "SystemCache" data transfer amount), then transferred to the device proper (q.v., the overall amount of data written was twice the size of the "TEST file.mpg" file). This is in accordance with the information Ao1 presented in his post #203.

    As a third comment, the sum of the random I/O operation count and the sequential I/O operation count will be one less than the associated reported total I/O operation count for the respective file/device (since the first I/O operation to the file/device is considered to be neither a random nor a sequential I/O operation). In turn, the sum of the random data transferred amount and the sequential data transferred amount will be less than the total data transferred amount (i.e., the data transferred by the first I/O operation to the file/device is likewise considered to be neither random nor sequential data transferred).
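    That counting convention is easy to express in code; a sketch under that convention (the offsets and lengths are illustrative):

    Code:
def classify(ops):
    # ops: (offset, length) tuples in order. The first I/O is neither
    # random nor sequential, so randoms + sequentials == total - 1.
    labels = ["neither"] * min(len(ops), 1)
    for (prev_off, prev_len), (off, _) in zip(ops, ops[1:]):
        labels.append("sequential" if off == prev_off + prev_len else "random")
    return labels

print(classify([(0, 4096), (4096, 4096), (1_000_000, 512)]))
# ['neither', 'sequential', 'random']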

    Lastly, the hIOmon software presents its throughput metrics in terms of MB/s (i.e., megabytes-per-second, with one megabyte = 1 000 000 bytes) rather than MiB/s (i.e., mebibytes-per-second, with one mebibyte = 1 048 576 bytes).
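    The distinction matters when comparing against tools that report MiB/s; converting some figures quoted in this thread:

    Code:
MB, MiB = 1_000_000, 1_048_576
for mbps in (105, 222, 3219):  # MB/s figures from earlier posts
    print(f"{mbps} MB/s = {mbps * MB / MiB:,.1f} MiB/s")
# 105 -> 100.1, 222 -> 211.7, 3219 -> 3,070.0 MiB/s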

  20. #220
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Norway
    Posts
    2,838
    I've been doing a few tests using the LSI 9260 and the PERC 6/i.
    When using the 9260 as the target, copy speed is 700+MB/s, but the other way around the results vary a lot, from 400-450MB/s.

    perc_4R0_V2_100GB_TO_LSI9260_4R0_V260_744MBS.png

    The PERC is perfectly capable of reading at full speed (1GB/s) but writing is slower than expected.

    I'll swap the PERC for an LSI 9211 using SW raid.

  21. #221
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Norway
    Posts
    2,838
    While preparing for the copy tests I've made a few checks using ATTO for sequential bottlenecks.

    Here's the 9260 array

    atto_9260_2GB.PNG

    atto_9260_2GB_DR6.PNG

    ...

    I'll upload the 9211 SW test a little later.

  22. #222
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Hi Overthere. You have the patience of a saint. I can see that some of the items I'm now looking at were covered in much earlier posts, but I did not twig on to what you were saying; it went right over my head.

    Let me see if I can test your patience further…

    To better understand the three monitoring levels, I thought a "simple" file copy would help illustrate what happens where and when.

    I’ve noticed that if you right click to copy a file, then delete the copied file and then immediately recopy it, the transfer appears to be instantaneous.

    This scenario would therefore seem to give me a way of tracking what happens in the file system and what actually happens on the disk and when.

    To do this I selected:
    • 1 (Device and file I/O Performance Analysis).
    • E:\*
    • 1 (Enable collection of Physical Device Extended Metrics)

    Any which way it‘s going to be covered.

    E drive is static data only. The Test File is 731,082,752 bytes.

    The copy/ delete/ copy process was done as follows:
    • Copy file (right click) Test.avi from E:\Test File to E:\ (paste)
    • Move copied file to recycle bin (right click, delete)
    • Re paste file to E:\ (right click paste)

    The above was carried out within a minute, but interestingly it would appear that the reads and writes generated by the file transfer occurred on the actual device some time later; more on that below.

    I like the diagram from the TechNet article as it shows the flow of data. If I use that diagram to try and understand what is happening from the monitoring results below:

    All uncoloured entries are activities occurring between stages 1 and 5.

    Assumptions:
    • Performance in these stages is down to the OS and RAM. Nothing is occurring on the device.
    • The system cache IOP miss counts are page faults from RAM, which then require a re-read from the file system.
    • Most of the Read/Write FastIOPcounts are done in cache, and when this happens they are independent of the device speed. If I look at the results from post #48 it is likely that the 87.6% FastIOPcounts with the HDD were predominantly achieved by cache. The SSD managed 100% because its speed was such that it could satisfy non-cached read IOP counts in less than a millisecond. [Edit] So it seems activities from 1 to 5 are designed to mitigate the slow-speed impact of an HDD. I wonder if the file system driver/cache manager could be better optimised for SSDs (?)

    \Device\HarddiskVolume3 is when data actually gets read and written to the device. I assume stage 7 - Disk Driver.

    Assumptions:
    • This is when the speed of the device counts; however, when this actually occurs is typically "after the event". The speed of the device therefore only counts if it is a direct read/write that bypasses the cache manager (assuming no further load when the read/write occurs).

    E: is a summary of all activity between stages 1 and 7.

    Assumptions:
    • This is showing all activity be it in cache or on the device.

    \Device\HarddiskVolume3:\<DASD>\VMM - from previous posts this has been explained as "system flags associated with the I/O operation indicate involvement by the system Virtual Memory Manager (VMM)".

    Assumptions:
    • Entry 21 is why the re-paste appeared to be instantaneous. It was copying data from the VMM to the file system driver. The actual read/ write on the device occurred a minute later.

    I’ve attached the file, which contains hidden cells (these can be un-hidden if required) that are not shown in the image below. I only wanted to show the data relevant to what I describe above, but if I missed anything it can be un-hidden.

    Please feel free to wade into mistakes in my understanding and I will correct the post accordingly.
    Attached Thumbnails: Block.png, file copy.png
    Attached Files
    Last edited by Ao1; 01-09-2011 at 01:39 AM. Reason: Added IOP Read Column

  23. #223
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Here is a summary of the largest read transfers after a couple of hours of normal general use. The entries are arranged by size, not by when they occurred.

    Max read cache xfer speed = 3,219MB/s.

    Next up I look at the smallest read transfers.
    Attached Thumbnails: Cache read speedsLR.png

  24. #224
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Here are the smallest read transfer sizes, again arranged by transfer size. There are no physical disk reads. Looking at the rest of the log file, physical disk reads only start to occur when read transfers are 4,096 bytes or above… hmm. Now I'll check writes.
    Attached Thumbnails: Small Cache read speedsLR.png

  25. #225
    Xtreme Mentor
    Join Date
    Feb 2009
    Posts
    2,597
    Here are the writes, using the same method as for the reads. Fastest cache write speed = 2,046MB/s.

    Out of 25,288 write entries only 1,079 involved a write to the physical device.
    So it would seem that the SSD does not see anything like the number of small writes that are generated, as they are resolved in the file system without going to disk. I guess that on top of that the SSD itself can further cache small writes to reduce wear.
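    That ratio is easy to pull out of an export programmatically. A hedged sketch; the CSV file name and column names are hypothetical, not hIOmon's actual export headers:

    Code:
import csv

total = device = 0
with open("hiomon_export.csv", newline="") as fh:
    for row in csv.DictReader(fh):
        if int(row["WriteOps"]) > 0:          # hypothetical column
            total += 1
            if row["Level"] == "PhysicalDevice":  # hypothetical column
                device += 1
print(f"{device}/{total} write entries reached the device "
      f"({100 * device / total:.1f}%)")       # 1,079/25,288 = 4.3%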
    Attached Thumbnails: Large writes.png, Small cache.png
