
Thread: Not getting more than 65000 IOPS with xtreme setup

  1. #1
     stevecs (Xtreme Addict; Join Date: Jul 2006; Posts: 1,124)
    Yeah, I've run into that problem on all platforms/vendors. I've even seen various 'stupid admin' designs where they put, say, a serial card in a 64-bit/133MHz slot but then put the SAN HBA in a 32-bit/33MHz one. The board I've got my eyes on now is the Supermicro X8DAH+ as it has two Tylersburg-36D chipsets on it (the only one I've found so far with two). For I/O it's going to be a killer if they don't castrate something else. You don't really need more than 8x PCIe v1 speeds as of yet, as the cards that are available can't handle more than that anyway (IOP34x are maxed out). What you're doing is generally what is done (putting multiple cards in a system and LVM-striping across them); see the sketch below. Here at the datacenter we don't have any SSDs really deployed to any of the clients; the largest bank of drives we have is about 500-600 (not including SANs), but even that large one is not for a single DB instance and is hooked up via 4Gbit FC, so there are a lot of built-in bottlenecks (it doesn't help matters that they bought a lower-end Sun box; it should have had at least a T5440 for the SPARC line).
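
    For anyone wanting to set up the same kind of striping across two cards, this is roughly what the LVM side looks like; just a sketch, and the device names, volume group/LV names, stripe size and LV size are all placeholders rather than the actual setup discussed here:

    Code:
    # Turn the volume exported by each RAID card into an LVM physical volume
    pvcreate /dev/sdb /dev/sdc
    # Put both cards' volumes into one volume group
    vgcreate vg_bench /dev/sdb /dev/sdc
    # Create a logical volume striped across both PVs (-i 2) with an 8KiB stripe (-I 8)
    lvcreate -i 2 -I 8 -L 400G -n lv_bench vg_bench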

    |.Server/Storage System.............|.Gaming/Work System..............................|.Sundry...................|
    |.Supermicro X8DTH-6f...............|.Asus Z9PE-D8 WS.................................|.HP LP3065 30" LCD Monitor|
    |.(2) Xeon X5690....................|.2xE5-2643 v2....................................|.Minolta magicolor 7450...|
    |.(192GB) Samsung PC10600 ECC.......|.2xEVGA nVidia GTX670 4GB........................|.Nikon coolscan 9000......|
    |.800W Redundant PSU................|.(8x8GB) Kingston DDR3-1600 ECC..................|.Quantum LTO-4HH..........|
    |.NEC Slimline DVD RW DL............|.Corsair AX1200..................................|..........................|
    |.(..6) LSI 9200-8e HBAs............|.Lite-On iHBS112.................................|.Dell D820 Laptop.........|
    |.(..8) ST9300653SS (300GB) (RAID0).|.PA120.3, Apogee, MCW N&S bridge.................|...2.33Ghz; 8GB Ram;......|
    |.(112) ST2000DL003 (2TB) (RAIDZ2)..|.(1) Areca ARC1880ix-8 512MiB Cache..............|...DVDRW; 128GB SSD.......|
    |.(..2) ST9146803SS (146GB) (RAID-1)|.(8) Intel SSD 520 240GB (RAID6).................|...Ubuntu 12.04 64bit.....|
    |.Ubuntu 12.04 64bit Server.........|.Windows 7 x64 Pro...............................|..........................|

  2. #2
     henk53 (Registered User; Join Date: Feb 2009; Location: Europe, Amsterdam; Posts: 43)
    Quote Originally Posted by stevecs
    The board I've got my eyes on now is the Supermicro X8DAH+ as it has two Tylersburg-36D chipsets on it (the only one I've found so far with two). For I/O it's going to be a killer if they don't castrate something else.
    It sounds interesting, thanks for the tip.

    You don't really need more than 8x PCIe v1 speeds as of yet, as the cards that are available can't handle more than that anyway (IOP34x are maxed out).
    Indeed, which is something else I don't understand. There don't seem to be any faster RAID cards coming any time soon; at least, no announcements have been made by Intel, Areca, or Adaptec. Clearly many of us are hitting a wall with the speed offered by the current generation of IOPs (I/O processors), but there doesn't seem to be any improvement coming in the short term.
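
    On a related note, it never hurts to verify that a card actually negotiated the slot speed it is supposed to get; a rough sketch of how to check that with lspci (the bus address is only an example):

    Code:
    # Find the RAID controller's PCI address
    lspci | grep -i raid
    # Compare what the link is capable of (LnkCap) with what was negotiated (LnkSta)
    # 01:00.0 is just an example address
    lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'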


    Here at the datacenter we don't have any SSDs really deployed to any of the clients; the largest bank of drives we have is about 500-600 (not including SANs)
    Wow, that's something nice indeed to play with :P

    I did have the opportunity to run some more tests today. I started by doing tests with an 8KiB request size. The numbers turned out to be significantly lower. Just to be sure that something else hadn't changed in the system (as I mentioned, my co-worker did some tests too), I re-ran all the 4KiB tests, and for every number of threads (queue depth) the same results as before were reported. To be really sure I then re-ran all the 8KiB tests again (taking about 1 hour), but these gave exactly the same results as the earlier run.
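
    (For reference, the NOOP scheduler mentioned in the test parameters below is simply set per block device through sysfs; a small sketch, with sdb and sdc being the two controller devices as seen in the iostat output further down:)

    Code:
    # Show the available schedulers; the active one is in brackets
    cat /sys/block/sdb/queue/scheduler
    # Switch both controller devices to the noop elevator
    echo noop > /sys/block/sdb/queue/scheduler
    echo noop > /sys/block/sdc/queue/scheduler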

    Here they are. This is again for 8 disks per RAID controller, 2 controllers, LVM striped, 8KiB array stripe size, NOOP scheduler, averaged over 10 passes, and thus an 8KiB request size this time. I specifically checked that none of the passes were out of range, and none were; every pass reported nearly the exact same number of IOPS.

    Code:
                           T  Q   Bytes          Ops        Time      Rate        IOPS        Latency    %CPU   OP_Type     ReqSize 
      TARGET   Average     0   4  171798691840   20971520   1020.919  168.278     20541.81    0.0000     0.01   read        8192
      TARGET   Average     0   8  171798691840   20971520   702.962   244.393     29833.08    0.0000     0.03   read        8192
      TARGET   Average     0  16  171798691840   20971520   594.507   288.977     35275.51    0.0000     0.08   read        8192
      TARGET   Average     0  32  171798691840   20971520   560.851   306.318     37392.34    0.0000     0.18   read        8192
      TARGET   Average     0  64  171798691840   20971520   548.917   312.978     38205.30    0.0000     0.38   read        8192
      TARGET   Average     0 128  171798691840   20971520   545.725   314.808     38428.76    0.0000     0.82   read        8192
      TARGET   Average     0 256  171798691840   20971520   545.344   315.028     38455.59    0.0000     2.29   read        8192
    As can be seen, in this case we already almost max out at 16 threads; going beyond that only very marginally increases the IOPS.
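
    In case anyone wants to reproduce this kind of queue-depth sweep without my exact tooling, roughly the same sweep can be expressed with fio; note this is only an illustrative sketch with a different tool than the one that produced the tables here, and /dev/md0 (the aggregate device in the iostat output below), the runtime and the depths are just example values:

    Code:
    # Random 8KiB reads against the striped volume at increasing queue depths
    for qd in 4 8 16 32 64 128 256; do
        fio --name=qd$qd --filename=/dev/md0 --rw=randread --bs=8k \
            --ioengine=libaio --direct=1 --iodepth=$qd \
            --runtime=60 --time_based --group_reporting
    done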

    For completeness I also tested with block size 2KiB, although that size isn't really important for my live load:

    Code:
                           T  Q   Bytes          Ops        Time       Rate        IOPS        Latency    %CPU   OP_Type     ReqSize 
      TARGET   Average     0   4  171798691840   83886080   2856.671    60.139     29364.98    0.0000     0.01   read        2048
      TARGET   Average     0   8  171798691840   83886080   1813.242    94.747     46263.03    0.0000     0.05   read        2048
      TARGET   Average     0  16  171798691840   83886080   1395.675   123.094     60104.29    0.0000     0.16   read        2048
      TARGET   Average     0  32  171798691840   83886080   1258.457   136.515     66657.87    0.0000     0.40   read        2048
      TARGET   Average     0  64  171798691840   83886080   1216.341   141.242     68965.91    0.0000     0.90   read        2048
      TARGET   Average     0 128  171798691840   83886080   1205.534   142.508     69584.15    0.0000     1.91   read        2048
      TARGET   Average     0 256  171798691840   83886080   1217.721   141.082     68887.78    0.0000     3.90   read        2048
    In this case IOPS increase up to 32 threads, and very slightly decrease after 128. With 128 threads I did notice an awkward asymmetry between the two controllers in the numbers reported by iostat:

    Code:
    Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               0.00     0.00    0.00    0.13     0.00     0.00     8.00     0.00    4.00   2.00   0.03
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.13     0.00     0.00     8.00     0.00    4.00   2.00   0.03
    sdb               0.00     0.00 35684.80    0.00    68.09     0.00     3.91    66.00    1.85   0.03 100.00
    sdc               0.00     0.00 35833.27    0.00    68.39     0.00     3.91     4.30    0.12   0.03  99.07
    dm-0              0.00     0.00    0.00    0.13     0.00     0.00     8.00     0.00    4.00   2.00   0.03
    dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    md0               0.00     0.00 73157.07    0.00   139.68     0.00     3.91     0.00    0.00   0.00   0.00
    avgqu-sz and await for sdb are well over 10 times higher than for sdc. With 256 threads I saw the same thing. With 64 threads the difference was there too, but a little smaller (almost exactly a factor of 10):

    Code:
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.68    0.00   17.21   69.68    0.00   12.42
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               0.00     0.00    0.00    0.20     0.00     0.00     8.00     0.00    2.67   1.33   0.03
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.20     0.00     0.00     8.00     0.00    2.67   1.33   0.03
    sdb               0.00     0.00 35177.27    0.00    67.12     0.00     3.91    47.15    1.34   0.03 100.00
    sdc               0.00     0.00 35127.20    0.00    67.04     0.00     3.91     4.17    0.12   0.03  98.75
    dm-0              0.00     0.00    0.00    0.20     0.00     0.00     8.00     0.00    2.67   1.33   0.03
    dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    md0               0.00     0.00 71920.27    0.00   137.31     0.00     3.91     0.00    0.00   0.00   0.00
    During the 64-thread run, vmstat showed this:

    Code:
    procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
     r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
     3 40   1572 12645792     52 15683252    0    0  9715  1252   20   17  0  3 85 12
     7 63   1572 12645776     52 15683252    0    0 138445     0 50310 276128  1 17 12 70
     4 46   1572 12645776     52 15683252    0    0 138288     0 50262 276661  0 17 13 69
     1 53   1572 12645768     52 15683252    0    0 138387     0 50337 274764  1 17 13 69
     2 40   1572 12645768     52 15683252    0    0 135370     0 49442 268276  2 17 13 68
     2 55   1572 12645760     52 15683252    0    0 138110     0 50192 275198  1 18 11 71
     3 45   1572 12645760     52 15683252    0    0 138575     0 50218 275155  1 17 12 70
     5 53   1572 12645760     52 15683252    0    0 135070     1 49743 264514  1 17 14 69
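
    As a rough cross-check, the bi column of about 138,000 KiB/s lines up with the two ~67 MB/s read streams iostat shows for sdb and sdc, so the aggregate throughput itself is consistent; it is really only the queueing on sdb that stands out. If that asymmetry needs chasing further, one way to see whether both controllers sit behind comparable PCIe devices and interrupt lines is something like this (just a sketch; the driver name in the grep is only an example):

    Code:
    # Which PCI device does each block device hang off?
    ls -l /sys/block/sdb /sys/block/sdc
    # How the controllers' interrupts are spread over the CPUs
    # ('mpt' is only an example driver name; use whatever driver the cards load)
    grep -i mpt /proc/interrupts
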
    Last edited by henk53; 03-11-2009 at 01:25 PM.
