The problem is generally the test environment. What Serra observes is very likely due to both a stripe size that is too large and too few outstanding commands in the queue: with a large stripe, each request tends to land on a single disk, and with a shallow queue only a few disks are busy at any one time. Properly sizing an array is not as simple as just throwing disks at it (though even our SAN team here has fallen into that trap numerous times). There is no single correct setup/design that will work for all situations. Empirical evidence is very alluring, but a result from one setup only applies to workloads and configurations like the one that was tested. (I always think of it as the separation between physicists and engineers. :P )
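
To put rough numbers on the stripe size and queue depth point, here is a back-of-the-envelope sketch in Python. It is a toy model, not a description of any particular controller: it assumes a simple striped array with no parity, requests aligned to the stripe unit, and small requests that each land on one uniformly random disk. The function names and parameters are made up for illustration.

```python
import math


def disks_per_request(io_size_kib, stripe_unit_kib, n_disks):
    """Disks touched by one request on a simple striped array.

    Assumption: the request starts on a stripe-unit boundary and the
    array has no parity, so it touches ceil(io / unit) disks at most.
    """
    return min(n_disks, math.ceil(io_size_kib / stripe_unit_kib))


def expected_busy_disks(n_disks, queue_depth):
    """Expected number of disks with at least one request in flight.

    Assumption: every outstanding request is smaller than the stripe
    unit and lands on one disk chosen uniformly at random.
    """
    return n_disks * (1 - (1 - 1 / n_disks) ** queue_depth)


if __name__ == "__main__":
    # A 64 KiB request against different stripe units on 8 disks.
    for unit in (16, 64, 256):
        print(unit, disks_per_request(64, unit, 8))

    # How many disks are kept busy as the queue gets deeper.
    for n in (8, 16):
        for qd in (1, 4, 32):
            print(n, qd, round(expected_busy_disks(n, qd), 2))
```

Under these assumptions, a queue depth of 1 keeps one disk busy no matter how many disks are in the array, which is why adding spindles alone does not move the benchmark. Real arrays add caching, read-ahead and parity on top of this, so treat the numbers as a way to reason about the test setup rather than a prediction.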