Interesting results, and there's a potential improvement from larger stripe sizes on the Areca with flash, though it's hard to really tell with the caching turned on. If that also holds with it off, it may be due to the cell/block size on the SSDs being large.

Assuming we are still not running into a controller or driver bottleneck, and if space is less of a concern than performance, try RAID-10 or a bunch of RAID-1s put under LVM and striped. An older post of mine shows some of the benefit:
http://www.xtremesystems.org/forums/...6&postcount=82
This helps especially if you are read-heavy. The duplication of data lets the controller order requests across the drives better and handle overlapping requests.
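To make the read-side benefit concrete, here is a rough toy model, not how any particular controller actually schedules. It assumes the controller sends each read to whichever mirror copy has the least work queued; the function name and the service times are made up for illustration.

```python
def dispatch_reads(service_times_ms, mirrors):
    """Toy model: send each read to the mirror copy with the least queued work.

    Returns the total queued service time per drive. This is an assumed
    shortest-queue policy for illustration, not a real controller algorithm.
    """
    queued = [0.0] * mirrors
    for t in service_times_ms:
        idx = queued.index(min(queued))  # first drive with the shortest queue
        queued[idx] += t
    return queued


# Hypothetical per-request service times in milliseconds.
reads = [4, 1, 3, 2, 4, 2]

single = dispatch_reads(reads, 1)    # one copy: every read waits on one drive
mirrored = dispatch_reads(reads, 2)  # two copies: reads split across both

print("single drive busy time:", max(single), "ms")
print("mirrored pair busy time:", max(mirrored), "ms")
```

With these made-up numbers the mirrored pair finishes the same batch in half the busy time of the single drive. Writes get no such benefit since both copies have to be updated.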

I like that the dual controller test you have now is showing symmetry with the queues. That's good, though I don't understand why this would not have carried through when using 3 cards, even with different amounts of cache, which would only be in play for write tests (assuming you are turning off the read-ahead cache). Unless there was a tuning issue where it was misaligned, or other delays coming off the southbridge chip (interrupts, bandwidth, et al.).
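On the misalignment possibility, a quick sanity check can be sketched like this. It assumes "misaligned" means the partition start not falling on a stripe boundary; the 64 KiB stripe size and both offsets are example values, not numbers from your setup.

```python
def is_aligned(partition_offset_bytes, stripe_size_bytes):
    """True if the partition starts exactly on a stripe boundary."""
    return partition_offset_bytes % stripe_size_bytes == 0


def stripes_touched(offset_bytes, length_bytes, stripe_size_bytes):
    """Number of stripe units a single I/O of the given length spans."""
    first = offset_bytes // stripe_size_bytes
    last = (offset_bytes + length_bytes - 1) // stripe_size_bytes
    return last - first + 1


STRIPE = 64 * 1024           # assumed 64 KiB stripe unit
LEGACY_START = 63 * 512      # old-style partition start at sector 63
ALIGNED_START = 2048 * 512   # partition start at 1 MiB

for start in (LEGACY_START, ALIGNED_START):
    print(start, is_aligned(start, STRIPE), stripes_touched(start, STRIPE, STRIPE))
```

A stripe-sized I/O at the sector-63 start spans two stripe units where the 1 MiB start spans one, so a misaligned card would be doing extra drive operations for the same request stream.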

As for streaming performance decreasing without read cache, that is to be expected and is the correct behavior. Streaming I/O is at the other end of the tuning scale: a system tuned for streaming won't do well for random I/O, and vice versa.

Also, shot you a PM here. I have a new beta of XDD that should fix the CPU detection routines, plus some work on the averaging problem (most likely timer granularity at very high IOPS). It's too big to attach (~2 MiB), but I can e-mail it to you.