Quote Originally Posted by stevecs View Post
As for disabling caching (assuming you mean write cache) this is /always/ a good idea for any cache that is not battery backed (I'm not talking UPS here, I'm talking BBU).
I agree about write cache, but I was actually talking about read cache. In some environments, I've seen measurably better perf with read cache disabled. SQL Server is notorious in that way. Since it has its own rich cache and manages its own read-ahead, etc., the odds are very small that a remote cache will somehow contain needed data that isn't already cached by SQL Server itself. If the cache were 100% transparent from a performance perspective, it wouldn't matter. Unfortunately, that's not the case for many implementations.
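To illustrate why a downstream read cache rarely helps an app with its own large buffer pool, here's a toy simulation (all numbers and the LRU model are my own assumptions, not SQL Server internals): the controller cache only ever sees the app cache's *misses*, which are by definition the blocks the app hasn't touched recently.

```python
import random
from collections import OrderedDict

class LRUCache:
    """Simple LRU cache that tracks hits and misses."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, block):
        if block in self.data:
            self.hits += 1
            self.data.move_to_end(block)
            return True
        self.misses += 1
        self.data[block] = None
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
        return False

random.seed(1)
app = LRUCache(capacity=800)         # stand-in for the app's buffer pool
controller = LRUCache(capacity=200)  # smaller downstream read cache

for _ in range(20000):
    # Skewed (hot/cold) access pattern over 4000 blocks
    block = int(random.paretovariate(1.2)) % 4000
    if not app.access(block):        # only app-cache misses reach the controller
        controller.access(block)

app_rate = app.hits / 20000
ctrl_rate = controller.hits / max(1, controller.hits + controller.misses)
print(f"app hit rate {app_rate:.0%}, controller hit rate {ctrl_rate:.0%}")
```

Because the hot blocks never leave the app's cache, the controller cache is left serving the cold tail, where its hit rate is poor.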

Quote Originally Posted by stevecs View Post
cluster or block sizes of a file system only really have an effect on writes, there is no read effect (as, barring the above read a-heads, only the blocks that the application requests will be read regardless if it's a full cluster or not with the limit being that any device can't read less than the physical sector size of the media (i.e. 512byte generally).
I guess I'm not 100% sure of this, but I'm pretty sure that in Windows, at least, the cluster size also determines the minimum physical read size, since clusters, not blocks, are the unit of on-disk addressability. Apps that are sloppy about how they read files can benefit from larger clusters, because the data they need for the next N reads may already be in RAM. OS read-ahead tends to get only one cluster ahead, and may use an extra I/O to avoid delaying the originally requested block.
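Here's a rough way to see the effect. Assuming (as above) that the cluster is the minimum physical read unit and that a cluster stays in RAM once read, a sequence of small application reads turns into far fewer physical I/Os with a larger cluster size. This is a hypothetical model, not a real Windows trace:

```python
def physical_ios(read_offsets, read_size, cluster_size):
    """Count cluster-granularity physical reads for a sequence of small
    application reads, assuming each touched cluster stays cached once read."""
    cached = set()
    ios = 0
    for off in read_offsets:
        first = off // cluster_size
        last = (off + read_size - 1) // cluster_size
        for c in range(first, last + 1):
            if c not in cached:
                cached.add(c)
                ios += 1
    return ios

# An app doing 1 KiB sequential reads over a 1 MiB file:
reads = range(0, 1 << 20, 1024)
print(physical_ios(reads, 1024, 4096))    # 4 KiB clusters  -> 256 physical I/Os
print(physical_ios(reads, 1024, 65536))   # 64 KiB clusters -> 16 physical I/Os
```

Same 1024 application reads either way; the cluster size alone changes how many of them actually hit the disk.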

Quote Originally Posted by stevecs View Post
The problem here is that this is not really the OS doing this, it's just the nature of having multiple file systems so your application is the one that needs to be written to understand some parallelism or give you the opportunity to move structures to different file systems.
I agree about the application side, but some related perf issues are OS-oriented. Application start-up time, for example: for an app that requires multiple DLLs, if they are spread among multiple equal-speed LUNs, the OS will issue the I/O requests in parallel; if they're all on the same drive, the requests are serialized.
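The start-up effect can be sketched with a back-of-the-envelope model (the round-robin placement and per-file times are my assumptions, purely for illustration): each LUN serves its own files serially, but the LUNs work in parallel, so total load time is the busiest LUN's queue.

```python
def startup_time(file_times_ms, num_luns):
    """Hypothetical model: files are round-robined across equal-speed LUNs;
    each LUN serves its files serially, LUNs work in parallel."""
    per_lun = [0.0] * num_luns
    for i, t in enumerate(file_times_ms):
        per_lun[i % num_luns] += t
    return max(per_lun)

dlls = [5.0] * 8   # eight DLLs, 5 ms each to read
print(startup_time(dlls, 1))  # all on one LUN  -> 40.0 ms, fully serialized
print(startup_time(dlls, 4))  # across four LUNs -> 10.0 ms
```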

Quote Originally Posted by stevecs View Post
RAID-3 is a full stripe raid, it works well for video (large files) and some COW (copy on write) and log based file systems. It does not do well with general user data at all as there is no means to write a partial stripe so if you do that you will take a hit.
It can help for read-heavy apps with certain access patterns. By having all drives active on every read with a QD of 1, read throughput is maximized compared to issuing separate requests to each drive and waiting for them to finish. Of course, this assumes that caching is effective and that the per-device internal striping doesn't interfere.
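A simple cost model shows why the full-stripe read helps at QD = 1 (the seek and transfer numbers below are invented for illustration): all data drives seek in parallel, then each transfers only 1/n of the request, so the transfer portion shrinks with drive count.

```python
def raid3_read_time_ms(request_kib, data_drives, seek_ms=8.0, mb_per_s=100.0):
    """Hypothetical QD=1 full-stripe read: all data drives seek in
    parallel, then each transfers 1/n of the request."""
    transfer_ms = (request_kib / 1024) / mb_per_s * 1000
    return seek_ms + transfer_ms / data_drives

# 1 MiB read: one drive vs. a 4-data-drive RAID-3 set
print(raid3_read_time_ms(1024, 1))  # 18.0 ms
print(raid3_read_time_ms(1024, 4))  # 10.5 ms
```

Note the fixed seek term is paid either way, which is why the win is largest for big sequential reads and evaporates for small random ones.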

Quote Originally Posted by stevecs View Post
RAID-4 (and variants) is nothing more than a raid 0 + separate parity disk (or disks). This can have problems with bottlenecking with parity updates and was one reason why raid-5 (distributed parity) became more popular so as to not have the same spindle(s) be a bottleneck.
Bottlenecking is only a problem with QD > 1; otherwise, in both RAID-4 and -5, you're always writing one or more data drives and a parity drive. One potential advantage of RAID-4 vs. RAID-5 on SSD is that you could use SLC for the heavily-written parity drive, which should help maximize array life.
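To see the QD > 1 bottleneck concretely, here's a toy write-counting model (the layout and workload are assumptions, not any vendor's implementation): every small write updates one data strip plus parity, RAID-4 pins parity to one disk, and RAID-5 rotates it per stripe.

```python
import random

def per_disk_writes(num_requests, num_disks, distributed_parity, seed=0):
    """Count write ops per disk for random small writes: each request hits
    one data strip plus parity. RAID-4 pins parity to the last disk;
    RAID-5 rotates it across stripes."""
    rng = random.Random(seed)
    writes = [0] * num_disks
    for _ in range(num_requests):
        stripe = rng.randrange(10 ** 6)
        parity = (stripe % num_disks) if distributed_parity else num_disks - 1
        data = rng.choice([d for d in range(num_disks) if d != parity])
        writes[data] += 1     # data strip update
        writes[parity] += 1   # parity update
    return writes

print(max(per_disk_writes(9000, 5, False)))  # RAID-4: parity disk takes all 9000
print(max(per_disk_writes(9000, 5, True)))   # RAID-5: hottest disk ~3600
```

In the RAID-4 case the parity disk absorbs one write per request, so with multiple writes in flight it caps the array; RAID-5 spreads that load, which matches why distributed parity won out for small-write workloads.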

Quote Originally Posted by stevecs View Post
All raids have all spindles working all the time so I don't understand that comment really unless you're referring to IBM's distributed sparing idea (RAID 1E, 5E, et al) where even the spare disk is part of the raid set.
With a RAID-0 or RAID-5 array at QD = 1, if my app issues a request that's a single strip in length or less, only a single drive is active. Only with higher QD or larger request sizes will multiple drives be active. I think this is why many desktop users don't experience performance improvements when they move to RAID: at any given instant, they are only using a single drive.
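The strip-size arithmetic makes this easy to check. Assuming a plain rotating RAID-0 layout (a sketch, not any particular controller's mapping), the drives a single request touches are just the strips it spans:

```python
def drives_touched(offset, length, strip_kib, num_drives):
    """Strips rotate across drives round-robin; return the set of drives a
    single request actually hits (hypothetical RAID-0 layout)."""
    strip = strip_kib * 1024
    first = offset // strip
    last = (offset + length - 1) // strip
    return {s % num_drives for s in range(first, last + 1)}

# 64 KiB strips on 4 drives: a 16 KiB request keeps one drive busy,
# while a 256 KiB request engages all four.
print(len(drives_touched(0, 16 * 1024, 64, 4)))   # 1 drive
print(len(drives_touched(0, 256 * 1024, 64, 4)))  # 4 drives
```

So a desktop workload of small QD = 1 requests really is a one-drive-at-a-time workload, regardless of array width.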

Quote Originally Posted by stevecs View Post
The biggest problem here is the lack of vendor information to explicitly indicate the underlaying mechanisms of the drive so it makes it very hard / lot of leg work to reverse engineer what is going on and then build up. I think it should be standard information on the spec sheet just like the drives block size.
+1. The Intel SSDs have ten internal channels, but Intel apparently still considers the details of how those channels work to be proprietary. I have some ideas on how to reverse engineer it, but as you said, it's time consuming.
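One reverse-engineering angle I've been considering: measure read throughput at increasing concurrency and look for the knee where scaling stops, which should hint at the channel count. The analysis half can be sketched now; the measurement numbers below are synthetic (what a 10-channel drive *might* show), not real Intel data.

```python
def infer_channels(throughput_by_qd):
    """Given throughput measured at various queue depths (hypothetical data
    from timing concurrent flash-page-aligned reads), guess the channel
    count as the depth where scaling flattens out."""
    best = 1
    for qd, tput in sorted(throughput_by_qd.items()):
        # treat a depth as still "scaling" if it beat the best by >10%
        if tput > throughput_by_qd.get(best, 0) * 1.10:
            best = qd
    return best

# Synthetic curve: near-linear scaling up to 10 concurrent reads, flat after.
measured = {1: 25, 2: 50, 4: 98, 8: 190, 10: 235, 16: 240, 32: 242}
print(infer_channels(measured))  # -> 10
```

The hard (and time-consuming) part is collecting clean measurements: the drive's own interleaving, caching, and command queuing all blur the knee, which is exactly the leg work mentioned above.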