Multi-HD vs FILE Configuration

Printable View

Show 100 post(s) from this thread on one page

02-27-2008, 08:17 PM
james bennett

sounds frustrating

Thats got to be a drag (having such a good hard drive only to be held back by the controller). I get all geeked whenever I'm about to get a new component that will increase performance. I'm counting the days till I order this SSD.
02-27-2008, 08:32 PM
jcool

Haha, that's totally how I feel :(
Which SSD you going to order?
02-27-2008, 10:17 PM
james bennett

SSd

I was looking at the mtron MSD6000 its only 16gb which isn't much but, all the data on my computer right now totals 14gb. If I need to put more I can always fall back on my raptor. That is until this fall when that extra cash for income tax comes through. Then I'm getting another SSD and running it in raid0. The main thing I absolutely have to have on the SSD is my O.S. and call of duty 4.
02-28-2008, 05:40 PM
IanB

Quote:

Originally Posted by virtualrain

I monitor my physical RAM usage and my PAGE file usage and right now with 4GB of physical RAM it shows I'm using 1.3GB of physical and 2.6GB of page file.

This is because Vista tries to balance the use of physical RAM between apps, disk cache and superfetch. It may decide that some of your physical RAM is better utilized loading an often used APP into memory and paging out some crap you haven't used in awhile. XP is more crude in this regard.

I'd agree that Vista may need a different approach to pagefile monitoring and setup, my comments are based on older versions of Windows that don't have this Superfetch behaviour.

Quote:

As for disk configurations... here's what I would suggest... in the interest of parallelism, you want 3 different physical drives...

1) OS and Apps. The OS is used to boot, the apps are used after boot. No sense putting these on different drives since they aren't accessed at the same time in general.

I'd strongly disagree here. The reason is the API (Application Programming Interface) for Windows.

When you load an app, it doesn't contain all the code for running the program. Much of the app is calling DLLs (Dynamic Link Libraries) and other external code modules to perfom I/O and user interface work, some provided by the application but most are provided by Windows itself. This is why all Windows apps have much the same "look and feel", they use the same modules to provide the buttons, checkboxes, scrollbars, cursor behaviour etc.. The surface appearance of the app is only a very very small part of the API.

If you disassemble an app, you find near the end a list of all the API and external DLL modules it needs to function. Therefore, when the app is loaded into memory, the first thing it must do is signal to the OS which of these it needs. Obviously some basic I/O DLLs are so common that they will always be resident in memory (kernel.DLL for instance), but others may be a little more esoteric and have to be loaded in from your C:\Windows\System32 folder as required. And remember, there may be proprietary DLLs provided as part of the app setup, which live in the app's own folder.

So the upshot is that loading an app means that before it can run there will be multiple file reads from both its own folder and the Windows\System32 folder and probably other places as well as it loads all the necessary code modules. And since modular programming is generally the best way to manage a large app, the larger the app the more code will be modular and external, so the more cross-loading there will be, which is why loading a large app is so slow. To really speed this up, you DO need the app and OS on different disks.

Then you have the pagefile in play, as some of that data being requested to load the app may have been dumped from RAM to pagefile earlier in the session, so it will be reloaded from THAT disk area. Or the OS decides it needs to dump current memory contents to pagefile to make room for the loading app, so it will be writing there instead. And all this is happening concurrently with whatever your OTHER apps are doing - for instance, a Usenet, torrent or FTP app may be doing its own file reading/writing separately in the background using yet another disk area.

Here's an ideal schema for partitioning a system to achieve maximum drive usage and response speeds. The drives here can be RAIDed sets, but the separation of function is best maintained by using three drives like this. The arrows show which disk partitions might be used concurrently. Obviously the D/E/F data areas can be subpartitioned as required for better organisation. In this scheme I'm assuming that some apps may be part of the OS installation or installed into the OS partition (eg. Outlook Express) so data (eg. email) used by that app (here shown as E) should be on a different disk to the OS. The least accessed data should be on the same disk as the OS, any frequently accessed data (torrents/usenet download area etc.) on a completely separate disk. The order of preferred disk speeds is obviously fastest to slowest, from top to bottom.

http://www.i-asm.com/drivescheme.gif
02-28-2008, 06:31 PM
virtualrain

Quote:

Originally Posted by IanB

So the upshot is that loading an app means that before it can run there will be multiple file reads from both its own folder and the Windows\System32 folder and probably other places as well as it loads all the necessary code modules.

Good point, I understand what you are saying in theory... but I think there are other factors which will reduce the benefit of separating the OS and the Apps in a practical situation...

One factor is that most windows code is loaded at boot... most of the DLL's you are considering here are actually used by the OS, not just the apps you will load later, so most of it is already in memory or the page file before you start your app.

Secondly, as many have found, loading an OS or an application is not simply a disk constrained operation... it's very unlikely that loading any app saturates your HD I/O (unless your HD is a dog) because the code that's loading actually has to execute as well. This is why RAID-0 arrays don't scale well for OS loading time benchmarks, because loading an OS is typically not a disk constrained task. Nor is loading an app. This is different from loading a video file where it simply has to transfer from disk to memory and can easily saturate your disk throughput.

I guess I'd be interested to see the real-world benefits of separating the OS from Apps... I just can't imagine a performance boost going from both OS and Apps on a RAID-0 array of Raptors on an Areca card to having them on separate JBOD disks... would the difference be even measurable?

Having said all that, I suspect the real-world differences in having anything on separate disks (as per my own advice) is probably miniscule... hence if this whole discussion is ultimately about splitting hairs on disk performance, I guess you are probably right... split the OS and Apps for max performance. FWIW! :shrug:
02-28-2008, 08:56 PM
IanB

Quote:

Originally Posted by VirtualRain

One factor is that most windows code is loaded at boot... most of the DLL's you are considering here are actually used by the OS, not just the apps you will load later, so most of it is already in memory or the page file before you start your app.

Well, yes and no. There are core DLLs that are always memory-resident, as I said. But "all" of Windows isn't speculatively loaded by the OS at boot-up, it's pulled from disk only when required by a loading app. For instance, if you start up a DX graphics game, it will have to load the DX code modules, they won't be in memory until an app that needs them is loaded. That will take quite a significant amount of time, I'd think, since the DX API is very large and the code is no doubt complex and substantial - I'll admit it's not something I've played with or programmed for, though. The DX DLLs are part of the Windows installation, so this is a perfect example of my premise that app code and OS code will have to load simultaneously and therefore should be on separate disks, as games and other DX apps generally have their own external code modules and setup files separate from the actual game EXE.

Quote:

Secondly, as many have found, loading an OS or an application is not simply a disk constrained operation... it's very unlikely that loading any app saturates your HD I/O (unless your HD is a dog) because the code that's loading actually has to execute as well. This is why RAID-0 arrays don't scale well for OS loading time benchmarks, because loading an OS is typically not a disk constrained task. Nor is loading an app. This is different from loading a video file where it simply has to transfer from disk to memory and can easily saturate your disk throughput.

Hmm, I think comparing the OS load time and a single app's load time is a bit meaningless. :ROTF:

Seriously, loading the OS takes a HUGE amount of processor load because it is setting up so many data structures, loading enormous numbers of subcomponents from disk, testing hardware responses and checking driver initialisation, starting services, connecting to the internet and waiting for local gateway and router responses... Of course that's not just a matter of dividing the amount of raw data by the speed it should be streamed over the disk interface to get a time it should load in. ;) Unlike a large data file as you rightly mention...

An app doesn't have to do all that complex and time-consuming hardware background stuff as it is loading. BUT the bigger the app, the more private data structure setup and initialisation it will have to do, and almost certainly the more external DLLs it will have to call on as helpers. So it's a truism that a bigger app (like a game) is obviously going to take longer to load relative to its code size not because it is bigger but because its initial setup is more complex, especially as much of that setup (such as grabbing memory buffers and loading data) will be accomplished through calling the system DLLs, which involves a wait every time the processor switches from module to module, especially if some of those are in protected kernel mode (pretty much all low-level I/O such as disk routines). Of course, when it's loaded, the app generally spends most of its time waiting for user input, then likely spends a small amount of time processing that input, then hands over displaying the result to the OS, which is done through the API and "costs" the same for all apps.

But if the speed of loading an app was NOT mostly diskbound, then there wouldn't be so many people here swearing by putting their games partitions on Raptors, surely? If it was just the speed of processing the incoming code that slowed up the load, then a faster processor, not a faster disk, would surely be the answer? This has to be wrong.

(WARNING: gross order of magnitude guesstimation here!) With processors running at 3GHz, they are capable of processing a few Giga-ops per second on average, as many of the small x86 ops are completed in less than 1 or 2 processor cycles and if the code is written correctly optimised (but don't count on this from most compilers) every core can fetch and process up to four ops in parallel - yes, that's every core, before you even think about the other cores running other threads! :p:

OK, the slowest ops, which can still operate in parallel with others, are around 100 cycles, but they are usually much less frequent in occurrence, so this won't have much of a hit on the headline order of magnitude calculation here. Since x86 opcodes are a couple to 10 bytes in length, you are therefore talking about a single 3GHz core being able to process perhaps 10GBytes of code per second IF it could be fed fast enough and didn't have to switch modes or jump between modules or access memory. This is unrealistic in real terms obviously because the first limitation here is the speed of memory, both for data being pulled into processor instruction cache and for reading/writing processed results into memory, but getting that data into memory from disk in the first place is the real bottleneck. My point is that clearly processing speed is a few orders of magnitude over the speed of the data incoming from disk, and if the app setup code is also reading/writing setup files to/from disk then that is what the app is waiting for, not the code processing.

Consider that average disk seek times are in the order of 10ms - in that time a 3GHz processor using the rough calculation above could have processed perhaps 100,000,000 code ops, or 1GByte of code data (ignoring memory latency issues etc.) - MUCH more than the size of any app out there! But most likely the app initialisation is single-threaded and can't do anything except sit and wait for 10ms for the disk to return the setup data it has requested, or for the OS to confirm that the data it asked to be written to disk has been. That's the way the API works, and why app loading takes so long. Multi-threading is generally only useful once the app has fully initialised and is able to work in parallel on user data already loaded into memory, when there are other I/O APIs that allow a thread to continue processing while the disk I/O takes place and not just stop and wait for the result.

EDIT: On the face of it, therefore, I'll admit this might suggest that it shouldn't matter where the files are located, as if every request requires the loading app to wait there is no real opportunity for parallelism in loading DLLs etc. IF the app is single-threaded and requires them loaded in a particular order. This has certainly just given me pause for thought in my own programming... :cool: I'm not sure how the OS manages the loading of DLLs it finds initially requested by the app, it may well enable some parallel loading by using separate internal threads, although it might just as well not do and simply loads them in sequence. But this is the point - underneath the app the OS kernel IS multithreaded, and any other apps or OS services in memory are doing their own I/O in their own threads and competing for disk access with the loading app, so splitting up the OS which is the central resource for all DLL modules from all apps is still useful. If nothing else it reduces seek times by restricting two heads from moving further than the OS partition and the app's location in a separate partition, instead of one head having to seek between both on the same physical disk, and I think we are agreed that seek time is the worst bottleneck to overcome here, not disk transfer speed.
02-29-2008, 12:23 AM
virtualrain

Hey IanB, I read most of your article, but admit got tired around the processor math. (sorry!)

I think I agree with most of what you say "in theory".

Consider though, that if you put your OS and Apps on a multi-disk RAID-0 or RAID-10 array together, you are forcing parallel reads regardless of whether the software fetching the data is single or multi-threaded. Theoretically, the RAID-10 is also providing an optimized read from the mirrored disks as well.

The bottom line though in the real world is that it's been proven time and time again, that if you load your game from say a Raptor and it takes 20 seconds to load, and then you put it on a dual disk RAID-0 array, it might load in 18 or 19 seconds. Theoretically, it should load in 10 seconds... this more than anything leads me to conclude that loading an app or game is not a disk bound exercise... not even close.
02-29-2008, 01:14 AM
IanB

Quote:

Originally Posted by virtualrain

The bottom line though in the real world is that it's been proven time and time again, that if you load your game from say a Raptor and it takes 20 seconds to load, and then you put it on a dual disk RAID-0 array, it might load in 18 or 19 seconds. Theoretically, it should load in 10 seconds... this more than anything leads me to conclude that loading an app or game is not a disk bound exercise... not even close.

But haven't you just proved my point? The benefit of RAID 0, AFAIK, is to increase the data streaming speed of the array. It has no benefit on access times. The drive heads still have to seek exactly the same, you just have two drives seeking instead of one.

My premise is that there is much more to app loading than merely reading in a stream of code data (like a data file, which would benefit from your RAID 0 setup), the app then initialises by loading many more components, the bigger the app the more of them. That is limited by the seek time of the drives those components are loading from, not by the streaming speed. If your app is located on the same disk but a different area from the OS, then the drive heads are seeking between those areas all the time while the app initialises. RAID 0 can't help there, so separating the two into different drives at least means seeks on two drives are limited to a smaller physical disk area. That must reduce seek times and therefore overall load times significantly.
02-29-2008, 10:12 AM
virtualrain

I see what you are saying now... I agree... if seek times are the major factor affecting load times then SSD's will be a great benefit. Do you have any reviews handy that show the improvement in app loading times with SSD?

It would be interesting to see the difference between having OS and APPS on different raptors vs. having them on the same RAID-0 array.

Show 100 post(s) from this thread on one page